Monday, June 17, 2013

TextMe App: Lesson Learned from Unusual Tables

I recently had the opportunity to help a colleague with an iPhone database that was not supported by his automated tools. The application was the TextMe application, and predictably, the texting app stored its chat in a SQLite database. What made the database interesting was the fact that there was no immediately obvious way to identify to whom a message was sent.

Let me illustrate: A quick scan of the database reveals some of the capabilities of the application: texting (ZMESSAGE table) and calling (ZCALL, ZVOICEMAIL)

SQLite Command Line Interface
$ sqlite3 TextMe2.sqlite .tables

The subject of this investigation was the text messages, so I needed to see how the table was constructed.

SQLite Command Line Interface
$ sqlite3 TextMe2.sqlite ".schema zmessage"

The CREATE TABLE statement shows us there are 14 fields in the table, and a majority are integers: Z_PK, Z_ENT, Z_OPT, ZSTATUS, ZCALL, ZDISCUSSION, ZHEIGHT, ZSENDER, Z3_SENDER, ZTIMESTAMP, ZBODY, ZGUID, ZLOCATION, ZREMOTEID. But, don’t fall into the trap that the declared type of each column (e.g. INTEGER, TIMESTAMP, VARCHAR)actually constrain the data to those types, they don’t. Treat the type-name as informative only, but verify the data before forming any conclusions.

Inspecting the columns, we see some of obvious value in a forensic examination:

  • Z_PK (which is the primary key, an auto incrementing integer who’s chief value is assisting us in identifying if messages have been deleted from the database)





Others might be grabbing your attention, but I’m going to keep this discussion focused on these columns. Some sample data is in order:

SQLite Command Line Interface
$ sqlite3 -header TextMe2.sqlite "select z_pk, zstatus, zsender,
ztimestamp, zbody from zmessage limit 5;"
3|4|10|386362603|hey, what are you doing?
4|3|2|386362630|I'm checking out this new app
5|3|2|386362634|It might be a challenge to decode
6|3|2|386362644|But I'll figure it out...

We see we have a couple of interpretation issues, here: the status is an integer that needs to be interpreted, and is the sender. The date is some form of epoch time, and my eyes tell me its likely Mac Absolute Time. I expect to find the interpretations in other tables in the database. But what jumps off the screen at me is that there is no obvious answer the following question: To whom is the sender sending the message? The time stamp gives us a sequence of messages, but how do we know that sender "2" is sending messages to sender "10"? Couldn’t sender "2" be sending his message to, say, sender "5" and in the midst, receives a message from sender "10"? Absolutely!

So, how to we rectify this issue? Well, I sort of mischievously left off the zdiscussion column in my query. I did this to steer the conversation and simulate what can happen when an investigator encounters a new database for the first time: overlook an important column. If we include the column, we see something interesting:

SQLite Command Line Interface
$ sqlite3 -header TextMe2.sqlite "select z_pk, zstatus, zdiscussion,
zsender, ztimestamp, zbody from zmessage limit 5;"
3|4|2|10|386362603|hey, what are you doing?
4|3|2|2|386362630|I'm checking out this new app
5|3|2|2|386362634|It might be a challenge to decode
6|3|2|2|386362644|But I'll figure it out...

Now we see that the conversation is all part of the same discussion. And if we studied the whole database, we’d see example of where more than one conversation was occurring a the same time, and by sorting on the discussion field, we make sense of those conversations. But date stamp alone does not clue us in.

This might not seem like a big deal, but most messaging databases I have encountered have the remote party in the message record for both sent and received messages. This works well and leads to easy interpretation, e.g., "sent to Joe" and "received from Joe". But this database without properly understanding the discussion column, is the equivalent of "sent by Joe" and "sent by Jane", leading to the question "to whom?"

Rather than breakdown the rest of the analysis, I’m just going to share the query I used to complete the analysis:

  m.z_pk as "Key",
  datetime(ztimestamp + 978307200, 'unixepoch', 'localtime') as "Date",
    when m.z_ent then (select z_name from z_primarykey natural join zmessage)
    else "Unknown(" || m.z_ent || ")" end
    as "Type",
  case zstatus
    when 2 then "Sent"
    when 3 then "Delivered"
    when 4 then "Received"
    else "Unknown(" || zstatus || ")" end
    as "Status",
  zdiscussion as "DiscussionID",
  zusername as "Contact",
  zbody as "Message"
from zmessage as m, zcontact as c
on m.zsender = c.z_pk
order by discussionid asc, date asc;

By way of brief description:

  • the AS statements, such as that seen in m.z_pk as "Key", create aliases, effectively renaming the columns (or in the FROM statement, the tables) to make the output more informative.

  • The first CASE statement queries the z_primarykey to interpret the Z_ENT integer into its textual value.

  • The second CASE statement interprets the ZSTATUS flag into its English equivalent, which was not stored in the database, but determined by studying the application user-interface. This begs the question, then where does the application get the textual interpretation? Probably within its code.

  • The FROM statement queries two tables, zmessage and zcontact, to interpret the ZSENDER integer. But wait, you say, there is no ZSENDER in the select statement! (see next bullet point…)

  • The ON statement is the SQLite equivalent of an SQL INNER JOIN, which is an intersection of the two tables, i.e., it select rows from both tables where the columns match. In this case, the columns to be matched are ZSENDER from the ZMESSAGE table and Z_PK from the ZCONTACT table. The effect is that the SELECT statement knows which ZUSERNAME from ZCONTACT to return based on the ZSENDER value in the ZMESSAGE table.

  • The ORDER BY statement sorts the output first by the ZDISCUSSION column, then by the ZDATESTAMP, both in ascending order. Note that the column alias names are used.

I hope this gives some insight into the workings of TextMe2.sqlite databases and helps you in the future with the analysis of never-before-seen SQLite databases.

Sunday, June 16, 2013

Why I do What I do: Thanks Dad!

This post has both nothing and everything to do with why and how I do data forensics. I hope you'll take a moment to read it. 

I am a tenacious do-it-your-selfer in many areas of my life, including home repair, veterinary medicine (just ask my poor dogs), and data forensics. Some of it I even do reasonably well once in a while (even a broken clock is right twice a day, right?). There has been a strong influence in my life that lead me inevitably to this station in life: my father.

My father is a retired firefighter--scratch that--fireman.  Some would say he was tough on me while I was growing up.  I'm sure I've said that myself more than once.  But it would be more accurate to say that from an early age, my father instilled in me a strong sense of right-and-wrong and as self-reliance.  No excuses, no B.S.  He never failed to try to teach me, never missed an opportunity to impart wisdom that came from much experience.  He saw some of the worst life had to offer and lost friends in the service of his city, and I think he was hell bent on helping me avoid the pitfalls of life.  Unfortunately, my career choices betray how closely I listened to the "avoiding harm" lessons: I became a soldier and then a police officer!

Our relationship during my teen years was rocky at times, with frequent arguments over differences of opinion on important life matters (so important, that I can't remember a single one of them).  I didn't know it then, but we argued because we were so alike.  We have a lot of the same qualities: strong wills, the desire to be right (not for the sake of being better than others, but for the sake of not being correct in what we believe, do, and say), and the wish to pass what we know onto others. It was really just our point of view that differed.

My father doesn't just talk-the-talk, though.  He, above all, walks-the-walk.  I admire him greatly in this, and I strive to be more like him in this way.  He put his family first: I can remember how happy it made me when he bought a HiFi stereo system with all the bells and whistles, not because I had a new stereo to play with (oh no, DO NOT touch the equalizer settings!), but because he finally, after something like 15 years of my life, spent money on himself for something non-essential!  It made him happy, and that made me happy, too (and eventually, I was permitted to touch the equalizer... once I was schooled in the proper shape of sound).

So, on this Father's Day, I choose to write a non-technical post, but an important one because it acknowledges the source of my beliefs:

  • Thanks Dad, for making me care about right and wrong: I may take a little too long to reach a conclusion in a case, but I'm not likely to state something untrue because I have checked and double-checked the facts to the best of my ability.  
  • Thanks Dad, for giving me and insatiable curiosity for the world around me, so that I can now study a file or file system and come to understand how they work (or hours, days, and sometimes weeks trying!).
  • Thanks Dad, for teaching me to do for myself, so that now I can write programs to solve forensics problems to get investigators and prosecutors the essential information they need.
  • Thanks Dad, for teaching me so I know how to teach others, so that all that I have learned I willingly pass on and hopefully advance my field, if only a little.
  • Thanks Dad, for helping me find perspective in what I do, not letting work step in-line ahead of my family (the real reason I am no longer a gunslinger).

Happy Father's Day.  I hope I grow up to be just like you.

Time Perspective

Telling time in forensic computing can be complicated. User interfaces hide the complexity, usually displaying time stamps in a human reada...