Wednesday, February 2, 2011

Parsing the iPhone SMS Database

I was asked recently to help recover deleted messages from an iPhone SMS database.  Conveniently, this is called "sms.db" on the iPhone and it is located in the /mobile/Library/SMS/ directory.  It is a sqlite3 file type, and there are several GUI tools to read sqlite3 databases, as well as a sqlite3 command line shell.

It seems like a pretty straight forward exercise at first blush.  However, "How do I recover deleted messages from an iPhone database" is not as simple as question as it first seems.  When a user chooses to delete a message on the iPhone, the record is flagged as deleted in the database, but the record is not deleted from the database.  This means that records flagged as deleted are still recoverable with sqlite3 tools.

But when the phone is synced with iTunes, the records flagged "deleted" are actually removed (deleted) from the database itself and no longer recoverable with sqlite3 tools.  However, the data is still within the database, but the space it occupies is added to a "free list" for use by new data.  In other words, the data can still be recovered before it is overwritten by new data or the database is "vacuumed", a sqlite3 process that rebuilds the database removing all the free space and reducing the size of the database.  As a caveat, sqlite3 can be configured to overwrite records immediately upon deletion, but this is not the case for the iPhone at present.

So really, there are two types of deleted data to be sought: records flagged as deleted (not really deleted records at all), and records deleted from the database.  I'll discuss the first type in the remainder of this discussion.  I'll consider ways to recover deleted records in another post.

The sms.db has the following tables:
_SqliteDatabaseProperties, msg_group, group_member, msg_pieces, message

The message table contains the text messages of interest.  The contents of the table can be displayed from the shell with:
$ sqlite3 -header sms.db "select * from message"
ROWID|address|date|text|flags|replace|svc_center|group_id|association_id|height|UIFlags|version|subject|country|headers|recipients|read
1|+17132619725|1281033415|Hey, what's up?|2|0||1|0|0|4|0||us|||1
...
As you can see the interesting fields (revealed because the -header argument was used) are "ROWID," "address" (which is the source phone number), "date" (in unix epoch format), "text," and of less obvious value--"flags."  Flags indicates the type of message, i.e.,:
  • 2 = received
  • 3 = sent
  • 33 = Message send failure
  • 129 = deleted
  • (source: Adam Crosby)
With a little sqlite razzle dazzle, we can get a well formatted output in a more human readable form:
$ sqlite3 -header sms.db "select ROWID as row, case flags when 2 then 'rcvd' when 3 then 'sent' when 33 then 'fail' when 129 then '*del' else 'unkn' end as type, address as phone_no, datetime(date,'unixepoch','localtime') as date, text as message from message"
row|type|phone_no|date|text
1|rcvd|+17132619725|2010-08-05 11:36:55|Hey, what's up?
...
SQL syntax can be a bit tricky, and look a bit intimidating.  But by using internal commands, you can get a tremendous speed boost over using external text tools.  By way of explanation, I'll break down the command:

  1. sqlite3 -header sms.db  #open the database with sqlite and display column names 
  2. "select  #display the following columns from the table
  3. ROWID as row,  #display the ROWID column first, but rename it "row" in the output
  4. case flags when 2 then 'rcvd' when 3 then 'sent' when 33 then 'fail' when 129 then '*del' else 'unkn' end as type,  #display the flags column next, but change the "2" flag to "rcvd", "3" to "sent", etc., and rename the row "type"
  5. address as phone_no,  #display the address column renamed as "phone_no"
  6. datetime(date,'unixepoch','localtime') as date,  #convert the content of the date field to local time and display the column as "date"
  7. text as message  #display the text column renamed as "message"
  8. from message"  #all the columns of data are read from the message table
The command can be simplified if column renaming isn't required.  I included it here to make the output as clear as possible, and since the command can be incorporated into a script, it need only be typed once.  The quoted part of the command could be inserted into a GUI sqlite browser if that is your tool of preference.  The query can be adjusted to show just deleted messages, for example, by appending "where type like '*del'".

Notice I did not link the data in the message table to any other tables in the database.  While this can be done, my task was to seek out deleted messages.  And as I said earlier, I will explore methods for recovering deleted records from sqlite databases in a future post.

As always I welcome any comments or suggestions....