I sometimes get questions about showing attachments in Apple iDevice messaging databases. The questions, however, seem to come at a time when I don’t have any databases on hand to study the issue. Well, this week I stumbled on the chats.db during an exam of a MacBook Air. The database contains iMessage and SMS messages, and can be configured to sync with the user’s other iDevices (iPhone, for example) through iCloud. So, I took a look at the database an determined a way to match the attachments with the messages.
The chats.db is found in the users directory in the Library/Messages folder.
Library/Messages/
Library/Messages/Attachments
Library/Messages/chat.db
Library/Messages/chat.db-shm
Library/Messages/chat.db-wal
As you can see, message attachments are located in the Attachments sub-folder. But how are they referenced in the chats.db, and how are they matched to the correct message? The database schema gives us the clues we need.
CREATE TABLE _SqliteDatabaseProperties (key TEXT, value TEXT, UNIQUE(key)); CREATE TABLE chat (ROWID INTEGER PRIMARY KEY AUTOINCREMENT, guid TEXT UNIQUE NOT NULL, style INTEGER, state INTEGER, account_id TEXT, properties BLOB, chat_identifier TEXT, service_name TEXT, room_name TEXT, account_login TEXT, is_archived INTEGER DEFAULT 0, last_addressed_handle TEXT, display_name TEXT, group_id TEXT); CREATE TABLE attachment (ROWID INTEGER PRIMARY KEY AUTOINCREMENT, guid TEXT UNIQUE NOT NULL, created_date INTEGER DEFAULT 0, start_date INTEGER DEFAULT 0, filename TEXT, uti TEXT, mime_type TEXT, transfer_state INTEGER DEFAULT 0, is_outgoing INTEGER DEFAULT 0, user_info BLOB, transfer_name TEXT, total_bytes INTEGER DEFAULT 0); CREATE TABLE handle ( ROWID INTEGER PRIMARY KEY AUTOINCREMENT UNIQUE, id TEXT NOT NULL, country TEXT, service TEXT NOT NULL, uncanonicalized_id TEXT, UNIQUE (id, service) ); CREATE TABLE chat_handle_join ( chat_id INTEGER REFERENCES chat (ROWID) ON DELETE CASCADE, handle_id INTEGER REFERENCES handle (ROWID) ON DELETE CASCADE, UNIQUE(chat_id, handle_id)); CREATE TABLE message (ROWID INTEGER PRIMARY KEY AUTOINCREMENT, guid TEXT UNIQUE NOT NULL, text TEXT, replace INTEGER DEFAULT 0, service_center TEXT, handle_id INTEGER DEFAULT 0, subject TEXT, country TEXT, attributedBody BLOB, version INTEGER DEFAULT 0, type INTEGER DEFAULT 0, service TEXT, account TEXT, account_guid TEXT, error INTEGER DEFAULT 0, date INTEGER, date_read INTEGER, date_delivered INTEGER, is_delivered INTEGER DEFAULT 0, is_finished INTEGER DEFAULT 0, is_emote INTEGER DEFAULT 0, is_from_me INTEGER DEFAULT 0, is_empty INTEGER DEFAULT 0, is_delayed INTEGER DEFAULT 0, is_auto_reply INTEGER DEFAULT 0, is_prepared INTEGER DEFAULT 0, is_read INTEGER DEFAULT 0, is_system_message INTEGER DEFAULT 0, is_sent INTEGER DEFAULT 0, has_dd_results INTEGER DEFAULT 0, is_service_message INTEGER DEFAULT 0, is_forward INTEGER DEFAULT 0, was_downgraded INTEGER DEFAULT 0, is_archive INTEGER DEFAULT 0, cache_has_attachments INTEGER DEFAULT 0, cache_roomnames TEXT, was_data_detected INTEGER DEFAULT 0, was_deduplicated INTEGER DEFAULT 0, is_audio_message INTEGER DEFAULT 0, is_played INTEGER DEFAULT 0, date_played INTEGER, item_type INTEGER DEFAULT 0, other_handle INTEGER DEFAULT -1, group_title TEXT, group_action_type INTEGER DEFAULT 0, share_status INTEGER, share_direction INTEGER, is_expirable INTEGER DEFAULT 0, expire_state INTEGER DEFAULT 0, message_action_type INTEGER DEFAULT 0, message_source INTEGER DEFAULT 0); CREATE TABLE chat_message_join ( chat_id INTEGER REFERENCES chat (ROWID) ON DELETE CASCADE, message_id INTEGER REFERENCES message (ROWID) ON DELETE CASCADE, PRIMARY KEY (chat_id, message_id)); CREATE TABLE message_attachment_join ( message_id INTEGER REFERENCES message (ROWID) ON DELETE CASCADE, attachment_id INTEGER REFERENCES attachment (ROWID) ON DELETE CASCADE, UNIQUE(message_id, attachment_id));
I’ll provide a summary of the database as I understand it. Messages are predictably stored in the message table. The message table lacks a reference to attachments, other than the fact that one exists: + cache_has_attachments INTEGER DEFAULT 0+. The default setting is zero, meaning no attachements are stored. A value of 1 indicates there is an attachment in the Attachments sub-folder. One other issue we find when examining the message table is that there is a reference to the remote party in the conversation (handle_id INTEGER DEFAULT 0
), but not the extact address—email, account identifier, or phone number—that an investigator would desire. That information is stored in the handle table. It is up to us to figure out how to relate the tables together.
Can’t we all just get along?
The difficulty in examining SQLite databases is determining how they are intended to relate information. There is seldom anything in the database itself that explains its intended use. It can similar to stumbling upon raw building materials and trying to figure out what is being built. Sometimes it’s easy, other times, not so much. But with the chats.db database, three table schema entries give us a clue as to the database design.
CREATE TABLE chat_handle_join ( chat_id INTEGER REFERENCES chat (ROWID) ON DELETE CASCADE, handle_id INTEGER REFERENCES handle (ROWID) ON DELETE CASCADE, UNIQUE(chat_id, handle_id)); CREATE TABLE chat_message_join ( chat_id INTEGER REFERENCES chat (ROWID) ON DELETE CASCADE, message_id INTEGER REFERENCES message (ROWID) ON DELETE CASCADE, PRIMARY KEY (chat_id, message_id)); CREATE TABLE message_attachment_join ( message_id INTEGER REFERENCES message (ROWID) ON DELETE CASCADE, attachment_id INTEGER REFERENCES attachment (ROWID) ON DELETE CASCADE, UNIQUE(message_id, attachment_id));
INFO: The message_attachment_join table shows us that the message_id
column in the table refers to the message table rowid
column. Likewise, the attachment_id
refers to the attachment table rowid
. Thus, the message_attachment_join table is used to match attachments to messages.
Each of the table names above ends in the word join. As used, the word join is just part of a table name, but it hints at a SQLite table operation called a JOIN
. A join combines two tables into one, and in SQLite there are two basic joins: INNER
and OUTER
. Inner joins, which come three variations, result in a combined table that includes only rows matching the join criteria. That is, the combined table only includes records with rows from each table that have one or more matching column values. While these are the default type of JOIN in SQLite, we are interested in results that show all messages, not just those with attachments.
OUTER joins, by contrast, do not require the records from each table to have a matching column. This means we can have combined table that shows all message rows, and if properly joined to the attachment table, rows containing messages with attachments will show attachment details. Further, if we join the handle table to the message table, we have everything we might want for an investigation.
I will be using a LEFT OUTER JOIN
, which is shortened in syntax to LEFT JOIN
. The basic syntax is "SELECT column(s) FROM left_table LEFT JOIN right_table ON left_table.columnName = right_table.columnName
". A LEFT JOIN
returns all rows of the left_table regardless of matching rows in the right table. Where rows match in the right table, they are joined to the matching left table row.
Tip
|
It is easier to understand and troubleshoot SQL queries by reading them backwards: Predicate, then subject. For example, reading the query in the paragraph above as "FROM left_table LEFT JOIN right_table ON left_table.columnName = right_table.columnName SELECT column(s) " can lend clarity to the output. |
Applying a LEFT JOIN
to the chats table, we can create a "Super table" combining the message, attachment, and handle tables.
SELECT * FROM message AS m LEFT JOIN message_attachment_join AS maj ON message_id = m.rowid LEFT JOIN attachment AS a ON a.rowid = maj.attachment_id LEFT JOIN handle AS h ON h.rowid = m.handle_id
Tip
|
The "expr1 AS expr2" statement sets expr2 as an alias for expr1, saving keystrokes and making the lines easier to read. Thus message_attachement_join.attachment_id becomes maj.attachment_id . |
Entirely accurate, but probably containing more information than we need, the above query results in the following columns:
ROWID |
guid |
text |
replace |
service_center |
handle_id |
subject |
country |
attributedBody |
version |
type |
service |
account |
account_guid |
error |
date |
date_read |
date_delivered |
is_delivered |
is_finished |
is_emote |
is_from_me |
is_empty |
is_delayed |
is_auto_reply |
is_prepared |
is_read |
is_system_message |
is_sent |
has_dd_results |
is_service_message |
is_forward |
was_downgraded |
is_archive |
cache_has_attachments |
cache_roomnames |
was_data_detected |
was_deduplicated |
is_audio_message |
is_played |
date_played |
item_type |
other_handle |
group_title |
group_action_type |
share_status |
share_direction |
is_expirable |
expire_state |
message_action_type |
message_source |
message_id |
attachment_id |
ROWID |
guid |
created_date |
start_date |
filename |
uti |
mime_type |
transfer_state |
is_outgoing |
user_info |
transfer_name |
total_bytes |
ROWID |
id |
country |
service |
uncanonicalized_id |
Note
|
If you look carefully at the schema at the top of this article, and the column listing above, you will notice that the columns are those of all four tables combined and in the order they are referenced. |
We can refine the output by identifying specific columns we wish to display from each row. We can use the DATETIME function to convert the Mac Absolute Time in the date column to local time (by first converting to Unix epoch by adding a few more than 978 million seconds) and interpret the is_from_me column from integer to text using a CASE
statement.
SELECT m.rowid, DATETIME(date +978307200, 'unixepoch', 'localtime') AS date, id AS address, m.service, CASE is_from_me WHEN 0 THEN "Received" WHEN 1 THEN "Sent" ELSE is_from_me END AS type, text, CASE cache_has_attachments WHEN 0 THEN Null WHEN 1 THEN filename END AS attachment FROM message AS m LEFT JOIN message_attachment_join AS maj ON message_id = m.rowid LEFT JOIN attachment AS a ON a.rowid = maj.attachment_id LEFT JOIN handle AS h ON h.rowid = m.handle_id
With this query, we end up with an easy to read output containing interpreted values with the following columns:
ROWID |
date |
address |
service |
type |
text |
attachment |
Tip
|
Why include message table ROWID ? Row id’s a generated automatically for each message added to the database. A break in sequence will show a record has been deleted. Since it is possible to recover deleted records from SQLite databases, it is a convenient way to alert the investigator more analysis is required. Further, in the case of attachments multiple attachments, there will be one row for each attachment in a message. A repeating ROWID indicates two or more attachments are present for the message. |
I hope this discussion of SQLite JOIN
operations as they relate to the Apple iOS chats.db will help you in your examination of SQLite databases.