Wednesday, May 29, 2013

"Hashes? We don't need no stinking hashes!"


This is admittedly a long post. I hope you’ll find it worthy of your time. It demonstrates a methodology for obtaining images from Android devices that are locked USB debugging disabled. A popular device, the Samsung Galaxy S3 is used for the demonstration. This is not a step-by-step guide for all Android devices, but can be used as a framework to be adapted to your situation.

Samsung has recently upped the ante when it comes to Android pin/password locks in Jelly Bean. Past Android releases concatenated the the pin/password with the password salt and calculated both a SHA1 and MD5 hash of string. The hashes were then concatenated and stored in the 72-byte /data/system/password.key file. I talked about cracking the password.key here and here. But now, Samsung has changed the game.

Changing Times

I first encountered the change in March. I was contacted to assist a law enforcement agency with cracking a Samsung S2 pin. I planned to use hashcat to crack the pin, and the agency had those files at the ready. But, to my surprise, the password.key file was only 40-bytes in length. I won’t bore you with the steps I took to try to crack it, but suffice it to say that I didn’t crack it.

I later learned that Samsung had changed the pin/password hashing schema in their Android JellyBean releases. The new schema calls for the pin/password to be concatenated as before, but the SHA1 hash is repeated 1024 times thusly:

  • sha1(lastHash + iteration + pin/password + saltHex)

Lets assume a pin of 1234. Bearing in mind that in computerese, counting starts at zero and there is no lastHash in the first iteration, the process would look something like this:

Table 1. New Samsung hashing schema


Concatentated Elements



0 + 1234 + a1b2c3d4e5f6abbc



SHA1 + 1 + 1234 + a1b2c3d4e5f6abbc



SHA1 + 2 + 1234 + a1b2c3d4e5f6abbc



SHA1 + 1023 + 1234 + a1b2c3d4e5f6abbc


Knowing the schema makes cracking the resulting hash possible, though much slower than the stock android schema. But, this post isn’t about cracking the hash—rumor has it that Samsung is planning to change its hashing method again now that the secret is out--it’s about ignoring it entirely!

Traveling back in Time

Back to March… I mentioned that I didn’t crack the pin that was sent to me, ut, it got me thinking: is there another way to cope with locked Android devices. I knew from experience and study that the Android pattern lock was stored as a sha1 hash in the data/system/gesture.key binary file and the pin/password was stored in the data/system/gesture.key text file. So, I went to work on android Emulators for Android 2.3, 4.0, and 4.1.

And what I discovered was this: Simply renaming (preferred to deleting so the data can still be examined) the gesture.key or password.key file (e.g., "gesture.key.bak") disabled the locks! A reboot removes the hash values from memory and they are not reloaded by the operating system because they no longer exist. In the emulator, I was still presented with the lock screens, but by entering any pattern, or by touching OK/Done on the pin/password screen, I had access to the device. In real world devices, the lock screens were bypassed altogether.

The law enforcement agency I was assisting had root access to the device in question, or they would not have been able to send me the password.key and settings.db files. I contacted them and advised them to rename the password.key and as expected, they bypassed the pin lock.

Got Root?

There’s the rub, though: getting root access. In Android 2.3, this could be done, for the most part, with privilege escalation exploits like RageAgainstTheCage, GingerBreak, and ZergRush, to name a few of the most popular. The exploits, generally speaking, crashed services on the devices and prevented them from descalating privileges on restart. For the exploits to work, USB debugging needed to be enabled so the software could be pushed through the [Android Debug Bridge] (adb) from the desktop to the device. While I receive some devices with USB debugging enabled, the vast majority aren’t in that condition. And, the days of Android 2.3 are waning; I now see more Android 4.0+ devices which have been strengthened against such attacks.

What then can be done about a locked Android device where USB debugging is disabled, or in an Android version that is not vulnerable to a privilege exploit attack? How do we communicate with a device that just plain refuses to talk to us?

Recovery isn’t Just for Addicts

The Android operating system relies on several partitions to segregate and protect data. There are likely exceptions to the rule, but you can expect to find a boot partition, a system partition (the operating system that is the user experience), a data partition (user applications and data), a cache partition, and a recovery partition. There are, in fact, many more partitions in most devices. But we are going to focus on the recovery partition in this discussion.

Recovery is an operating system of its own. It’s small and limited, and most manufacturers limit its functionality. In stock recoveries, the functions are limited to the installation system updates, known as ROMs, and the wiping of the data and cache partitions (e.g. to remove user data before selling or returning the device).

Lesser known is that many stock recoveries have USB debugging enabled, i.e, a desktop system can communicate with the device through the adb. The privileges through the stock recovery are limited, but some information can be gleaned from the system. Let’s look at a Samsung Galaxy S3 that is pin-locked without USB debugging enabled…

Booted into the system ROM (the standard operating system), we expect to be unable to connect to the device through adb.

$ adb devices
List of devices attached


Sure enough, the device is not detected because the adb daemon on the device is not running. After shutting down the device, we boot into recovery by pressing and holding home + up volume + power.


Release the power button after the device vibrates but continue to hold the other hardware buttons until the Android graphic appears.

After the Android graphic appears, the stock recovery menu is displayed. The menu has the limited functionality described above, but we’re not here to look at the menu, but to determine if the adb daemon is running.

$ adb devices
List of devices attached
e7bc4973        recovery


Nice! We now have communication with the device. If we dropped to a shell with the adb shell command, or checked our privileges with adb shell id, we’d see that we have limited privileges as the shell user. But one thing we are able to do, which can be quite handy while looking for ways to defeat the security, is to determine what version of Android is installed.

$ adb shell cat default.prop

the default.prop text file is created on boot and shows many of the device settings. The first thing we see is what we already know: the device is locked (, and the USB debugging is disabled in the system ROM (ro.debuggable=0). But further along in the file we see some interesting tidbits (exerpted):

ADB Feb 14 15:05:00 KST 2013

We now know important details of the device: make, model, and Android version. Researching the device, we find there are custom recovery operating systems by both ClockWorkMod (CWM) and Team Win Recovery Project (TWRP). Both recoveries are rooted, meaning that they boot with administrative privileges in place. And in our case, that’s the only feature in which we are interested.

Samsung devices have a download mode, also known as "Odin" mode, for uploading data to partitions. The proprietary but leaked sofware tool "Odin" is used to accomplish this task. There is an open source tool, Heimdall, that is the functional equivalent of Odin. Odin is Windows-only software, but Heimdall is cross platform. Our goal is to use Odin or Heimdall to install one of the custom recoveries. We don’t want to fully root the device as an end user might, as that makes changes to the system and data partitions.

Powerful Norse Gods

Odin and Heimdall are named for Norse Gods. The software has the power to push software to a Samsung Android device while the device is in Download mode. They become hamstrung when the bootloader is locked, however. Luckily, an unlocked bootloader can be installed first to allow installation of the custom recovery. Though I prefer to use open tools whenever possible and Heimdall detected the device, it did not work with the bootchain archive from the exploit, so it was necessary to use Odin.

The process is straight forward and covered in the tutorial. I recommend installing TWRP because it has netcat and dd installed in the patch. The CWM version in the tutorial does not include netcat which is necessary for imaging.


The main point to remember when using Android forums for tutorials or information is that our goal as a forensic examiners is not the same as those on the forums:

  • Users want to achieve permanent root privileges and install custom operating systems

  • we want to obtain unaltered or minimally altered data.

Read the information with that in mind and make sure you understand what changes are taking place before you proceed.

Heeding the warning above, we can condense the steps in the tutorial to just four:

# Download and install the Offical Samsung Drivers # Download ODIN v3.07 # Download and flash the VRALEC Bootchain # Download and install TWRP (now at version

It is important in the TWRP installation that you boot immediately into recovery (home + volume up + power) or the boot process will detect the custom recovery and replace it with stock. This is the reason the tutorial mentions unchecking "Auto-reboot" in the Odin options. It you fail to boot into recovery and instead boot the stock operating system, you’ll have to repeat the third fourth step above.

Once you’ve successfully booted into the custom recovery, you are ready to image the device memory.


When I first broke into Android forensics, I was using Micro-SD cards to capture device images. Luckily, I have discovered another means for imaging, and a good thing, too: many devices don’t have SD Card slots (Kindle Fire, for example). The process involves using adb to do port forwarding and then using netcat and dd to pipe the raw NAND flash data to the connected desktop computer.

ADB port forwarding to pipe data to the desktop
$ adb forward tcp:5555 tcp:5555
$ adb shell
~ # nc -l -p 5555 -e dd if=/dev/block/mmcblk0

Above, we use the adb forward <local> <remote> command to forward Android tcp port 5555 to the desktop tcp port 5555. Then we enter the Android shell with the adb shell command. Immediately, you should notice the shell prompt which indicates we are the root user.

The netcat command, nc, is used to open connections between devices. Here, we set netcat to listen (-l) for and incoming connection on port (-p) 5555. When the connection is made, we direct it to execute (-e) a dd command that uses the device’s whole memory block as its input. At this point, nothing further happens until the connection is made. To make the connection, we open another terminal and use netcat on our desktop system:

Receiving the data
$ nc 5555 | pv > image.dd
14.7GB 0:45:32 [ 5.5MB/s] [                               <=>                ]

In the new window (remember, our first session is waiting for a connection), we simply make the connection by telling netcat the destination ( and port (5555) with which to connect. We optionally pipe the data through the pv command, which measures the data passing through the connection and redirect the output to a file called image.dd. Without the pv command, in the event you don’t have it installed or don’t wish to measure the output, would be: nc 5555 > image.dd.

As you can see from the pv output, we imaged the 16GB Galaxy S3 in a little more than 45 minutes. The JTAG process, which is your only other recourse for imaging a locked S3, averages 27-32 hours, not to mention the need for disassembly, soldering, reassembly and specialized hardware. Does this make the custom recovery technique better than JTAG? Not at all. As always, the needs of the case should dictate your approach.

But what kind of data did we actually net? Let’s take a quick look:

Image partition listing
$ partx -o nr,start,end,size,name image_129349.dd
 1     8192   131071   60M modem
 2   131072   131327  128K sbl1
 3   131328   131839  256K sbl2
 4   131840   132863  512K sbl3
 5   132864   136959    2M aboot
 6   136960   137983  512K rpm
 7   137984   158463   10M boot
 8   158464   159487  512K tz
 9   159488   160511  512K pad
10   160512   180991   10M param
11   180992   208895 13.6M efs
12   208896   215039    3M modemst1
13   215040   221183    3M modemst2
14   221184  3293183  1.5G system
15  3293184 28958719 12.2G userdata
16 28958720 28975103    8M persist
17 28975104 30695423  840M cache
18 30695424 30715903   10M recovery
19 30715904 30736383   10M fota
20 30736384 30748671    6M backup
21 30748672 30754815    3M fsg
22 30754816 30754831    8K ssd
23 30754832 30765071    5M grow

Looks like we have the whole enchilada (the purest reading this will think, "no, that only proves you have the partition table" and they’d be right!). On further examination, we’d find that we can mount the partitions and examine them without difficulty. As an aside, I have found partx to be the ideal tool for reading raw Android images. Though I use The Sleuthkit for disk forensics, its not well suited for Android images, basically because Android uses non-standards compliant file name schema.

Traveling Forward in Time

Android 4.2.2 might be the death knell to the custom recovery approach.

Note: When you connect a device running Android 4.2.2 or higher to your computer, the system shows a dialog asking whether to accept an RSA key that allows debugging through this computer. This security mechanism protects user devices because it ensures that USB debugging and other adb commands cannot be executed unless you’re able to unlock the device and acknowledge the dialog. This requires that you have adb version 1.0.31 (available with SDK Platform-tools r16.0.1 and higher) in order to debug on a device running Android 4.2.2 or higher.

— "Android Developers"

But, then again, maybe not. The Quote above states that adb will not work until it is authorized by the User through the device interface, i.e., the standard operating system. What about recovery? It remains to be seen.

Sunday, May 19, 2013

iOS6 Photo Streams: "Recover" Deleted Camera Roll Photos


Recovery of deleted files in iDevices since iOS 4 has been impractical to impossible because of the use of an encrypted file system. This does not mean that deleted data cannot be retrieved, however. Deleted records can be recovered from allocated SQLite databases because of the of the construct of the database file: dropped records are not always overwritten. This article will demonstrate how the interaction between the iOS Photo Streams service and the Camera Roll application can help examiners identify and "recover" deleted photographs.

The dawning of Apple iCloud in 2011, a new service was born: the iCloud Photo Stream. Photo Stream syncs photos taken with an iDevice with other devices registered by the user. The user must have an iCloud account and enable Photo Stream through the Settings | iCloud | Photo Stream menu for the service to work.

Photo Stream comes in two flavors, if you will: the basic Photo Stream and Shared Photo Stream. Basic Photo Stream syncs photos only between the users devices, but Shared Photo Stream alows the user to share photos with other people through a public website or directly on their Photo Stream enabled devices.

Photo Stream Requirements

Shared Photo Stream

  • iDevice with iOS 6.0+

  • Mac with OS X v10.8.2+ and iPhoto 9.4+ or Aperture 3.4+

  • PC with Windows Vista (SP2)/7/8 and the iCloud Control Panel 2.1+

  • Apple TV (Gen 2) with software update 5.1+

Basic Photo Stream

  • iDevice with iOS 5.1+

  • Mac with OS X v10.7.5+ and iPhoto 9.2.2+ or Aperture 3.2.3+

  • PC with Windows Vista (SP2)/7 and iCloud Control Panel v2.0+

  • Apple TV (Gen 2) with Software Update 5.0+

You might be getting ahead of me already and thinking, "Excellent! If a user takes a photograph on his iPhone and it gets synced with with other Photo Stream enabled devices, I can find deleted photos from the iPhone on the other devices!" To that I say, "whoa there big fella," and I pull firmly back on your reins. It’s not quite that simple, and while that’s a possibility if the photograph is moved out of Photo Stream storage and into a long term storage location, there are a few things you need to know…

Pictures in the Sky

When a user takes a photograph using a Photo Stream enabled iDevice, like an iPhone, it is sent to iCloud storage and held there for 30 days to give all the users registered devices a chance to sync. The data is only transfered from the iPhone when connected to WiFi, it doesn’t transmit photographs over cellular networks. iDevices hold the last 1000 photos (JPG, TIFF, PNG, and RAW) in the stream, so a user must move photographs to a long term storage location like the Camera Roll if they want to keep them longer. Macs and PCs don’t have the 1000 photo limitation because of their larger storage capacities. Videos are not synced through Photo Stream.

If another Photo Stream enabled device in the user’s network takes a photograph, or a photograph is loaded into Photo Stream on a Mac or PC from another source like a digital camera, it is synced with the iPhone, as you’ve probably guessed. If an image is deleted from Photo Stream on any one of the networked devices, it is deleted from all the devices on the network. As our friend Mork for Ork used to say, "Shazbot!" It’s not going to be that easy. And besides, how often do we get our hands on all the devices when performing forensics, anyway?

So, Why the Post?

While performing an examination on an iPhone 4s with iOS 6.1.3 recently, I noted lots of duplicate photographs with different filenames. The filename differences were not from a user renaming them (if that’s even possible in a non-jail broken device), but images with the same apparent image content and the same created time existed in two distinct paths, had distinct names (like "IMG_1027.JPG" and "IMG_1098.JPG") and existed in different domains. Intrigued, I investigated.

Through study and a little experimentation with a colleagues iPhone, I discovered that when a user takes a photograph, it is simultaneously stored in the Camera Roll Domain in /Media/DCIM/ subdirectories and the Media Domain in the /Media/PhotoStreamsData/ subdirectories. The filename is incremental: if the last photograph was "IMG_1027.JPG", the next is "IMG_1028.JPG". This is true even if "IMG_1027.JPG" is deleted before the next image is taken—filenames are not reused.

Likewise, the photograph that is written to the Photo Stream is given an incremental filename. But, it is not based on the Camera Roll Domain filenames, but instead on the Photo Stream image filenames stored in the Media Domain. Recall that images in Photo Stream can come from any networked device and it will make sense that they need to have unique filenames created on the local device to avoid overwriting. With that knowledge, you begin to understand why two files with the same image content exist in different locations and have different filenames.

But, there’s more. I mentioned that videos were not synced in Photo Stream. They are, however, stored in the Camera Roll domain of the iDevice in which they were created, and they are given an incremental filename that mixes them into the photographs though they have a different extension. I’ve printed a sample listing to illustrate.

Partial listing of files in the Camera Roll Domain

Thus, video are included in the Camera Roll Domain incrementing filename schema, but since they are not written to the Media Domain, the filenames there unaffected, i.e, not incremented.

Another factor to consider is that the Camera Roll domain can be used to save videos and images not made with the device being examined. The user has the option to save image and videos from Internet browsers, applications like Facebook, and MMS messages by pressing and holding them. Whatever the source of the files, they are saved to Camera Roll with sequential filenames.

Camera Roll saves more that photos and videos taken with device

Again, the JPEG, TIFF, PNG, and RAW images are automatically added to the Photo Stream when the service is enabled. If only there were some data source to help us figure out how to match the Camera Roll image filenames to those in the Photo Streams directories. Cryptographic hashing (e.g., MD5) won’t help because the EXIF data is different in .jpg images, and besides, the point of this article was to "recover" deleted images, which means you are missing one half of the comparison! Once again, enter SQLite…

Recovering images deleted from the Camera Roll

The /CameraRollDomain/Media/PhotoData/Photos.sqlite database contains data about images in the Camera Roll Domain and the Media Domain. Incidentally, it also tracks the videos in the Camera Roll Domain. The ZGENERICASSET asset table is the table of interest and contains file dates, paths, and names.

ZGENERICASSET table schema
$ sqlite3 -header CameraRollDomain/Media/PhotoData/Photos.sqlite \
".schema zgenericasset"

A lot of the information in the table is inconsequential to our analysis. Some is of passing interest, and might be relevant to another type of examination. But lets hone in on our quarry: file dates, names, and paths. To make SQLite work for you, you must learn its Structured Query Language.

First off, lets get an overview of the files that this table tracks. It would be nice to see just the unique paths in the ZDIRECTORY field. Fortunately, SQLite gives us the Distinct function for just this purpose.

SQLite DISTINCT() function
$ sqlite3 -header CameraRollDomain/Media/PhotoData/Photos.sqlite \
'SELECT DISTINCT(zdirectory) FROM zgenericasset ORDER BY zdirectory ASC'


So, we see from the output that the DCIM and PhotoData directories, which are in the Camera Roll Domain, and the PhotoStreamData directory, which is located in the Media Domain, are the source root directories for the data tracked by this table. How would I know that if I wasn’t familiar with the iPhone file structure? Hunt and peck? No!

BASH find command with grep
$ find -type d | grep -E '(DCIM|Photo(Streams)?Data)$'

Here, we tell the find command to return only directories with the -type d option and we filter the results with a regular expression. The regular expression looks for lines ending with DCIM, PhotoData, or PhotoStreamsData. That saves alot of hunting and pecking!

Now, we’ll get export the files from the database sorted by creation date. From this, maybe we can draw some conclusions, or at least have a list of files on which to focus for further analysis.

SQLite Query (truncated output)
$ sqlite3 -header -list CameraRollDomain/Media/PhotoData/Photos.sqlite "SELECT \
z_pk, DATETIME(zdatecreated + 978307200, 'unixepoch', 'localtime') AS \
zdatecreated, zdirectory, zfilename FROM zgenericasset
9598|2013-04-27 20:53:44|DCIM/105APPLE|IMG_5303.JPG
9628|2013-04-27 20:53:44|PhotoStreamsData/97527241/104APPLE|IMG_4245.JPG
9599|2013-04-27 20:53:49|DCIM/105APPLE|IMG_5304.JPG
9629|2013-04-27 20:53:49|PhotoStreamsData/97527241/104APPLE|IMG_4246.JPG
9600|2013-04-27 21:16:41|DCIM/105APPLE|IMG_5305.JPG
9630|2013-04-27 21:16:41|PhotoStreamsData/97527241/104APPLE|IMG_4247.JPG
9601|2013-04-27 21:16:49|DCIM/105APPLE|IMG_5306.JPG
9631|2013-04-27 21:16:49|PhotoStreamsData/97527241/104APPLE|IMG_4248.JPG

Time stamps in the database were in Mac Absolute Time and had to be converted to unixepoch by adding 978307200 seconds for the SQLite DATETIME() function to perform the conversion to local time.

Here we see a pattern developing. Look at the lines in pairs: the created dates match in each pair. The DCIM filenames increment by one as do the PhotoStreamsData filenames. This leads us to believe that image in the DCIM directory is related to the image in the PhotoStreamsData directory. If we viewed the images, we would in fact see that the image content appears to be the same. However, some image optimizations occur in the creation of the Photo Stream image and we will not match by MD5, and not just because the embedded metadata is slightly different (original filename, for example, but the image data is different.

iCloud: Photo Stream FAQ

On your Mac or PC, your photos are downloaded and stored in full resolution. On your iPhone, iPad, iPod touch, and Apple TV, your Photo Stream photos are delivered in a device-optimized resolution that speeds downloads and saves storage space. While actual dimensions will vary, an optimized version of a photo taken by a standard point-and-shoot camera will have a 2048 x 1536 pixel resolution when pushed to your devices. Panoramic photos can be up to 5400 pixels wide.
— Apple Inc.

But does this make a claim that they are the same image invalid. Not at all! The similarities are overwhelming: embedded modified and created time stamps are identical (in the database and in the file system), gps data is identical, device information is identical, and image content appears identical, and so on. In the purest sense, they are not the same any more than a film negative is the same as a print from the negative, yet we would argue they are the same because one is a reproduction of the other. Such is the argument here.

Finally, lets extend this analysis to deleted files. What if we were presented with the following file list?

SQLite Query Results
9623|2013-04-27 21:56:27|DCIM/105APPLE|IMG_5328.JPG
9648|2013-04-27 21:56:27|PhotoStreamsData/97527241/104APPLE|IMG_4265.JPG
9624|2013-04-27 21:56:48|DCIM/105APPLE|IMG_5329.JPG
9649|2013-04-27 21:56:48|PhotoStreamsData/97527241/104APPLE|IMG_4266.JPG
9625|2013-04-27 21:56:59|DCIM/105APPLE|IMG_5330.JPG
9650|2013-04-27 21:56:59|PhotoStreamsData/97527241/104APPLE|IMG_4267.JPG
9656|2013-04-29 01:34:10|PhotoStreamsData/97527241/104APPLE|IMG_4268.JPG
9657|2013-04-29 01:34:23|PhotoStreamsData/97527241/104APPLE|IMG_4269.JPG
9658|2013-04-29 01:38:19|PhotoStreamsData/97527241/104APPLE|IMG_4270.JPG
9654|2013-04-29 01:38:28|DCIM/105APPLE|IMG_5334.JPG
9659|2013-04-29 01:38:28|PhotoStreamsData/97527241/104APPLE|IMG_4271.JPG

We see the expected DCIM/Photostreams created date and filename pattern through the first three pairs. Then we have three Photo Stream images, IMG_4268.JPG, IMG_4269.JPG, IMG_4270.JPG, that are missing their DCIM counter parts. The next DCIM image is IMG_5334.JPG, which is three images removed from the previous DCIM image, IMG_5330.JPG. This is probably better understood in a table:

Created Date DCIM (Camera Roll) PhotoStreamsData

2013-04-27 21:56:27



2013-04-27 21:56:48



2013-04-27 21:56:59



2013-04-29 01:34:10


2013-04-29 01:34:23


2013-04-29 01:38:28


2013-04-29 01:38:28



Is it reasonable to conclude that IMG_4268.JPG, IMG_4269.JPG, and IMG_4270.JPG are facsimiles of the missing IMG_5331.JPG, IMG_5332.JPG, and IMG_5333.JPG?

Created Date DCIM (Camera Roll) PhotoStreamsData

2013-04-27 21:56:27



2013-04-27 21:56:48



2013-04-27 21:56:59



2013-04-29 01:34:10



2013-04-29 01:34:23



2013-04-29 01:38:28



2013-04-29 01:38:28



I believe that it is. We know that:

  • The Photo Stream service creates a file with the same content at the same time it is created in the Camera Roll

  • Filenames are sequentially created relative to the directory contents

  • Video files in the Camera Roll Domain are not created in the Photo Stream Domain

  • Facsimile images are not cryptographically identical, but the image content is arguably the same.

With this information and the observable file pattern: the unbroken filename sequence of Photo Stream images and an equivalent number of missing Camera Roll images as unpaired Photo Stream images, the conclusion is reasonable.


This analysis is not complete. Recovery of deleted SQLite records can be used to confirm or refute this analysis, and I’ll begin working on that next. Also, there are circumstances in which the interpretation is ambiguous (absent recovered SQLite records). Take the following as an example:

SQLite Query Results
9616|2013-04-27 21:51:57|DCIM/105APPLE|IMG_5321.JPG
9643|2013-04-27 21:51:57|PhotoStreamsData/97527241/104APPLE|IMG_4260.JPG
9617|2013-04-27 21:52:01|DCIM/105APPLE|IMG_5322.JPG
9644|2013-04-27 21:52:01|PhotoStreamsData/97527241/104APPLE|IMG_4261.JPG
9618|2013-04-27 21:53:10|DCIM/105APPLE|IMG_5323.MOV
9619|2013-04-27 21:55:22|DCIM/105APPLE|IMG_5324.JPG
9645|2013-04-27 21:55:22|PhotoStreamsData/97527241/104APPLE|IMG_4262.JPG
9646|2013-04-27 21:56:14|PhotoStreamsData/97527241/104APPLE|IMG_4263.JPG
9622|2013-04-27 21:56:19|DCIM/105APPLE|IMG_5327.JPG
9647|2013-04-27 21:56:19|PhotoStreamsData/97527241/104APPLE|IMG_4264.JPG

Rendered into a table

Created Date DCIM (Camera Roll) PhotoStreamsData

2013-04-27 21:51:57



2013-04-27 21:52:01



2013-04-27 21:53:10


2013-04-27 21:55:22



2013-04-27 21:56:14


2013-04-27 21:56:19



Above, we see that IMG_5323.MOV, a video, does not have a PhotoStreamsData pairing, and that is expected. The next Camera Roll image is paired with the next Photo Streams image in sequence… again, expected behavior. But there is a gap of two filenames in the Camera Roll and only one pairing file in the PhotoStreams Data. The implication is that one of the missing files is a video, but which one? Does IMG_5325.JPG or IMG_5326.JPG pair with IMG_4263?


Neither IMG_5325 nor IMG_5326 were in the database, even as deleted records, so the data was overwritten or vacuumed.

Thus, you can see, this will not solve all deleted file mysteries, but you might very well be able to "recover" a file deleted from the Camera Roll by sifting the Photos.sqlite database and finding the Photo Streams equivalent!

Saturday, May 11, 2013

SQLite: Hidden Data in Plain Sight


The title of this post is not intended to imply there are some new clever data hiding techniques for sqlite, but to alert the reader that data in plain view is often going unanalyzed or under-analyzed.

My experience in data forensics has taught me two things about SQLite databases: they are both ubiquitous and poorly understood by examiners. I find that most examiners rely on "viewers," such as SQLite Database Browser or automated tools that parse the SQLite for them. Both of these approaches can be very flawed, however.

The first practice—using viewers—is not faulty on its face. SQLite Database Browser and similar applications are a quick way to visualize the content of a database. But most examiners are using the viewers to open tables in a "flat" view. This is akin to looking at the table data in a spreadsheet: you see the content but not in relation to the data from other tables in the database (or from other databases for that matter) as the database was designed to be used. Further, the data is often not meaningful in the form in which it is stored, i.e., timestamps are often in some form of epoch time and not a human readable format.

The second practice of relying on automated tools means you are counting on a programmer (who may not be a forensic examiner) to tell you what’s important in the database, and you are relying on the coder’s SQLite skills, which might be lacking. (What follows is not a rant against a product, but just an illustration of my point) For example Cellebrite Physical Analyzer parses the iPhone sms.db (sms/iMessage) into a neat, well formed report. But it doesn’t include the rowID in the output, which is an auto-incrementing integer applied to each sent or received message. From the rowID, you can tell if intervening messages have been deleted, but you won’t know that from the automated report from Cellebrite. And what about when your automated tool doesn’t parse a database you have discovered?

Demanding Answers (Learning to Query)

The key to understanding SQLite is to learn the structure query language (SQL). One of the best on-line resources I have found for this is The lessons are brief, and in less that a dozen 5-10 minute sessions, you will have the basics of SQL under your belt. And with the basics, you can accomplish much.

The query language is designed to be human readable, which, by the by, makes it easier to remember. It consists of sentences composed of subjects and predicates. For example, to view all the contents of a table:

SQLite Command Line Program (command line mode)
$ sqlite3 some.db 'select * from some_table;'

Using the above SQL statement, our subject is "select *" which is translated "select all fields" (or columns, if you will), and the predicate, "from some_table" is fairly self-explanatory. The table "some_table" is located in the "some.db" SQLite database. To be brief, in a SQL query, we tell the database engine what we want (subject) followed by qualifiers (predicate).

Finding SQLite Databases for Analysis

So, how do we overcome the shortcomings of the common SQLite Database analysis techniques? Let’s take a recent examination I conducted as an example. I was looking for communications recorded in an iPhone iOS v.6.1.3 backup. I had restored the backup to its original file structure (DOMAIN/path) to facilitate the analysis. After examining the sms.db, I discovered a large block of deleted messages (by RowID analysis) for the time frame that was the subject of the investigation. The messages were not recoverable (more on recovering deleted SQLite records another time).

I decided to take a look at other communications applications that might have been overlooked when the user was deleting data. Unfortunately, I don’t know all the different communications applications available for Apple mobile devices and likely never will. I can get a list of the installed applications on the Device from the Info.plist in the backup directory, but in reality, that doesn’t really help me too much because, as I said, I don’t recognize many of them. I do know, however, that most communications applications store their data in SQLite databases. So, I search for those:

$ find unback/ -type f -exec file {} \; | grep SQLite
unback/HomeDomain/Library/Voicemail/voicemail.db: SQLite 3.x database
unback/HomeDomain/Library/SMS/sms.db: SQLite 3.x database
unback/HomeDomain/Library/Safari/Bookmarks.db: SQLite 3.x database, user version
unback/AppDomain-com.cardify.tinder/Documents/Tinder.sqlite: SQLite 3.x database

I restored the iTunes backup to a directory called unback consistent with the method used by the open libimobiledevice software library, which offers a device backup/unback utility as well as others useful for iDevice analysis.

You can see that I use the find command to look for files (-type f) in the "unback" directory and the execute the file command to determine the file type. I piped the results through the grep command, filtering for "SQLite". With the command, I get a list of SQLite files, true, but I still don’t know what there all are. Some are familiar and/or obvious, but some are not. Take the last item on the abbreviated list above: Tinder by Cardify. I had never heard of Tinder (as well as many other applications that appeared in the results).

Getting a Peek Inside

It would be more informative to list the databases, and then get a look at the tables contained in each one. While table names don’t necessarily tell you the content, they can be informative while in data "triage" mode. So, how do we modify our find command to show us the databases as well as their tables?

BASH (incorporating SQLite Command Line Program)
$ find unback/ -type f | while read i; do file $i | grep -q SQLite; \
[ $? = 0 ] && (echo $i; sqlite3 $i .tables; echo); done
_SqliteDatabaseProperties  voicemail

bookmark_title_words  folder_ancestors      sync_properties
bookmarks             generations

_SqliteDatabaseProperties  chat_message_join
attachment                 handle
chat                       message
chat_handle_join           message_attachment_join

ZLIKE             ZPHOTO            ZUSER             Z_METADATA

Ok, that’s much more helpful. But how does the command work? Like the initial command, find is used to locate files (directories are excluded). The results of the file command are piped to a while loop, which assigns each file name to the variable i. Similar to the first command, the file command displays the file type which is filtered for "SQLite" by grep. The "-q" option in grep is used to keep grep silent; it is the exit status that is of interest.

I want the exit status, or alternatively: the return status or exit code, to perform a test. All commands, scripts and functions return an exit status, and an exit code of "0" means success. The exit status is captured in the variable ?, and recalled, like all BASH variables, by prepending with a dollar sign: $?. In the command, I test the exit status of the last command which was grep. If the regular expression "SQLite" is matched in the file command output, the grep exits with "0". The test "[ $? = 0 ]" is shorthand notation for "if the last command’s exit code is 0", then do what follows: (echo $i; sqlite3 $i .tables; echo), i.e., print the file name, print the tables of the database, and then print a blank line (for readability). I discussed while loops in a previous post if you have more interest, or you could look here.

Making Sense of the Data

From the output thus far, I see that Tinder has a message table, as well as user and shared friends tables. Looks like it is a social networking application, and the data might be relevant to the investigation. So, how do I get see the contents of the tables? I could look at one table at a time:

SQLite Command Line Program
$ sqlite3 -header Tinder.sqlite 'select * from zmessage limit 5;'
3|2|1|1|832|379798729.794|What's doin?
4|2|1|1|1318|379804817.728|Are you online often?
5|2|1|0|1318|379806963.685|Wouldn't you like to know?!

In the command above, I used the SQLite "-header" option to display the column titles, and I limited the output to five records to get a sense of the data. Dropping the limit syntax would result in the entire table and all record fields being printed to standard output (the screen).

Is there a way we could quickly look at a few records of every table to see what is of interest, if anything? You bet!

SQLite Command Line Program
$ for i in $(sqlite3 Tinder.sqlite .tables); do echo Table: $i; \
sqlite3 Tinder.sqlite -header "select * from $i limit 5;"; echo; done
Table: ZUSER
47|379794187.955|I like pie.|604832678|50f2fc2fbe8d00b3d4f58c36|Gunter|

3|2|1|1|832|379798729.794|What's doin?
4|2|1|1|1318|379804817.728|Are you online often?
5|2|1|0|1318|379806963.685|Wouldn't you like to know?!


Briefly, the primary difference in the last command from those executed earlier is the use of a for loop. The for loop takes the ouput of the SQLite .tables command,

The output above demonstrates the relational nature of SQLite databases. Looking at the ZMESSAGE table, we see the message content, but the user is an integer (ZUSER field). The integer appears correlate to the ZUSER table (Z_PK field). Just looking at the ZMESSAGE table, we see the conversation but we don’t know with whom it occurred.

SQLite lets us query the tables in relation to make more meaningful output.

SQLite Command Line Program
$ sqlite3 Tinder.sqlite 'select m.z_pk, zinbound, zuser, zname, \
zcreationdate, zbody from zmessage as m, zuser as u where \
m.zuser = u.z_pk limit 5;'
3|1|832|Tobias|379798729.794|What's doin?
4|1|1318|Theoduff|379804817.728|Are you online often?
5|0|1318|Theoduff|379806963.685|Wouldn't you like to know?!

In this command, I specified the fields I wanted returned as opposed to all fields. This is necessary whenever you are relating tables to one another. You may have noticed that in the predicate I queried both the ZMESSAGE and ZUSER tables. The "as" statements create aliases to the tables (zmessage = m, zuser = u) to keep the command more concise. In the select statement, I asked for the ZNAME field, which is located in the ZUSER table where ZUSER from the ZMESSAGE table matched Z_PK from the ZUSER table. In the select statement, I had to specify the Z_PK (m.z_pk) field from the ZMESSAGE table because both tables contain that field.

Two more fields don’t have much meaning in our result: ZINBOUND and ZCREATIONDATE. ZINBOUND is a flag, that with some context, lead me to understand that 0 = sent and 1 = received. ZCREATIONDATE appears from its value to be Mac Absolute Time and file system timestamps support this evaluation. The case expression can be used to interpret the flags. It is the equivalent to an if/then statement in scripting languages. The datetime function converts the unix epoch to a human-readable date. Because the values in the Tinder database are Mac Absolute Time, the timestamps have to first be converted to unix epoch by adding 978307200 seconds.

SQLite Command Line Program
$ sqlite3 -header Tinder.sqlite 'select m.z_pk, case zinbound when 0 \
then "sent" when 1 then "received" else "unknown" end as zinbound, \
zuser, zname, datetime(zcreationdate + 978307200, "unixepoch", \
"localtime") as zcreationdate, zbody from zmessage as m, zuser as u \
where m.zuser = u.z_pk limit 5;'
1|received|832|Tobias|2013-01-13 11:27:16|Hi!
2|received|1156|Siegfried|2013-01-13 11:16:24|hey!
3|received|832|Tobias|2013-01-13 11:38:49|What's doin?
4|received|1318|Theoduff|2013-01-13 13:20:17|Are you online often?
5|sent|1318|Theoduff|2013-01-13 13:56:03|Wouldn't you like to know?!

Now we have meaningful data by relating two tables, interpreting flags (case expression), and converting timestamps (datetime function). You can find more information about the case expression here, and the datetime function here.

Summing Up

I covered a lot of ground in this post, from using find, file, grep, and while and for loops to basic and intermediate SQLite Command Line Program usage. I left a lot of explanation out of the discussion, and I barely scratched the surface of SQLite analysis. My goal was to: - highlight the fact that automated tool and viewer users are likely leaving a lot of data on the table (pardon the pun) - show how command line tools can be used to rapidly locate and evalute SQLite databases - demonstrate that learning SQLite queries will go a long way to filling the gap left wide open by automated tools - encourage you to learn more about SQLite and improve your investigative skills.

I hope to start delving into more specific SQLite analysis topic in future posts.

Happy Querying!

Time Perspective

Telling time in forensic computing can be complicated. User interfaces hide the complexity, usually displaying time stamps in a human reada...