Wednesday, September 18, 2013

iPhone: Recovering from Recovery

I was attempting to brute force an iPhone 4 passcode for data recovery. The phone was in poor condition and had undergone modifications: the home button had been replaced as well as the back cover, maybe more. I could not reliably get the phone into recovery mode, possibly the result of a faulty home button, so i used libimobiledevice’s ideviceenterrecovery command.

It worked wonderfully. Maybe too wonderfully. I eventually achieved DFU mode (the home button was probably the culprit in making this normally simple process quite difficult), executed my exploit, and obtained the passcode and data. My goal was to unlock the phone and pass it off to another investigator. But, when I rebooted the phone after DFU mode I found it was in recovery again!

I tried a variety of things, from trying to rest with FirmwareUmbrella (formerly TinyUmbrella) to disassembling the phone and disconnecting the battery, but nothing worked. Then a colleague (thanks, Perry) suggested iRecovery.

What is libirecovery?

libirecovery is a cross-platform library which implements communication to iBoot/iBSS found on Apple’s iOS devices via USB. A command-line utility is also provided.
— libirecovery

libirecovery can be compiled in Linux. I found I had to install the libreadline-dev package in my Ubuntu install, but you may find you have to do more depending on the packages you already have installed. Building requires you to execute the followed by make and then make install. I had to also run ldconfig to register the library since this was not done automatically.

The command line utility is the irecovery tool. It is used as follows:

iRecovery - iDevice Recovery Utility
Usage: irecovery [args]
        -i <ecid>       Target specific device by its hexadecimal ECID
        -v              Start irecovery in verbose mode.
        -c <cmd>        Send command to client.
        -f <file>       Send file to client.
        -k [payload]    Send usb exploit to client.
        -h              Show this help.
        -r              Reset client.
        -s              Start interactive shell.
        -e <script>     Executes recovery shell script.

On first blush, it might seem that the solution to my problem was the command irecovery -r to reset the device. But that is not so. Instead, I needed to enter the shell, change and environment variable, and reboot.

iRecovery Shell
$ sudo irecovery -s
> setenv auto-boot true
> saveenv
> reboot
Running the command as root was required or the program failed with a segmentation fault.

The device rebooted into the normal operating system and I was able to unlock it with the passcode I had recovered. If you find yourself in a recovery loop, I hope this post will help you, uh, recover from it!

Friday, September 13, 2013

Recovering Data from Deleted SQLite Records: Redux

Rising from the Ashes

I’ve received many, many inquiries about recovering deleted records from SQLite databases ever since I posted an article about my first attempt to recover deleted data. Well, the hypothesis of checking the difference between the original database and a vacuumed copy seemed sound at the time and did in fact yield dropped record data, but it also included data from allocated records. The main thing I learned was that I had much to learn about the SQLite database file format.

Since that time, I’ve run into more and more SQLite databases, and the issue of recovering dropped records has become paramount. I have learned how to do so, and I’ll share some of the secrets with you now. But first, you need to know a little about SQLite databases…

This article is not a treatise on the SQLite file format. The best resource for that is located at I hope to put the salient points here so you can understand the complexity of the task of recovering dropped records from SQLite databases.

SQLite Main Database Header

The first 100 bytes of a SQLite database define and describe the database. The key value to record recovery is the page size, a 16-bit (two-bytes) big-endian integer at byte offset 16. SQLite databases are divided into pages, usually matching the underlying file system block size. Each page has a single use, and those containing the records that are the interest of forensic examiners is the table b-tree leaf page, which I’ll refer to as the TLP. The TLP is distinguished from other page types by its first byte, \x0d or integer 13.

Thus, we can find the TLPs with the knowledge of the database page size we obtain from the database header and check the first byte of each page for \x0d. In python, that might look like:

Python 3: Finding table b-tree leaf pages
from struct import unpack

with open('some.db', 'rb') as f:
    data =

pageSize = unpack('>h', data[16:18])[0]

pageList = []

for offset in range(0, len(data), pageSize):
    if data[offset] == 13;
The code above prints the offset of TLPs. Make sure you are using Python 3 if you want to try this for yourself.

Table B-Tree Leaf Pages

The TLPs hold the records, and consequently, the dropped (deleted) records data when they occur. Each page has an 8-byte header, broken down as follows:

Table 1. Table b-tree leaf page header
Offset Size Value



Page byte \x0d (int 13)



Byte offset to first freeblock



Number of cells



Offset to first cell



Number of freebytes

The header introduces some terms that need explaining. A freeblock is unallocated space in the page below one or more allocated records. It is created by the dropping of a record from the table. It has a four-byte header: the first two bytes are a 16-bit big-endian integer pointing to the next freeblock (zero means its the last freeblock), and the second two bytes are a 16-bit big-endian integer representing the size of the freeblock, including the header.

Cells are the structures that hold the records. The are made up of a payload length, key, and payload. The length and key, also known as the rowid, are variable length integers. What are those? I’m glad you asked:

Variable-length Integers

A variable-length integer or "varint" is a static Huffman encoding of 64-bit twos-complement integers that uses less space for small positive values. A varint is between 1 and 9 bytes in length. The varint consists of either zero or more byte which have the high-order bit set followed by a single byte with the high-order bit clear, or nine bytes, whichever is shorter. The lower seven bits of each of the first eight bytes and all 8 bits of the ninth byte are used to reconstruct the 64-bit twos-complement integer. Varints are big-endian: bits taken from the earlier byte of the varint are the more significant and bits taken from the later bytes.

I won’t go into varints any further in this post, because I will not be discussing cell decoding in this post. Suffice it to say that with the payload length, we can define the payload, which itself is made up of a header and columns. The header is a list of varints, the first describing the header length, and the remainder decribing the column data and types. The page header contains number of cells and the offset to the first cell on the page.

The last value in the header, freebytes, describes the number of fragmented bytes on the page. Fragmented bytes are byte groupings of three or less that cannot be reallocated to a new cell (which takes a minimum of four bytes).

Immediately following the page header is a cell pointer array. It is made up of 16-bit big endian integers equal in length to the number of cells on the page. Thus, if there are 10 cells on the page, the array is 20 bytes long (10 2-btye groupings).

Page Unallocated Space

There are three types of unallocated space in a TLP. Freeblocks and freebytes we’ve discussed, and the third is the space between the end of the cell array and the first cell on the page referred to in the SQLite documentation as simply "unallocated". Freeblocks and unallocated can contain recoverable record data, while freebytes are too small for interpretation. Thus, knowing the first freeblock (defined in the page header), the length of the cell array (interpreted from the number of cells defined in the page header) and the offset to the first cell (yep, you guessed it, defined in the page header), we can recover all the unallocated space in the page for analysis.

Python 3: Finding table b-tree leaf page unallocated space
for offset in pageList:
    page = data[offset: offset + pageSize]

    pageHeader = unpack('>bhhhb', page[:8])
    pageByte, fbOffset, cellQty, cellOffset, freebytes = pageHeader

    # get unallocated
    start = 8 + cellQty * 2
    end = cellOffset-start
    unalloc = page[start:end]
    print(offset, unalloc, sep=',')

    # get freeblocks, if any
    if fbOffset > 0:
        while fbOffset != 0:
            start, size = unpack('>hh', page[fbOffset: fbOffset + 4])
            freeblock = page[fbOffset: fbOffset + size]
            print(offset, freeblock, sep = ',')
            fbOffset = start

With the lines from the two code boxes, we have coaxed the unallocated data from the "some.db" SQLite database. We have printed the offset of each unallocated block and the contents (in python bytes format) to stdout. With just a little manipulation, we can turn this into a script a reuseable program, and the content can be grepped for strings. At bare minimum, we now have a way to determine if there is deleted content in the database related to our investigation, e.g., we could grep the output of the Android mmssms.db for a phone number to see if there are deleted records. Searching against the whole database would not be valuable because we cannot separate the allocated from the unallocated content!

Now, this obviously does not reconstruct the records for us, but recovering the unallocated data is a good start. In future posts I will describe how to reconstruct allocated records with an eye towards reconstructing unallocated records.

Monday, June 17, 2013

TextMe App: Lesson Learned from Unusual Tables

I recently had the opportunity to help a colleague with an iPhone database that was not supported by his automated tools. The application was the TextMe application, and predictably, the texting app stored its chat in a SQLite database. What made the database interesting was the fact that there was no immediately obvious way to identify to whom a message was sent.

Let me illustrate: A quick scan of the database reveals some of the capabilities of the application: texting (ZMESSAGE table) and calling (ZCALL, ZVOICEMAIL)

SQLite Command Line Interface
$ sqlite3 TextMe2.sqlite .tables

The subject of this investigation was the text messages, so I needed to see how the table was constructed.

SQLite Command Line Interface
$ sqlite3 TextMe2.sqlite ".schema zmessage"

The CREATE TABLE statement shows us there are 14 fields in the table, and a majority are integers: Z_PK, Z_ENT, Z_OPT, ZSTATUS, ZCALL, ZDISCUSSION, ZHEIGHT, ZSENDER, Z3_SENDER, ZTIMESTAMP, ZBODY, ZGUID, ZLOCATION, ZREMOTEID. But, don’t fall into the trap that the declared type of each column (e.g. INTEGER, TIMESTAMP, VARCHAR)actually constrain the data to those types, they don’t. Treat the type-name as informative only, but verify the data before forming any conclusions.

Inspecting the columns, we see some of obvious value in a forensic examination:

  • Z_PK (which is the primary key, an auto incrementing integer who’s chief value is assisting us in identifying if messages have been deleted from the database)





Others might be grabbing your attention, but I’m going to keep this discussion focused on these columns. Some sample data is in order:

SQLite Command Line Interface
$ sqlite3 -header TextMe2.sqlite "select z_pk, zstatus, zsender,
ztimestamp, zbody from zmessage limit 5;"
3|4|10|386362603|hey, what are you doing?
4|3|2|386362630|I'm checking out this new app
5|3|2|386362634|It might be a challenge to decode
6|3|2|386362644|But I'll figure it out...

We see we have a couple of interpretation issues, here: the status is an integer that needs to be interpreted, and is the sender. The date is some form of epoch time, and my eyes tell me its likely Mac Absolute Time. I expect to find the interpretations in other tables in the database. But what jumps off the screen at me is that there is no obvious answer the following question: To whom is the sender sending the message? The time stamp gives us a sequence of messages, but how do we know that sender "2" is sending messages to sender "10"? Couldn’t sender "2" be sending his message to, say, sender "5" and in the midst, receives a message from sender "10"? Absolutely!

So, how to we rectify this issue? Well, I sort of mischievously left off the zdiscussion column in my query. I did this to steer the conversation and simulate what can happen when an investigator encounters a new database for the first time: overlook an important column. If we include the column, we see something interesting:

SQLite Command Line Interface
$ sqlite3 -header TextMe2.sqlite "select z_pk, zstatus, zdiscussion,
zsender, ztimestamp, zbody from zmessage limit 5;"
3|4|2|10|386362603|hey, what are you doing?
4|3|2|2|386362630|I'm checking out this new app
5|3|2|2|386362634|It might be a challenge to decode
6|3|2|2|386362644|But I'll figure it out...

Now we see that the conversation is all part of the same discussion. And if we studied the whole database, we’d see example of where more than one conversation was occurring a the same time, and by sorting on the discussion field, we make sense of those conversations. But date stamp alone does not clue us in.

This might not seem like a big deal, but most messaging databases I have encountered have the remote party in the message record for both sent and received messages. This works well and leads to easy interpretation, e.g., "sent to Joe" and "received from Joe". But this database without properly understanding the discussion column, is the equivalent of "sent by Joe" and "sent by Jane", leading to the question "to whom?"

Rather than breakdown the rest of the analysis, I’m just going to share the query I used to complete the analysis:

  m.z_pk as "Key",
  datetime(ztimestamp + 978307200, 'unixepoch', 'localtime') as "Date",
    when m.z_ent then (select z_name from z_primarykey natural join zmessage)
    else "Unknown(" || m.z_ent || ")" end
    as "Type",
  case zstatus
    when 2 then "Sent"
    when 3 then "Delivered"
    when 4 then "Received"
    else "Unknown(" || zstatus || ")" end
    as "Status",
  zdiscussion as "DiscussionID",
  zusername as "Contact",
  zbody as "Message"
from zmessage as m, zcontact as c
on m.zsender = c.z_pk
order by discussionid asc, date asc;

By way of brief description:

  • the AS statements, such as that seen in m.z_pk as "Key", create aliases, effectively renaming the columns (or in the FROM statement, the tables) to make the output more informative.

  • The first CASE statement queries the z_primarykey to interpret the Z_ENT integer into its textual value.

  • The second CASE statement interprets the ZSTATUS flag into its English equivalent, which was not stored in the database, but determined by studying the application user-interface. This begs the question, then where does the application get the textual interpretation? Probably within its code.

  • The FROM statement queries two tables, zmessage and zcontact, to interpret the ZSENDER integer. But wait, you say, there is no ZSENDER in the select statement! (see next bullet point…)

  • The ON statement is the SQLite equivalent of an SQL INNER JOIN, which is an intersection of the two tables, i.e., it select rows from both tables where the columns match. In this case, the columns to be matched are ZSENDER from the ZMESSAGE table and Z_PK from the ZCONTACT table. The effect is that the SELECT statement knows which ZUSERNAME from ZCONTACT to return based on the ZSENDER value in the ZMESSAGE table.

  • The ORDER BY statement sorts the output first by the ZDISCUSSION column, then by the ZDATESTAMP, both in ascending order. Note that the column alias names are used.

I hope this gives some insight into the workings of TextMe2.sqlite databases and helps you in the future with the analysis of never-before-seen SQLite databases.

Sunday, June 16, 2013

Why I do What I do: Thanks Dad!

This post has both nothing and everything to do with why and how I do data forensics. I hope you'll take a moment to read it. 

I am a tenacious do-it-your-selfer in many areas of my life, including home repair, veterinary medicine (just ask my poor dogs), and data forensics. Some of it I even do reasonably well once in a while (even a broken clock is right twice a day, right?). There has been a strong influence in my life that lead me inevitably to this station in life: my father.

My father is a retired firefighter--scratch that--fireman.  Some would say he was tough on me while I was growing up.  I'm sure I've said that myself more than once.  But it would be more accurate to say that from an early age, my father instilled in me a strong sense of right-and-wrong and as self-reliance.  No excuses, no B.S.  He never failed to try to teach me, never missed an opportunity to impart wisdom that came from much experience.  He saw some of the worst life had to offer and lost friends in the service of his city, and I think he was hell bent on helping me avoid the pitfalls of life.  Unfortunately, my career choices betray how closely I listened to the "avoiding harm" lessons: I became a soldier and then a police officer!

Our relationship during my teen years was rocky at times, with frequent arguments over differences of opinion on important life matters (so important, that I can't remember a single one of them).  I didn't know it then, but we argued because we were so alike.  We have a lot of the same qualities: strong wills, the desire to be right (not for the sake of being better than others, but for the sake of not being correct in what we believe, do, and say), and the wish to pass what we know onto others. It was really just our point of view that differed.

My father doesn't just talk-the-talk, though.  He, above all, walks-the-walk.  I admire him greatly in this, and I strive to be more like him in this way.  He put his family first: I can remember how happy it made me when he bought a HiFi stereo system with all the bells and whistles, not because I had a new stereo to play with (oh no, DO NOT touch the equalizer settings!), but because he finally, after something like 15 years of my life, spent money on himself for something non-essential!  It made him happy, and that made me happy, too (and eventually, I was permitted to touch the equalizer... once I was schooled in the proper shape of sound).

So, on this Father's Day, I choose to write a non-technical post, but an important one because it acknowledges the source of my beliefs:

  • Thanks Dad, for making me care about right and wrong: I may take a little too long to reach a conclusion in a case, but I'm not likely to state something untrue because I have checked and double-checked the facts to the best of my ability.  
  • Thanks Dad, for giving me and insatiable curiosity for the world around me, so that I can now study a file or file system and come to understand how they work (or hours, days, and sometimes weeks trying!).
  • Thanks Dad, for teaching me to do for myself, so that now I can write programs to solve forensics problems to get investigators and prosecutors the essential information they need.
  • Thanks Dad, for teaching me so I know how to teach others, so that all that I have learned I willingly pass on and hopefully advance my field, if only a little.
  • Thanks Dad, for helping me find perspective in what I do, not letting work step in-line ahead of my family (the real reason I am no longer a gunslinger).

Happy Father's Day.  I hope I grow up to be just like you.

Wednesday, May 29, 2013

"Hashes? We don't need no stinking hashes!"


This is admittedly a long post. I hope you’ll find it worthy of your time. It demonstrates a methodology for obtaining images from Android devices that are locked USB debugging disabled. A popular device, the Samsung Galaxy S3 is used for the demonstration. This is not a step-by-step guide for all Android devices, but can be used as a framework to be adapted to your situation.

Samsung has recently upped the ante when it comes to Android pin/password locks in Jelly Bean. Past Android releases concatenated the the pin/password with the password salt and calculated both a SHA1 and MD5 hash of string. The hashes were then concatenated and stored in the 72-byte /data/system/password.key file. I talked about cracking the password.key here and here. But now, Samsung has changed the game.

Changing Times

I first encountered the change in March. I was contacted to assist a law enforcement agency with cracking a Samsung S2 pin. I planned to use hashcat to crack the pin, and the agency had those files at the ready. But, to my surprise, the password.key file was only 40-bytes in length. I won’t bore you with the steps I took to try to crack it, but suffice it to say that I didn’t crack it.

I later learned that Samsung had changed the pin/password hashing schema in their Android JellyBean releases. The new schema calls for the pin/password to be concatenated as before, but the SHA1 hash is repeated 1024 times thusly:

  • sha1(lastHash + iteration + pin/password + saltHex)

Lets assume a pin of 1234. Bearing in mind that in computerese, counting starts at zero and there is no lastHash in the first iteration, the process would look something like this:

Table 1. New Samsung hashing schema


Concatentated Elements



0 + 1234 + a1b2c3d4e5f6abbc



SHA1 + 1 + 1234 + a1b2c3d4e5f6abbc



SHA1 + 2 + 1234 + a1b2c3d4e5f6abbc



SHA1 + 1023 + 1234 + a1b2c3d4e5f6abbc


Knowing the schema makes cracking the resulting hash possible, though much slower than the stock android schema. But, this post isn’t about cracking the hash—rumor has it that Samsung is planning to change its hashing method again now that the secret is out--it’s about ignoring it entirely!

Traveling back in Time

Back to March… I mentioned that I didn’t crack the pin that was sent to me, ut, it got me thinking: is there another way to cope with locked Android devices. I knew from experience and study that the Android pattern lock was stored as a sha1 hash in the data/system/gesture.key binary file and the pin/password was stored in the data/system/gesture.key text file. So, I went to work on android Emulators for Android 2.3, 4.0, and 4.1.

And what I discovered was this: Simply renaming (preferred to deleting so the data can still be examined) the gesture.key or password.key file (e.g., "gesture.key.bak") disabled the locks! A reboot removes the hash values from memory and they are not reloaded by the operating system because they no longer exist. In the emulator, I was still presented with the lock screens, but by entering any pattern, or by touching OK/Done on the pin/password screen, I had access to the device. In real world devices, the lock screens were bypassed altogether.

The law enforcement agency I was assisting had root access to the device in question, or they would not have been able to send me the password.key and settings.db files. I contacted them and advised them to rename the password.key and as expected, they bypassed the pin lock.

Got Root?

There’s the rub, though: getting root access. In Android 2.3, this could be done, for the most part, with privilege escalation exploits like RageAgainstTheCage, GingerBreak, and ZergRush, to name a few of the most popular. The exploits, generally speaking, crashed services on the devices and prevented them from descalating privileges on restart. For the exploits to work, USB debugging needed to be enabled so the software could be pushed through the [Android Debug Bridge] (adb) from the desktop to the device. While I receive some devices with USB debugging enabled, the vast majority aren’t in that condition. And, the days of Android 2.3 are waning; I now see more Android 4.0+ devices which have been strengthened against such attacks.

What then can be done about a locked Android device where USB debugging is disabled, or in an Android version that is not vulnerable to a privilege exploit attack? How do we communicate with a device that just plain refuses to talk to us?

Recovery isn’t Just for Addicts

The Android operating system relies on several partitions to segregate and protect data. There are likely exceptions to the rule, but you can expect to find a boot partition, a system partition (the operating system that is the user experience), a data partition (user applications and data), a cache partition, and a recovery partition. There are, in fact, many more partitions in most devices. But we are going to focus on the recovery partition in this discussion.

Recovery is an operating system of its own. It’s small and limited, and most manufacturers limit its functionality. In stock recoveries, the functions are limited to the installation system updates, known as ROMs, and the wiping of the data and cache partitions (e.g. to remove user data before selling or returning the device).

Lesser known is that many stock recoveries have USB debugging enabled, i.e, a desktop system can communicate with the device through the adb. The privileges through the stock recovery are limited, but some information can be gleaned from the system. Let’s look at a Samsung Galaxy S3 that is pin-locked without USB debugging enabled…

Booted into the system ROM (the standard operating system), we expect to be unable to connect to the device through adb.

$ adb devices
List of devices attached


Sure enough, the device is not detected because the adb daemon on the device is not running. After shutting down the device, we boot into recovery by pressing and holding home + up volume + power.


Release the power button after the device vibrates but continue to hold the other hardware buttons until the Android graphic appears.

After the Android graphic appears, the stock recovery menu is displayed. The menu has the limited functionality described above, but we’re not here to look at the menu, but to determine if the adb daemon is running.

$ adb devices
List of devices attached
e7bc4973        recovery


Nice! We now have communication with the device. If we dropped to a shell with the adb shell command, or checked our privileges with adb shell id, we’d see that we have limited privileges as the shell user. But one thing we are able to do, which can be quite handy while looking for ways to defeat the security, is to determine what version of Android is installed.

$ adb shell cat default.prop

the default.prop text file is created on boot and shows many of the device settings. The first thing we see is what we already know: the device is locked (, and the USB debugging is disabled in the system ROM (ro.debuggable=0). But further along in the file we see some interesting tidbits (exerpted):

ADB Feb 14 15:05:00 KST 2013

We now know important details of the device: make, model, and Android version. Researching the device, we find there are custom recovery operating systems by both ClockWorkMod (CWM) and Team Win Recovery Project (TWRP). Both recoveries are rooted, meaning that they boot with administrative privileges in place. And in our case, that’s the only feature in which we are interested.

Samsung devices have a download mode, also known as "Odin" mode, for uploading data to partitions. The proprietary but leaked sofware tool "Odin" is used to accomplish this task. There is an open source tool, Heimdall, that is the functional equivalent of Odin. Odin is Windows-only software, but Heimdall is cross platform. Our goal is to use Odin or Heimdall to install one of the custom recoveries. We don’t want to fully root the device as an end user might, as that makes changes to the system and data partitions.

Powerful Norse Gods

Odin and Heimdall are named for Norse Gods. The software has the power to push software to a Samsung Android device while the device is in Download mode. They become hamstrung when the bootloader is locked, however. Luckily, an unlocked bootloader can be installed first to allow installation of the custom recovery. Though I prefer to use open tools whenever possible and Heimdall detected the device, it did not work with the bootchain archive from the exploit, so it was necessary to use Odin.

The process is straight forward and covered in the tutorial. I recommend installing TWRP because it has netcat and dd installed in the patch. The CWM version in the tutorial does not include netcat which is necessary for imaging.


The main point to remember when using Android forums for tutorials or information is that our goal as a forensic examiners is not the same as those on the forums:

  • Users want to achieve permanent root privileges and install custom operating systems

  • we want to obtain unaltered or minimally altered data.

Read the information with that in mind and make sure you understand what changes are taking place before you proceed.

Heeding the warning above, we can condense the steps in the tutorial to just four:

# Download and install the Offical Samsung Drivers # Download ODIN v3.07 # Download and flash the VRALEC Bootchain # Download and install TWRP (now at version

It is important in the TWRP installation that you boot immediately into recovery (home + volume up + power) or the boot process will detect the custom recovery and replace it with stock. This is the reason the tutorial mentions unchecking "Auto-reboot" in the Odin options. It you fail to boot into recovery and instead boot the stock operating system, you’ll have to repeat the third fourth step above.

Once you’ve successfully booted into the custom recovery, you are ready to image the device memory.


When I first broke into Android forensics, I was using Micro-SD cards to capture device images. Luckily, I have discovered another means for imaging, and a good thing, too: many devices don’t have SD Card slots (Kindle Fire, for example). The process involves using adb to do port forwarding and then using netcat and dd to pipe the raw NAND flash data to the connected desktop computer.

ADB port forwarding to pipe data to the desktop
$ adb forward tcp:5555 tcp:5555
$ adb shell
~ # nc -l -p 5555 -e dd if=/dev/block/mmcblk0

Above, we use the adb forward <local> <remote> command to forward Android tcp port 5555 to the desktop tcp port 5555. Then we enter the Android shell with the adb shell command. Immediately, you should notice the shell prompt which indicates we are the root user.

The netcat command, nc, is used to open connections between devices. Here, we set netcat to listen (-l) for and incoming connection on port (-p) 5555. When the connection is made, we direct it to execute (-e) a dd command that uses the device’s whole memory block as its input. At this point, nothing further happens until the connection is made. To make the connection, we open another terminal and use netcat on our desktop system:

Receiving the data
$ nc 5555 | pv > image.dd
14.7GB 0:45:32 [ 5.5MB/s] [                               <=>                ]

In the new window (remember, our first session is waiting for a connection), we simply make the connection by telling netcat the destination ( and port (5555) with which to connect. We optionally pipe the data through the pv command, which measures the data passing through the connection and redirect the output to a file called image.dd. Without the pv command, in the event you don’t have it installed or don’t wish to measure the output, would be: nc 5555 > image.dd.

As you can see from the pv output, we imaged the 16GB Galaxy S3 in a little more than 45 minutes. The JTAG process, which is your only other recourse for imaging a locked S3, averages 27-32 hours, not to mention the need for disassembly, soldering, reassembly and specialized hardware. Does this make the custom recovery technique better than JTAG? Not at all. As always, the needs of the case should dictate your approach.

But what kind of data did we actually net? Let’s take a quick look:

Image partition listing
$ partx -o nr,start,end,size,name image_129349.dd
 1     8192   131071   60M modem
 2   131072   131327  128K sbl1
 3   131328   131839  256K sbl2
 4   131840   132863  512K sbl3
 5   132864   136959    2M aboot
 6   136960   137983  512K rpm
 7   137984   158463   10M boot
 8   158464   159487  512K tz
 9   159488   160511  512K pad
10   160512   180991   10M param
11   180992   208895 13.6M efs
12   208896   215039    3M modemst1
13   215040   221183    3M modemst2
14   221184  3293183  1.5G system
15  3293184 28958719 12.2G userdata
16 28958720 28975103    8M persist
17 28975104 30695423  840M cache
18 30695424 30715903   10M recovery
19 30715904 30736383   10M fota
20 30736384 30748671    6M backup
21 30748672 30754815    3M fsg
22 30754816 30754831    8K ssd
23 30754832 30765071    5M grow

Looks like we have the whole enchilada (the purest reading this will think, "no, that only proves you have the partition table" and they’d be right!). On further examination, we’d find that we can mount the partitions and examine them without difficulty. As an aside, I have found partx to be the ideal tool for reading raw Android images. Though I use The Sleuthkit for disk forensics, its not well suited for Android images, basically because Android uses non-standards compliant file name schema.

Traveling Forward in Time

Android 4.2.2 might be the death knell to the custom recovery approach.

Note: When you connect a device running Android 4.2.2 or higher to your computer, the system shows a dialog asking whether to accept an RSA key that allows debugging through this computer. This security mechanism protects user devices because it ensures that USB debugging and other adb commands cannot be executed unless you’re able to unlock the device and acknowledge the dialog. This requires that you have adb version 1.0.31 (available with SDK Platform-tools r16.0.1 and higher) in order to debug on a device running Android 4.2.2 or higher.

— "Android Developers"

But, then again, maybe not. The Quote above states that adb will not work until it is authorized by the User through the device interface, i.e., the standard operating system. What about recovery? It remains to be seen.

Sunday, May 19, 2013

iOS6 Photo Streams: "Recover" Deleted Camera Roll Photos


Recovery of deleted files in iDevices since iOS 4 has been impractical to impossible because of the use of an encrypted file system. This does not mean that deleted data cannot be retrieved, however. Deleted records can be recovered from allocated SQLite databases because of the of the construct of the database file: dropped records are not always overwritten. This article will demonstrate how the interaction between the iOS Photo Streams service and the Camera Roll application can help examiners identify and "recover" deleted photographs.

The dawning of Apple iCloud in 2011, a new service was born: the iCloud Photo Stream. Photo Stream syncs photos taken with an iDevice with other devices registered by the user. The user must have an iCloud account and enable Photo Stream through the Settings | iCloud | Photo Stream menu for the service to work.

Photo Stream comes in two flavors, if you will: the basic Photo Stream and Shared Photo Stream. Basic Photo Stream syncs photos only between the users devices, but Shared Photo Stream alows the user to share photos with other people through a public website or directly on their Photo Stream enabled devices.

Photo Stream Requirements

Shared Photo Stream

  • iDevice with iOS 6.0+

  • Mac with OS X v10.8.2+ and iPhoto 9.4+ or Aperture 3.4+

  • PC with Windows Vista (SP2)/7/8 and the iCloud Control Panel 2.1+

  • Apple TV (Gen 2) with software update 5.1+

Basic Photo Stream

  • iDevice with iOS 5.1+

  • Mac with OS X v10.7.5+ and iPhoto 9.2.2+ or Aperture 3.2.3+

  • PC with Windows Vista (SP2)/7 and iCloud Control Panel v2.0+

  • Apple TV (Gen 2) with Software Update 5.0+

You might be getting ahead of me already and thinking, "Excellent! If a user takes a photograph on his iPhone and it gets synced with with other Photo Stream enabled devices, I can find deleted photos from the iPhone on the other devices!" To that I say, "whoa there big fella," and I pull firmly back on your reins. It’s not quite that simple, and while that’s a possibility if the photograph is moved out of Photo Stream storage and into a long term storage location, there are a few things you need to know…

Pictures in the Sky

When a user takes a photograph using a Photo Stream enabled iDevice, like an iPhone, it is sent to iCloud storage and held there for 30 days to give all the users registered devices a chance to sync. The data is only transfered from the iPhone when connected to WiFi, it doesn’t transmit photographs over cellular networks. iDevices hold the last 1000 photos (JPG, TIFF, PNG, and RAW) in the stream, so a user must move photographs to a long term storage location like the Camera Roll if they want to keep them longer. Macs and PCs don’t have the 1000 photo limitation because of their larger storage capacities. Videos are not synced through Photo Stream.

If another Photo Stream enabled device in the user’s network takes a photograph, or a photograph is loaded into Photo Stream on a Mac or PC from another source like a digital camera, it is synced with the iPhone, as you’ve probably guessed. If an image is deleted from Photo Stream on any one of the networked devices, it is deleted from all the devices on the network. As our friend Mork for Ork used to say, "Shazbot!" It’s not going to be that easy. And besides, how often do we get our hands on all the devices when performing forensics, anyway?

So, Why the Post?

While performing an examination on an iPhone 4s with iOS 6.1.3 recently, I noted lots of duplicate photographs with different filenames. The filename differences were not from a user renaming them (if that’s even possible in a non-jail broken device), but images with the same apparent image content and the same created time existed in two distinct paths, had distinct names (like "IMG_1027.JPG" and "IMG_1098.JPG") and existed in different domains. Intrigued, I investigated.

Through study and a little experimentation with a colleagues iPhone, I discovered that when a user takes a photograph, it is simultaneously stored in the Camera Roll Domain in /Media/DCIM/ subdirectories and the Media Domain in the /Media/PhotoStreamsData/ subdirectories. The filename is incremental: if the last photograph was "IMG_1027.JPG", the next is "IMG_1028.JPG". This is true even if "IMG_1027.JPG" is deleted before the next image is taken—filenames are not reused.

Likewise, the photograph that is written to the Photo Stream is given an incremental filename. But, it is not based on the Camera Roll Domain filenames, but instead on the Photo Stream image filenames stored in the Media Domain. Recall that images in Photo Stream can come from any networked device and it will make sense that they need to have unique filenames created on the local device to avoid overwriting. With that knowledge, you begin to understand why two files with the same image content exist in different locations and have different filenames.

But, there’s more. I mentioned that videos were not synced in Photo Stream. They are, however, stored in the Camera Roll domain of the iDevice in which they were created, and they are given an incremental filename that mixes them into the photographs though they have a different extension. I’ve printed a sample listing to illustrate.

Partial listing of files in the Camera Roll Domain

Thus, video are included in the Camera Roll Domain incrementing filename schema, but since they are not written to the Media Domain, the filenames there unaffected, i.e, not incremented.

Another factor to consider is that the Camera Roll domain can be used to save videos and images not made with the device being examined. The user has the option to save image and videos from Internet browsers, applications like Facebook, and MMS messages by pressing and holding them. Whatever the source of the files, they are saved to Camera Roll with sequential filenames.

Camera Roll saves more that photos and videos taken with device

Again, the JPEG, TIFF, PNG, and RAW images are automatically added to the Photo Stream when the service is enabled. If only there were some data source to help us figure out how to match the Camera Roll image filenames to those in the Photo Streams directories. Cryptographic hashing (e.g., MD5) won’t help because the EXIF data is different in .jpg images, and besides, the point of this article was to "recover" deleted images, which means you are missing one half of the comparison! Once again, enter SQLite…

Recovering images deleted from the Camera Roll

The /CameraRollDomain/Media/PhotoData/Photos.sqlite database contains data about images in the Camera Roll Domain and the Media Domain. Incidentally, it also tracks the videos in the Camera Roll Domain. The ZGENERICASSET asset table is the table of interest and contains file dates, paths, and names.

ZGENERICASSET table schema
$ sqlite3 -header CameraRollDomain/Media/PhotoData/Photos.sqlite \
".schema zgenericasset"

A lot of the information in the table is inconsequential to our analysis. Some is of passing interest, and might be relevant to another type of examination. But lets hone in on our quarry: file dates, names, and paths. To make SQLite work for you, you must learn its Structured Query Language.

First off, lets get an overview of the files that this table tracks. It would be nice to see just the unique paths in the ZDIRECTORY field. Fortunately, SQLite gives us the Distinct function for just this purpose.

SQLite DISTINCT() function
$ sqlite3 -header CameraRollDomain/Media/PhotoData/Photos.sqlite \
'SELECT DISTINCT(zdirectory) FROM zgenericasset ORDER BY zdirectory ASC'


So, we see from the output that the DCIM and PhotoData directories, which are in the Camera Roll Domain, and the PhotoStreamData directory, which is located in the Media Domain, are the source root directories for the data tracked by this table. How would I know that if I wasn’t familiar with the iPhone file structure? Hunt and peck? No!

BASH find command with grep
$ find -type d | grep -E '(DCIM|Photo(Streams)?Data)$'

Here, we tell the find command to return only directories with the -type d option and we filter the results with a regular expression. The regular expression looks for lines ending with DCIM, PhotoData, or PhotoStreamsData. That saves alot of hunting and pecking!

Now, we’ll get export the files from the database sorted by creation date. From this, maybe we can draw some conclusions, or at least have a list of files on which to focus for further analysis.

SQLite Query (truncated output)
$ sqlite3 -header -list CameraRollDomain/Media/PhotoData/Photos.sqlite "SELECT \
z_pk, DATETIME(zdatecreated + 978307200, 'unixepoch', 'localtime') AS \
zdatecreated, zdirectory, zfilename FROM zgenericasset
9598|2013-04-27 20:53:44|DCIM/105APPLE|IMG_5303.JPG
9628|2013-04-27 20:53:44|PhotoStreamsData/97527241/104APPLE|IMG_4245.JPG
9599|2013-04-27 20:53:49|DCIM/105APPLE|IMG_5304.JPG
9629|2013-04-27 20:53:49|PhotoStreamsData/97527241/104APPLE|IMG_4246.JPG
9600|2013-04-27 21:16:41|DCIM/105APPLE|IMG_5305.JPG
9630|2013-04-27 21:16:41|PhotoStreamsData/97527241/104APPLE|IMG_4247.JPG
9601|2013-04-27 21:16:49|DCIM/105APPLE|IMG_5306.JPG
9631|2013-04-27 21:16:49|PhotoStreamsData/97527241/104APPLE|IMG_4248.JPG

Time stamps in the database were in Mac Absolute Time and had to be converted to unixepoch by adding 978307200 seconds for the SQLite DATETIME() function to perform the conversion to local time.

Here we see a pattern developing. Look at the lines in pairs: the created dates match in each pair. The DCIM filenames increment by one as do the PhotoStreamsData filenames. This leads us to believe that image in the DCIM directory is related to the image in the PhotoStreamsData directory. If we viewed the images, we would in fact see that the image content appears to be the same. However, some image optimizations occur in the creation of the Photo Stream image and we will not match by MD5, and not just because the embedded metadata is slightly different (original filename, for example, but the image data is different.

iCloud: Photo Stream FAQ

On your Mac or PC, your photos are downloaded and stored in full resolution. On your iPhone, iPad, iPod touch, and Apple TV, your Photo Stream photos are delivered in a device-optimized resolution that speeds downloads and saves storage space. While actual dimensions will vary, an optimized version of a photo taken by a standard point-and-shoot camera will have a 2048 x 1536 pixel resolution when pushed to your devices. Panoramic photos can be up to 5400 pixels wide.
— Apple Inc.

But does this make a claim that they are the same image invalid. Not at all! The similarities are overwhelming: embedded modified and created time stamps are identical (in the database and in the file system), gps data is identical, device information is identical, and image content appears identical, and so on. In the purest sense, they are not the same any more than a film negative is the same as a print from the negative, yet we would argue they are the same because one is a reproduction of the other. Such is the argument here.

Finally, lets extend this analysis to deleted files. What if we were presented with the following file list?

SQLite Query Results
9623|2013-04-27 21:56:27|DCIM/105APPLE|IMG_5328.JPG
9648|2013-04-27 21:56:27|PhotoStreamsData/97527241/104APPLE|IMG_4265.JPG
9624|2013-04-27 21:56:48|DCIM/105APPLE|IMG_5329.JPG
9649|2013-04-27 21:56:48|PhotoStreamsData/97527241/104APPLE|IMG_4266.JPG
9625|2013-04-27 21:56:59|DCIM/105APPLE|IMG_5330.JPG
9650|2013-04-27 21:56:59|PhotoStreamsData/97527241/104APPLE|IMG_4267.JPG
9656|2013-04-29 01:34:10|PhotoStreamsData/97527241/104APPLE|IMG_4268.JPG
9657|2013-04-29 01:34:23|PhotoStreamsData/97527241/104APPLE|IMG_4269.JPG
9658|2013-04-29 01:38:19|PhotoStreamsData/97527241/104APPLE|IMG_4270.JPG
9654|2013-04-29 01:38:28|DCIM/105APPLE|IMG_5334.JPG
9659|2013-04-29 01:38:28|PhotoStreamsData/97527241/104APPLE|IMG_4271.JPG

We see the expected DCIM/Photostreams created date and filename pattern through the first three pairs. Then we have three Photo Stream images, IMG_4268.JPG, IMG_4269.JPG, IMG_4270.JPG, that are missing their DCIM counter parts. The next DCIM image is IMG_5334.JPG, which is three images removed from the previous DCIM image, IMG_5330.JPG. This is probably better understood in a table:

Created Date DCIM (Camera Roll) PhotoStreamsData

2013-04-27 21:56:27



2013-04-27 21:56:48



2013-04-27 21:56:59



2013-04-29 01:34:10


2013-04-29 01:34:23


2013-04-29 01:38:28


2013-04-29 01:38:28



Is it reasonable to conclude that IMG_4268.JPG, IMG_4269.JPG, and IMG_4270.JPG are facsimiles of the missing IMG_5331.JPG, IMG_5332.JPG, and IMG_5333.JPG?

Created Date DCIM (Camera Roll) PhotoStreamsData

2013-04-27 21:56:27



2013-04-27 21:56:48



2013-04-27 21:56:59



2013-04-29 01:34:10



2013-04-29 01:34:23



2013-04-29 01:38:28



2013-04-29 01:38:28



I believe that it is. We know that:

  • The Photo Stream service creates a file with the same content at the same time it is created in the Camera Roll

  • Filenames are sequentially created relative to the directory contents

  • Video files in the Camera Roll Domain are not created in the Photo Stream Domain

  • Facsimile images are not cryptographically identical, but the image content is arguably the same.

With this information and the observable file pattern: the unbroken filename sequence of Photo Stream images and an equivalent number of missing Camera Roll images as unpaired Photo Stream images, the conclusion is reasonable.


This analysis is not complete. Recovery of deleted SQLite records can be used to confirm or refute this analysis, and I’ll begin working on that next. Also, there are circumstances in which the interpretation is ambiguous (absent recovered SQLite records). Take the following as an example:

SQLite Query Results
9616|2013-04-27 21:51:57|DCIM/105APPLE|IMG_5321.JPG
9643|2013-04-27 21:51:57|PhotoStreamsData/97527241/104APPLE|IMG_4260.JPG
9617|2013-04-27 21:52:01|DCIM/105APPLE|IMG_5322.JPG
9644|2013-04-27 21:52:01|PhotoStreamsData/97527241/104APPLE|IMG_4261.JPG
9618|2013-04-27 21:53:10|DCIM/105APPLE|IMG_5323.MOV
9619|2013-04-27 21:55:22|DCIM/105APPLE|IMG_5324.JPG
9645|2013-04-27 21:55:22|PhotoStreamsData/97527241/104APPLE|IMG_4262.JPG
9646|2013-04-27 21:56:14|PhotoStreamsData/97527241/104APPLE|IMG_4263.JPG
9622|2013-04-27 21:56:19|DCIM/105APPLE|IMG_5327.JPG
9647|2013-04-27 21:56:19|PhotoStreamsData/97527241/104APPLE|IMG_4264.JPG

Rendered into a table

Created Date DCIM (Camera Roll) PhotoStreamsData

2013-04-27 21:51:57



2013-04-27 21:52:01



2013-04-27 21:53:10


2013-04-27 21:55:22



2013-04-27 21:56:14


2013-04-27 21:56:19



Above, we see that IMG_5323.MOV, a video, does not have a PhotoStreamsData pairing, and that is expected. The next Camera Roll image is paired with the next Photo Streams image in sequence… again, expected behavior. But there is a gap of two filenames in the Camera Roll and only one pairing file in the PhotoStreams Data. The implication is that one of the missing files is a video, but which one? Does IMG_5325.JPG or IMG_5326.JPG pair with IMG_4263?


Neither IMG_5325 nor IMG_5326 were in the database, even as deleted records, so the data was overwritten or vacuumed.

Thus, you can see, this will not solve all deleted file mysteries, but you might very well be able to "recover" a file deleted from the Camera Roll by sifting the Photos.sqlite database and finding the Photo Streams equivalent!

Time Perspective

Telling time in forensic computing can be complicated. User interfaces hide the complexity, usually displaying time stamps in a human reada...