Linux Sleuthing

Tuesday, June 19, 2012

Data exchange: Archiving and Moving Data between OSes

The Back Story

It finally happened (sigh): I found myself in need of a Mac. I don't much enjoy Mac computers, but I'll save my rants for my Mac-user friends who love to hear from me on the topic. But I needed to extract data from an iPhone4 with iOS 5.0.1. I've discussed previously that libimobiledevice provides an open-source suite of tools that can be used to extract data from an iPhone with Linux. Frankly, I need to freshen that post as the tools have become more robust and, as of this writing, support extracting data (by means of an iTunes-like backup with the idevicebackup tool) through iOS 5.1.1.

But in my case, though I only needed files that were available through idevicebackup, the phone was PIN-locked, which prevents the utility from functioning. I am fortunate enough to have access to the tools provided by Jonathan Zdziarski at the iOS Forensic Research site, one of which provides the PIN code from a locked iPhone. The PIN code tool requires two facts to work: the device model, discoverable on the back of the phone, and the installed iOS version. Discovering the iOS version is not always trivial, unless you have libimobiledevice-utils.

Useful Digression: Determining the iOS Version of a Device

The libimobiledevice-utils toolset provides the ideviceinfo tool, which displays valuable information about an unlocked iPhone, Touch, or iPad in key/value pairs. The info includes the device serial number, MAC address, Bluetooth address, phone number, device name ("Badguy's iPhone"), CPU type, hardware model, firmware version, even the color of the phone. Under the product version key, the iOS version is displayed. You may have noticed I said "unlocked" device. Information, though to a lesser degree, can be obtained from a locked device as well, but nothing you read in the help will make that obvious. If you pass the -s / --simple option, you can still obtain limited but critical information, including the device name and the product (iOS) version.
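
If you want to pull the version out in a script rather than by eye, the key/value output is easy to parse. This is a minimal sketch in Python, assuming the tool prints one "Key: Value" pair per line and that the product version key is spelled ProductVersion. The sample text is made up for illustration (it is not captured device output), and parse_ideviceinfo is my own helper name.

```python
def parse_ideviceinfo(text):
    """Parse 'Key: Value' lines into a dict (assumed output format)."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip anything that is not a key/value pair
            info[key.strip()] = value.strip()
    return info


# Illustrative sample only, not real device output.
sample = """DeviceName: Badguy's iPhone
ProductVersion: 5.0.1
"""

info = parse_ideviceinfo(sample)
print(info["ProductVersion"])
```

In practice you would feed the function the text captured from the tool instead of the sample string.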

With the iOS product version in hand, I proceeded to obtain the PIN code with the aid of Zdziarski's tools. I also downloaded the logical file system, consisting of over 12gb of data in one tarball. I needed to preserve the archive as evidence, and I also wanted to transfer it to a Linux box for examination, where I have more contemporary command line tools available to me than are provided through OS X.

The Point: Moving Data Between Operating Systems

I like easy, and I like methodology that is cross-platform, whenever possible. To be more specific, I prefer to use the same tools no matter the OS. I have sort of adopted a unix philosophy of forensics: the unix tool philosophy is make one tool to do one thing and do it well. My forensics philosophy is learn one tool to do one task and learn it well. Now, I know it's not always possible, but I want to avoid the problem of using Zip compression and tools because I'm in Windows, Gzip/Bzip and related tools because I'm in Linux where such compression is dominant, and Stuffit for Mac, etc. Much better, to my way of thinking, to learn one method that will work in most any situation without the need to download and install yet another utility.

Because I was working on the Mac command line, it made sense to stay there and chop up that 12gb tar file into DVD sized chunks. And for fun, I decided to see if I could squeeze down the size some (it turned out that much of the data was video, and there wasn't much compression to be had, however). So, how can it be done? Newer incarnations of tar allow the creation of volumes, but not so in OS X Lion. However, the bzip2 compressor and the split commands are part of the standard command line tools.

The concept is simple: compress the tar file and chop it into 4.1gb chunks for burning. Bzip2 handles the compression, but it creates a file of the same name as the target file with the .bz2 extension by default. Thus, "archive.tar" becomes "archive.tar.bz2." This does not accomplish my goal, because I'm just left with a somewhat smaller file, but still too large to burn to DVD. However, bzip2 takes the -c argument which pipes the data to stdout instead of creating a new file! Now we're getting somewhere.

We can take the data piped from bzip2 and use our Ginsu (that's a knife for those of you who didn't watch a lot of TV in the 1980's) command called split. Split will divide a file into pieces of a user-defined size. The sizes can be based on the number of lines, amount of data, and if I read the man page correctly on the Mac, even a data pattern. In my case, I want to define it by size, and the Mac version of the tool tells me to specify the -b flag for the byte size of the resulting files, appending 'k' for kilobytes and 'm' for megabytes. The single hyphen, used as an argument, replaces a file name when data is being piped from another tool like bzip2.

So, putting it all together, I split my archive.tar file thusly:

$ bzip2 -c archive.tar | split -b 4100m - archive.tar.

Bzip2 compresses archive.tar and sends the data to stdout. That data is piped through the split command, which creates 4.1gb chunks. The resulting files were archive.tar.aa, archive.tar.ab, and archive.tar.ac. Note the hyphen in the command after the -b file size option: that is a 'shortcut' for the data from the bzip2 command and replaces what would otherwise be the name of the file for split to operate upon. Also note the period at the end of archive.tar. Without it, the output files would have the name 'archive.taraa,' etc., without the period separator between the original file extension and the new one supplied by split.
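
For anyone stuck on a box without bzip2 and split, the same compress-then-chop idea can be sketched in Python with only the standard library. This is a rough equivalent and not a replacement for the command above: compress_and_split and suffixes are my own helper names, and I only generate the two-letter suffixes (aa through zz) that split uses by default.

```python
import bz2
import itertools
import string


def suffixes():
    """Yield split-style two-letter suffixes: aa, ab, ..., zz."""
    for a, b in itertools.product(string.ascii_lowercase, repeat=2):
        yield a + b


def compress_and_split(src, prefix, chunk_size, read_size=1024 * 1024):
    """bzip2-compress src, writing the stream to prefix+'aa', prefix+'ab', ...

    Each output file holds at most chunk_size bytes. Returns the file names.
    """
    compressor = bz2.BZ2Compressor()
    names = suffixes()  # raises StopIteration past 'zz' (676 chunks)
    written = []
    out = None
    room = 0  # bytes still free in the current chunk

    def emit(data):
        nonlocal out, room
        while data:
            if out is None or room == 0:
                if out is not None:
                    out.close()
                name = prefix + next(names)
                out = open(name, "wb")
                written.append(name)
                room = chunk_size
            piece = data[:room]
            out.write(piece)
            room -= len(piece)
            data = data[len(piece):]

    with open(src, "rb") as f:
        while True:
            block = f.read(read_size)
            if not block:
                break
            emit(compressor.compress(block))
    emit(compressor.flush())
    if out is not None:
        out.close()
    return written
```

Calling compress_and_split("archive.tar", "archive.tar.", 4100 * 1024 * 1024) would mirror the command line above, with the trailing period in the prefix doing the same job it does for split.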

What Now?

I burned those files to three DVDs, which provided the necessary evidence retention medium. I also copied those files to my Linux box where I reassembled and decompressed them in one operation:

$ cat archive.tar.* | bunzip2 > archive.tar
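
The reverse trip can be sketched the same way. This is a minimal Python version of the cat-and-bunzip2 pipeline, assuming the chunks sort correctly by name (as the aa, ab, ac suffixes do) and together hold a single bzip2 stream; join_and_decompress is a name I made up for the example.

```python
import bz2
import glob


def join_and_decompress(pattern, dest, read_size=1024 * 1024):
    """Feed the chunks matching pattern, in sorted order, through bunzip2 into dest."""
    chunks = sorted(glob.glob(pattern))  # list the chunks before creating dest
    decompressor = bz2.BZ2Decompressor()
    with open(dest, "wb") as out:
        for name in chunks:
            with open(name, "rb") as chunk:
                while True:
                    block = chunk.read(read_size)
                    if not block:
                        break
                    out.write(decompressor.decompress(block))
    return chunks
```

For the files above, the call would be join_and_decompress("archive.tar.*", "archive.tar"). Nothing is held in memory beyond one block at a time, which matters with a 12gb archive.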

I did not untar the archive, because I can mount that read-only and examine it as is. This provides a better forensic environment for examination. I'll cover this method in a future post. Finally, because I install and maintain a Cygwin environment on my Windows workstation, I could have used the same method to transfer the files to my Windows operating system. One method, three operating systems interchanging data. Now, that's the way I like it!

Yet Another YAFFS Discussion

In previous posts, I've discussed rooting and imaging Android devices. While the exploits change from one Android version to another, the principles are the same as I detailed in the past. Most Android devices, small portable devices like smart phones in particular, use NAND flash memory with the yaffs file system for storage.

If you are new to building binaries from source code, then this tutorial is probably not for you. However, I hope to explain it well enough that you can still follow along even if you have very little build experience. For starters, make sure you have the appropriate build tools. In Debian and Ubuntu, it's easiest to install the "build-essential" package:

$ sudo apt-get install build-essential

Though the next step is not required, you'll likely want to install the "git" software versioning system so you can easily obtain and install the latest yaffs source code. Otherwise, it is possible to download the source code as a tar archive from the source code repository. I'll be demonstrating the git method here:

$ sudo apt-get install git 

Finally, you'll likely want to install the module tools (modprobe and friends) so you can easily load the module once it is built.

$ sudo apt-get install module-init-tools

Building the Module

In order to mount Android images, download the latest source code from the online repository. You'll probably have to install git if you haven't done so in the past. It is not standard in most Linux Distros.

$ git clone git://www.aleph1.co.uk/yaffs2
Cloning into yaffs2...
remote: Counting objects: 7027, done.
remote: Compressing objects: 100% (4247/4247), done.
remote: Total 7027 (delta 5566), reused 3473 (delta 2700)
Receiving objects: 100% (7027/7027), 3.43 MiB | 304 KiB/s, done.
Resolving deltas: 100% (5566/5566), done.

The source code is downloaded into a subdirectory called 'yaffs2' if you use the example command above. If you want to clone into a different directory, add the directory name as an argument following the web address. If the directory doesn't already exist, it will be created.

Next, change into the source code directory and issue the "make" command to build the source according to the parameters already laid out in the Makefile.

$ cd yaffs2

$ make 

make -C /lib/modules/2.6.38-13-generic/build M=/home/jlehr/projects/yaffs2 modules
make[1]: Entering directory `/usr/src/linux-headers-2.6.38-13-generic'
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_mtdif.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_mtdif2_multi.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_mtdif1_multi.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_packedtags1.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_ecc.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_vfs_multi.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_guts.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_packedtags2.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_tagscompat.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_checkptrw.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_nand.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_nameval.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_allocator.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_bitmap.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_attribs.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_yaffs1.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_yaffs2.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_verify.o
  CC [M]  /home/jlehr/projects/yaffs2/yaffs_summary.o
  LD [M]  /home/jlehr/projects/yaffs2/yaffs2multi.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /home/jlehr/projects/yaffs2/yaffs2multi.mod.o
  LD [M]  /home/jlehr/projects/yaffs2/yaffs2multi.ko
make[1]: Leaving directory `/usr/src/linux-headers-2.6.38-13-generic'

Finally, install the module.

$ sudo make mi  #or sudo make modules_install

make -C /lib/modules/2.6.38-13-generic/build M=/home/jlehr/projects/yaffs2 modules_install
make[1]: Entering directory `/usr/src/linux-headers-2.6.38-13-generic'
INSTALL /home/jlehr/projects/yaffs2/yaffs2multi.ko
DEPMOD 2.6.38-13-generic
make[1]: Leaving directory `/usr/src/linux-headers-2.6.38-13-generic'

Mounting a yaffs image

I was planning to finish this discussion with mounting a yaffs image, but it's a more complex topic than I can reasonably handle in a few lines. Look for a discussion on the complexities of mounting a yaffs image, and maybe the methods for obtaining one, in a future post.

Monday, May 7, 2012

Chomping on BlueTooth

I had a difficult task recently: try to determine who was the likely operator of a netbook. The netbook was used to download and store ATM card numbers that had been illicitly collected with a skimmer. The operating system was Windows XP, and the sole user account was a pseudonym that I could not link to anyone in particular.

I tried a few obvious tactics: I looked for installed software (Program Files and Windows Registry), hoping to get registration names and/or email addresses for registered applications. Nothing. In fact, it appeared little extra software had been installed on the system except for the software and drivers to communicate with the skimmer.

I decided to boot the image in a VM to get a 'lay of the land.' In dead disk analysis, it's quite easy to see what files are on the desktop, for example, but not nearly so easy to see how objects are arranged, the wallpaper, running applications, etc. Sure, you can discover these things, but you have to look in several locations and then synthesize the information... sometimes a picture is just plain better. I used xmount, VirtualBox, and opengates, a technique I've previously detailed.

I immediately noted something after the system booted: the BlueTooth (BT) service was running. I have never seen BT running by default in a Windows system, but that doesn't mean it wasn't configured to run on boot by the netbook's manufacturer. But still, I was intrigued. The BT application windows did not show a history of connected devices, but I wondered if such a history was available. Maybe the paired device might lead me to the suspect?

I'm in the habit of imaging devices with forensics-oriented Linux boot discs such as CAINE, DEFT, or those of my own creation. The benefits can be quite substantial, including the ability to easily collect hardware information. I use lshw for this purpose, but in this case, I saw no BT hardware listed in the report, just ethernet and wifi. Did this netbook support BT, or was an external adapter used? In this case, I no longer have the hardware in front of me to inspect, but it doesn't appear to have a built-in adapter.

However, I am able to detect third-party bluetooth software with a quick registry search in the mounted disk image:

$ reglookup -i software | grep -i bluetooth
/Widcomm/Install/INSTALLDIR,SZ,C:\x5CProgram Files\x5CWIDCOMM\x5CBluetooth Software\x5C,2010-11-28 03:01:19

The tool I used, reglookup, is an efficient and useful command line registry parser. The -i option causes values to inherit the timestamp of their parent key. The date may not reflect when the specific value was modified, since timestamps are stored on keys only and the parent key's modified date is updated whenever any of its values is modified.

Inspecting the Widcomm (Broadcom) software keys, I found a "Devices" subkey that organized remote devices by their MAC addresses:

/Widcomm/BTConfig/Devices,KEY,,2010-11-28 03:04:14
/Widcomm/BTConfig/Devices/00:02:72:e2:05:7a,KEY,,2010-11-28 03:10:08
/Widcomm/BTConfig/Devices/00:23:3a:e5:1f:53,KEY,,2010-11-28 03:04:45
/Widcomm/BTConfig/Devices/78:ca:39:41:40:e1,KEY,,2010-11-28 03:04:05
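
If there are more than a handful of devices, a few lines of Python can pull the MAC addresses and key timestamps out of the listing. This is a sketch under the assumption that each record is a four-field PATH,TYPE,VALUE,MTIME line, as in the output above; paired_devices and DEVICE_KEY are names I invented for the example. Keep in mind the timestamp is the last write time of the key, which is not necessarily the moment of pairing.

```python
import re

# Matches a device subkey path that ends in a MAC address.
DEVICE_KEY = re.compile(r"/Devices/((?:[0-9a-f]{2}:){5}[0-9a-f]{2})$", re.IGNORECASE)


def paired_devices(lines):
    """Return (mac, key_timestamp) tuples, oldest timestamp first."""
    devices = []
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) != 4:
            continue  # not a PATH,TYPE,VALUE,MTIME record
        path, rtype, _value, mtime = fields
        match = DEVICE_KEY.search(path)
        if rtype == "KEY" and match:
            devices.append((match.group(1), mtime))
    return sorted(devices, key=lambda d: d[1])


output = [
    "/Widcomm/BTConfig/Devices,KEY,,2010-11-28 03:04:14",
    "/Widcomm/BTConfig/Devices/00:02:72:e2:05:7a,KEY,,2010-11-28 03:10:08",
    "/Widcomm/BTConfig/Devices/00:23:3a:e5:1f:53,KEY,,2010-11-28 03:04:45",
    "/Widcomm/BTConfig/Devices/78:ca:39:41:40:e1,KEY,,2010-11-28 03:04:05",
]

for mac, mtime in paired_devices(output):
    print(mac, mtime)
```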

Each "device" subkey, among other things, contained a name key. This was the 'text' name of the device, and one of the devices stood out:

/Widcomm/BTConfig/Devices/78:ca:39:41:40:e1/Name,BINARY,badguy\xE2\x80\x99s MacBook Air\x00,

The data in the Name key is in a python bytes format, which displays ASCII text where possible and otherwise the hex in the form \x00. It can be converted in python3 rather simply with:

b'badguy\xE2\x80\x99s MacBook Air'.decode()
'badguy’s MacBook Air'
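
Pasting the value into a bytes literal works for one name, but for a whole report it is handier to unescape the text as it comes out of the tool. Here is a minimal sketch, assuming the escapes always take the \xHH form seen above and that the name is UTF-8 encoded (as the decode above implies); unescape_reglookup is my own helper, not part of reglookup.

```python
import re


def unescape_reglookup(text):
    r"""Turn \xHH escapes back into raw bytes, then decode the result as UTF-8."""
    raw = re.sub(
        rb"\\x([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        text.encode("ascii"),
    )
    # Drop the trailing NUL terminator stored with the name.
    return raw.rstrip(b"\x00").decode("utf-8")


value = r"badguy\xE2\x80\x99s MacBook Air\x00"
print(unescape_reglookup(value))
```

The same function also restores the backslashes in path values, such as the \x5C sequences in the INSTALLDIR value shown earlier.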

I redacted the Name key data for this article, but from it I learned the name of the MacBook Air with which the netbook had been paired. The name of the MacBook was the name of a suspect in the case. Proof positive that the named suspect was the operator of the netbook? No, certainly not. But helpful in light of the totality of the circumstances, to be sure.

Monday, February 27, 2012

Spreading Out My Skills:
Fun with Spreadsheets

I had an opportunity to improve my spreadsheet skills last night while helping my wife on a project. It's not hard for me to improve in this area because up until yesterday, spreadsheets were just a convenient way to open and sort CSV documents from forensics tools. But, I learned some averaging and summing techniques, and more importantly, conditional statements and conditional formatting. What's that got to do with forensics, you might ask?

I've written and/or used plenty of tools that produce CSV output. Let's take an SMS output as an example. Often, a particular phone number is the target of an investigation, and spreadsheets make it quite easy to sort on a column of data, such as the phone number. So, in a few clicks, you've got the target number nicely grouped for review.

But, the more investigations I do, the more I've come to realise that good intelligence and investigation reads between the lines--not in a 'make up your own interpretation' sort of way, but looking to see what else was going on in the phone, computer, browsing session, etc., to give the target data proper context. Conditional formatting can really help here. It allows you to easily visualise the target data while at the same time seeing it in context.

OK, now I have your interest, but you really don't know what I mean by 'conditional formatting.' Simply put, conditional formatting changes the look of a spreadsheet cell based on the content of the cell. It is an automated, rules-based process; you set the rules, the spreadsheet formats the cells according to the rules. Taking our cell phone SMS output as an example, you could create a rule that changes the color of a cell based on the phone number in the cell. Thus, you can easily find your target, but still see it in context.

I'll use the spreadsheet in Google Docs as an illustration for setting up a conditional format:

Sweep the cells or select the column to which you wish to apply the condition.
Right-click in the selected area and choose (you guessed it) 'Conditional formatting...'
Set the rule according to your specifications. That's it... really!

Your options may not seem like much at first, but you can specify more than one rule for the cell selection. If the condition for one or more of the rules is met, then the text and background color selections you make are applied to the cell. Conditional operators are:

Now, I also mentioned conditional statements. These are statements that act on the data itself, not the cell format. When would you want to change the data in a forensics investigation? Well, how about this:

You are not a SQLite giant, but you know how to use your favorite GUI SQLite browser to export a table as CSV. The SQLite table represents 'Sent' messages as '0' and received as '1'. You'd like to render those values in their text equivalent for easy reading. Sound like a possible scenario, yet?

OK, you've bought into the idea, but how do you do it? Well, spreadsheets offer an 'if' statement that takes three arguments: an if, then, else clause, if you will. In our case, we would want the expression to read "If the value is zero, replace it with 'SENT', otherwise replace it with 'RECEIVED.'" The expression looks like:

IF(test, then_value, otherwise_value)

The formula for our example in your spreadsheet might look like this, then:

=IF(B2=0, "SENT", "RECEIVED")

You can easily apply this formula to each successive cell, automatically changing the cell address for the appropriate row, by clicking the cell with the formula, grabbing the handle on the lower right corner of the selection box, and dragging to the end of your column. If statements can even be nested to make more than two possible outcomes:

=IF(B2=0, "SENT", IF(B2=1, "RECEIVED", "UNKNOWN"))

In the statement above, cell B2 is tested for 0; if the condition is met, then it is replaced with "SENT." If it fails the test, then the "otherwise" value is another IF statement: if B2 is 1, then replace it with "RECEIVED", otherwise replace it with "UNKNOWN." It is possible to have multiple nested if statements.
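
The same if/then/else logic can be applied to the CSV before it ever reaches a spreadsheet. Below is a small Python sketch of the nested IF above; the column names (address, type, body) and the sample rows are hypothetical, so substitute whatever your export actually uses.

```python
import csv
import io

# 0 and 1 follow the example above; anything else is treated as unknown.
LABELS = {"0": "SENT", "1": "RECEIVED"}


def label_direction(rows, column):
    """Yield copies of each row with the coded column replaced by its text label."""
    for row in rows:
        row = dict(row)
        row[column] = LABELS.get(row[column], "UNKNOWN")
        yield row


# Made-up sample data with hypothetical column names.
sample = io.StringIO(
    "address,type,body\n"
    "5551234,0,on my way\n"
    "5551234,1,ok\n"
    "5559876,2,draft\n"
)

for row in label_direction(csv.DictReader(sample), "type"):
    print(row["address"], row["type"], row["body"])
```

A dictionary lookup with a default stands in for the nested IF, and adding a third or fourth code is just another entry in LABELS.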

Who knew spreadsheets could be so much fun? I even hear they do math!

Saturday, February 25, 2012

SINF Structure

I spent some time decoding the SINF files that I discussed here, thanks in great part to a link sent to me by a colleague (Thanks, Derrick). Here are my findings, to date:

Unlike iTunes Purchased MP4 media files, the SINF does not contain the iTunes user account name, which is most often their email address and most useful for contacting owners of stolen devices. Instead, you are limited to the iTunes user's name and iTunes ID#. Short of a search warrant or subpoena, Apple is not going to reveal the owner's personal information, though they have contacted owners on my behalf in the past.

Please review the data in the table and compare with your findings, if you are so inclined. Post a comment if you have more insight, find an error, or can confirm any of the information.

Tuesday, February 14, 2012

iOS .sinf Name Calling

In my ever present quest to identify the true owners of stolen iPods, I made a discovery in iOS while examining a Touch that may be probative: the app .sinf files found in the /private/var2/Applications sub folders. According to File-Extensions.org:

The SINF file extension is associated with applications for Apple iOS operating system that is used in Apple iPhone, iPad and iPod Touch. File contains information about digital rights that are applied in application. The SINF file is stored in an IPA iOS application archive.

I found that by searching the ../Applications directory for .sinf files, and then grepping for the term "name", the Apple Store real name associated with the app can be discovered. On the Linux command line, this can be accomplished very quickly with:

$ find private/var2/Applications -name "*.sinf" -exec strings -f {} \; | grep name

Modification dates for the files can be used to create a timeline of activity for the device and perhaps demonstrate when new residents moved in, so to speak. The find command can be used with stat to quickly provide a list of date stamps:

$ find private/var2/Applications -name "*.sinf" -exec stat {} \;

But, even better, you can put it all together in a fairly simple command and create csv output for examination and sorting:

$ find private/ -name "*.sinf" | while read i; do name=$(strings "$i" | grep name); date=$(stat -c %y "$i"); echo -e "$i,$name,$date"; done
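
If you would rather not depend on GNU stat and strings being present, roughly the same report can be produced with Python alone. This is a sketch, not a tested forensic tool: sinf_rows and write_csv are names I made up, the regular expression only approximates what strings considers printable, and I make no claims here about the internal layout of a .sinf file.

```python
import csv
import os
import re
import sys
import time

# Runs of four or more printable ASCII bytes, roughly what strings reports by default.
PRINTABLE = re.compile(rb"[\x20-\x7e]{4,}")


def sinf_rows(root):
    """Yield (path, strings containing 'name', modification time) for each .sinf file."""
    for dirpath, _dirs, files in os.walk(root):
        for fname in sorted(files):
            if not fname.endswith(".sinf"):
                continue
            path = os.path.join(dirpath, fname)
            with open(path, "rb") as f:
                data = f.read()
            hits = [s.decode("ascii") for s in PRINTABLE.findall(data) if b"name" in s]
            stamp = time.strftime(
                "%Y-%m-%d %H:%M:%S", time.localtime(os.path.getmtime(path))
            )
            yield path, " ".join(hits), stamp


def write_csv(root, out=sys.stdout):
    """Write one CSV row per .sinf file found under root."""
    writer = csv.writer(out)
    writer.writerow(["path", "name_strings", "modified"])
    writer.writerows(sinf_rows(root))
```

Calling write_csv("private/") from the root of the mounted file system prints the rows to the terminal. Using the csv module also takes care of quoting any commas that turn up in a path or name.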

It appears from content that I have uncovered in a suspected stolen device that the real name of the Apple Store account used to install the app is embedded in the .sinf file at the time of installation. If this is the case, a stolen device, though its device name may have been changed and the true owner's data deleted, may still have applications that were installed with the owner's Apple Store account!

Testing still needs to be done for verification, and I don't currently have any test devices to properly test. If you are able to conduct any validation studies, please comment on this post with your findings. I'll amend this post once I'm able to conduct my own studies or receive reliable findings from others.

Friday, January 27, 2012

iPods, what's in a name?

iPod Device Names

iPod devices have a name. It's set by the user when they initialize the device through iTunes (there are alternate initialization methods, but that is not the focus of this post). When the focus of the investigation is determining the device owner, the device name is a good place to start. The device name, for example, could be "John Doe" and you happen to know who John Doe is, or how to find out.

Of course, the device name could be 'Pookie', which won't help you out too much. But don't give up: I've already demonstrated another, even more useful, method for identifying iPod owners through iTunes purchased media. Take a look here if that interests you.

But, I got curious, where in the iPod can you find the device name? It's clearly stored on the device, because, as any iPod owner can tell you, if you navigate from the main menu to the 'About' screen in 'Settings', you'll see something akin to "John Doe's iPod."

Where to Look

The first place to look in a FAT formatted iPod is the volume label of the data volume (aka partition). The current device name is the volume name. You can view it with blkid, or for the forensically inclined, with the Sleuth Kit at the root level.

I'll use a 5th gen Nano I recently examined as an example. I am operating as root because I am examining a device directly:

# blkid /dev/sdd1
/dev/sdd1: LABEL="PINK PANTHE" UUID="E0B8-3334" TYPE="vfat"

# fls /dev/sdd1
r/r 3:    PINK PANTHE (Volume Label Entry)
d/d 5:    iPod_Control
...

Now, I'm fairly worldly (all my friends are now rolling their eyes), but I suspected that when I checked Settings | About on this Nano, I'd find the device name was 'Pink Panther', not the truncated 'PINK PANTHE' in the volume label, which has a limit of 11 characters. And sure enough, that's what I found: 'pink panther.'
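
A FAT volume label holds at most 11 characters, which is exactly where the name gets cut. The arithmetic is trivial, but here it is as a Python sketch. I am assuming the label is simply the upper-cased device name cut to 11 characters, which matches this one example and nothing more; fat_label is a made-up helper.

```python
FAT_LABEL_LENGTH = 11  # maximum length of a FAT volume label


def fat_label(name):
    """Approximate the volume label for a device name (assumed: upper-case, truncated)."""
    return name.upper()[:FAT_LABEL_LENGTH]


print(fat_label("pink panther"))  # PINK PANTHE
```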

So, if the 'r' in panther isn't in the volume label, then the volume label is not the source of the data in the About screen. So, what is the source? After mounting the device read-only and employing my favorite keyword search utility (more on that one later), the source turns out to be the 'Library.itdb' SQLite database in the 'iPod_Control/iTunes/iTunes Library.itlp/' directory.

I found the table in which the device name resides as follows:

# sqlite3 '/media/iPod/iPod_Control/iTunes/iTunes Library.itlp/Library.itdb' .dump | grep 'pink panthe'

INSERT INTO "container" VALUES(-3226555229562403833,0,333435002,347345556,'pink panther',100,0,1,0,1,0,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL);

What I did there was dump the database contents, which shows the SQL commands that would create the database and populate it. The dump, when saved to a file, can be used to back up and restore a database. For my purpose, I see that a list of values, including 'pink panther', was inserted into the 'container' table.

Now, I can produce a nice query that can be used in future examinations to directly recover the device name from the Library.itdb database:

# sqlite3 -line '/media/iPod/iPod_Control/iTunes/iTunes Library.itlp/Library.itdb' 'select name from container'
 name = pink panther
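
The same query is easy to script for repeat examinations. This is a minimal Python sketch using the standard sqlite3 module; container_names is my own function name. It opens the database read-only and returns every value in the name column, so if the table holds more than one row you will get more than one name back.

```python
import sqlite3
from pathlib import Path


def container_names(db_path):
    """Return every value in the 'name' column of the 'container' table."""
    # as_uri() percent-encodes spaces in the path; mode=ro opens the file read-only.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return [row[0] for row in conn.execute("SELECT name FROM container")]
    finally:
        conn.close()
```

For the Nano above, the call would be container_names('/media/iPod/iPod_Control/iTunes/iTunes Library.itlp/Library.itdb').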

Now I have two sources for the device name in a FAT formatted device. And, the database query can be used for HFS formatted iPod Classics, presumably. Combine that with the media search for Apple Store account and real name information, and even an unallocated search for MPEG-4 metadata (next post), and you have a robust, though not foolproof, methodology for identifying iPod owners.