COPY THAT

BE TESTY or: are your software results incomplete?

Originally written: April 2019, by Dan Mares.
By the time you read this article, a lot of time may have passed, and the software that was tested may have been updated and might now pass the tests. Either way, you should conduct tests of your own to see if the current version passes your tests and meets your needs.

No matter what investigative, forensic or analysis discipline you are in, please be sure to test and be comfortable that your software produces true, accurate, repeatable and complete results for your needs.

SUITE STUFF

DON'T BOTHER USING A SUITE, AS MOST SUITES ARE DESIGNED TO PROCESS FULL BIT STREAM PHYSICAL "IMAGES", AND WHEN PROCESSING FULL PHYSICAL IMAGES THEY WILL GENERALLY PASS THE TESTS. THEREFORE THE TESTS ARE DESIGNED TO TEST THE SOFTWARE CAPABILITY AT THE FOLDER/FILE LEVEL, AND AS SUCH ASSUME YOU ARE PERFORMING THESE TESTS ON LIVE SUSPECT MACHINES, WHICH NORMALLY WILL NOT ALLOW INSTALLATION OF SUITES.

Also, regarding the processing of the test files by suites: remember, we are testing hash, copy, and zip/containerize/unzip/restore. Most suites will be able to perform these operations, but in my tests some do not perform with 100% confidence. For instance, think about the suite process of finding a suspect file on a suspect drive, copying or saving it to a forensic drive for transport from the original source to your analysis location, then restoring it at your analysis location. Isn't that technically copying the suspect file from point A (suspect) to point B (analysis/evidence location)? So suites do "technically" copy. Some also create some sort of "container" in which you can place your evidence files. Then, at a later time, you move the container to a work machine and open it for additional processing. Isn't this technically a source zip and a destination unzip? They may not call it that, but the process and effect are the same: containerize the source evidence, move it to an analysis location, de-containerize it, and process the evidence. So use a suite if you wish; just operate ONLY at the folder/file level, which is the basic and smallest unit of evidentiary file/data. If you can't process a single file as evidence, what good is a terabyte image going to do? No physical bit/sector analysis allowed for these tests.

Some preliminary information: I want to remind you that all the testing I have done and reference in this and any other testing related article was done using Windows 10 on an NTFS file system on a desktop computer. The NTFS file system was used as the test environment because I believe that a significant number of corporate and other forensic investigations take place on the NTFS file system. Also, the test environment was set up specifically to test certain aspects of the NTFS file system which may or may not be relevant to other file systems: the ability to alter a file's last access date, the use of long filenames, and alternate data streams. These three parameters add to the forensic and evidentiary complexity.

How do you know the software you are using is producing true and accurate results? A great question for attorneys to ask you. What is your answer?

If your answer is that it was listed on the web as useful "insert your topic here" software, or that "others use it" and have recommended its use... Well, we know if it's listed on the web, it must be useful. If you use any of these arguments, I think your case is going south.

Probably the best answer would include something like, "I have tested it myself to determine if it satisfies my needs for this particular investigation." The key words here are "tested it myself" and "for this particular investigation". Does the investigation need to process the entire image at the sector level, or can you search at the logical file level? If a sector level search is used, any fragmented item may be lost (for example, the first half of a string at sector 1000, while the tail end of the string is at sector 2000). So a sector level search would not find the evidence, while a file level search probably would.
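The fragmentation problem is easy to reproduce on a toy "disk". The sketch below is only an illustration under my own assumptions: the 512-byte sector size, the search string, and the two-fragment layout at sectors 1000 and 2000 are made up for the demonstration, and no real file system is involved.

```python
SECTOR_SIZE = 512  # bytes per sector on this toy "disk" (an assumption)


def build_disk(fragments, total_sectors):
    """Place each fragment at its starting sector on a zero-filled disk."""
    disk = bytearray(SECTOR_SIZE * total_sectors)
    for sector, data in fragments.items():
        start = sector * SECTOR_SIZE
        disk[start:start + len(data)] = data
    return bytes(disk)


def sector_level_search(disk, needle):
    """Scan the raw bytes in physical order, ignoring file structure."""
    return needle in disk


def file_level_search(disk, sector_chain, file_size, needle):
    """Reassemble the file from its sector chain, then search its content."""
    content = b"".join(
        disk[s * SECTOR_SIZE:(s + 1) * SECTOR_SIZE] for s in sector_chain
    )[:file_size]
    return needle in content


def demo():
    needle = b"SECRET-ACCOUNT-12345"  # hypothetical search string
    # 502 filler bytes push the string across the first sector boundary.
    content = b"A" * 502 + needle + b"B" * 100
    fragments = {1000: content[:SECTOR_SIZE], 2000: content[SECTOR_SIZE:]}
    disk = build_disk(fragments, total_sectors=2001)
    raw_hit = sector_level_search(disk, needle)
    file_hit = file_level_search(disk, [1000, 2000], len(content), needle)
    return raw_hit, file_hit
```

Calling demo() returns the sector level result first and the file level result second. The string is split between the two fragments, so only the search that follows the file's sector chain can see it whole.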

AND: Just for fun, have you tested the version of the software your opposing party is using? Maybe you can find some faults with it. What a neat idea. DAH!

Before we get started, here is an excellent general purpose testing link: Computer Forensic Reference Data Sets, from the people who know. Also look for the "Federated testing test environment" from NIST: Federated Testing Project.

A challenge (6/2020) for you to test your forensic hash/copy/zip software for forensic and evidentiary reliability.

Suppose you are performing string searches. You may obtain a string search program from a "reliable" source. (Note: at this time, I believe NIST is performing or setting up test environments for string search programs. Verify this statement for yourself. If you take my word for it, I have a bridge in Brooklyn that's for sale.) You obtain and run the string search program and it shows no results. So your report says, "didn't find any suspect strings". What you didn't think about was: are your strings stored in an unusual format (Unicode UTF-16, a compressed format, an unusual encoding such as EBCDIC, fragmented across sector boundaries, etc.), and is the program designed to find those unusual items? For you millennials, EBCDIC is an antiquated mainframe format. HA HA.

You may have known that the strings were stored in your data in an unusual format, but what you didn't know (or test) is that the string search program wasn't coded to search for those formats. You just "ass"umed the program would work for your needs.
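Here is one way you might seed and check the encoding problem yourself. This is a minimal sketch, not anybody's production search tool: the list of encodings is my own choice (cp037 is one of several EBCDIC code pages that Python ships with), the function names are hypothetical, and compressed or fragmented data is deliberately left out.

```python
# Encodings to try; cp037 is one EBCDIC code page (this list is an assumption).
ENCODINGS = ["ascii", "utf-16-le", "utf-16-be", "cp037"]


def encoded_variants(term):
    """Return the byte pattern of the search term under each encoding."""
    return {enc: term.encode(enc) for enc in ENCODINGS}


def naive_search(data, term):
    """What a single-encoding tool does: look for the ASCII bytes only."""
    return term.encode("ascii") in data


def search_all_encodings(data, term):
    """Return the encodings under which the term appears in the data."""
    return [enc for enc, needle in encoded_variants(term).items()
            if needle in data]


def seed_samples(term):
    """Build one seeded test buffer per encoding, wrapped in filler bytes."""
    return {enc: b"[junk]" + needle + b"[junk]"
            for enc, needle in encoded_variants(term).items()}
```

Run seed_samples("password") and feed each buffer to both search functions. The naive search only reports the ASCII sample, while search_all_encodings names the encoding each sample was seeded with. If your real tool behaves like the naive one, now you know before the attorney does.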

Another example (and I admit, I'm not a network investigator; in fact, I can barely spell ntwrk). Anyway, suppose your software was built to search only for IPv4 addresses, but it wasn't documented that way, and your analysis needed to handle IPv6 formats. You may not find any hits, and you wouldn't know that the software wasn't coded to find the particular network item you are searching for.
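A seeded test makes this kind of gap obvious. The sketch below assumes a hypothetical tool that only knows a dotted-quad pattern, and compares it with a check built on Python's standard ipaddress module. The sample log line and both function names are made up for illustration.

```python
import ipaddress
import re

# Dotted-quad pattern only; it does not even validate octet ranges.
IPV4_ONLY = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Loose candidate tokens made of hex digits, colons and dots.
TOKEN = re.compile(r"[0-9A-Fa-f:.]+")


def find_ipv4_only(text):
    """Mimic a tool that was only coded for IPv4."""
    return IPV4_ONLY.findall(text)


def find_all_ips(text):
    """Accept any token that the ipaddress module parses as IPv4 or IPv6."""
    found = []
    for token in TOKEN.findall(text):
        candidate = token.strip(".")  # drop sentence punctuation
        try:
            found.append(str(ipaddress.ip_address(candidate)))
        except ValueError:
            continue  # not an address, keep going
    return found
```

Seed a line such as "conn 192.168.1.10 -> 2001:db8::ff00:42:8329 ok" and compare the two results. The IPv4-only pattern silently skips the second address, which is exactly the kind of "no hits" you do not want to put in a report.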

Two other areas where I have found a significant number of programs fall short are areas where a person could easily hide suspect data. They are: (1) files with long filenames, meaning files whose combined path/filename is greater than 255 characters; and (2) alternate data streams, which the software should find or identify and process as "normal" files. Even typical "explorer" operation won't show the existence of an alternate data stream. Just imagine what could be hiding in one: passwords, virus software, pornography, the keys to the kingdom. The version of PKzip I am currently using is 14.2, and it fails to identify long filenames. It does store alternate data streams, but its displays give no indication that they exist, that they are stored, or what their names are. My version of 7zip will find and store long filenames, but fails to store data streams. So, at the time this article was written, the only reliable zipping program I have tested and found capable of passing my tests is WINRAR (with appropriate options chosen).
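If you want to build your own long filename and alternate data stream test files, something like the following may be a starting point. It is a sketch under assumptions: the folder and file names are invented, the extended-length path prefix is only added on Windows, and the stream writer simply reports False on anything that is not Windows. On a Windows volume that is not NTFS, expect the stream write to raise an error instead.

```python
import os
import tempfile


def make_long_path_file(root, min_length=300, payload=b"seed"):
    """Nest folders until the full path exceeds min_length, then write a file."""
    path = os.path.abspath(root)
    if os.name == "nt" and not path.startswith("\\\\?\\"):
        # Extended-length prefix so Windows calls accept paths past MAX_PATH.
        path = "\\\\?\\" + path
    while len(os.path.join(path, "seed.txt")) <= min_length:
        path = os.path.join(path, "long_folder_name_" + "x" * 40)
    os.makedirs(path, exist_ok=True)
    target = os.path.join(path, "seed.txt")
    with open(target, "wb") as fh:
        fh.write(payload)
    return target


def add_alternate_stream(path, stream_name, payload):
    """Write a named stream (file:stream syntax); returns False off Windows."""
    if os.name != "nt":
        return False
    with open(path + ":" + stream_name, "wb") as fh:
        fh.write(payload)
    return True


def demo():
    """Create one long-path seed file in a fresh temporary folder."""
    root = tempfile.mkdtemp(prefix="lfn_test_")
    target = make_long_path_file(root)
    add_alternate_stream(target, "hidden.txt", b"keys to the kingdom")
    return target
```

Point your hash, copy or zip program at the folder that demo() creates, then check whether the long path and the hidden stream both show up at the destination.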

Bottom line: Test your software. Test it to determine if it produces results for the particular subject of the investigation. Test it at both ends of the bell curve. Don't be a ding-a-ling and fail to ring the bell. Don't rely on the software writer's claims, or on other users' word, that it works in all instances. If you do, you may be missing important data and leave yourself open to challenges.

 
(Hopefully) you know what you are looking for.
(Hopefully) you know how it is stored on the drive (contiguous, segmented, containerized, special coding).
(Hopefully) you know how to create some test data.

Why don't you create your own test data for the specific items you are searching for? In my previous life, when we found no hits, we knew something was wrong, so I learned to "SEED" a test platform. This will allow you to see if your software can find it. If it will, then it will probably find the data on the suspect platform.
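Seeding can be as simple as planting a marker you know is not on the machine and then checking what your tool reports. The sketch below is one possible approach, not a standard procedure: the marker text, the file names and the three storage forms are all hypothetical choices of mine.

```python
import os
import tempfile

# Hypothetical marker, chosen so it is unlikely to occur naturally.
SEED_MARKER = "SEED-7f3a-TEST-STRING"


def seed_platform(root, marker=SEED_MARKER):
    """Plant the marker in several forms; return the seeded relative paths."""
    seeds = {
        "plain.txt": marker.encode("ascii"),
        "nested/deep/utf16.txt": marker.encode("utf-16-le"),
        "embedded.bin": b"\x00" * 1000 + marker.encode("ascii") + b"\xff" * 1000,
    }
    for rel, data in seeds.items():
        full = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as fh:
            fh.write(data)
    return sorted(seeds)


def missed_seeds(seeded, tool_hits):
    """List the seeded files that the tool under test failed to report."""
    reported = {hit.replace("\\", "/") for hit in tool_hits}
    return [rel for rel in seeded if rel not in reported]
```

Seed a scratch folder, run your search program over it, and pass the file names it reports to missed_seeds. An empty list is what you want to see. Anything else tells you which storage form your tool cannot handle.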

Very few programs will produce 100% accurate results for all instances. Does your choice of software produce accurate results for what you are looking for? Only your tests can confirm this. And they will help you answer the attorney's question: Did you test the software?

You may have to testify to that fact. Even though my software doesn't contain bugs, in some instances it's just operationally challenged, as is the situation with many software programs. But is the place where the software may fail, or have its restrictions, a place that would raise questions about your specific investigative results? I don't really care that my software will fail at the petabyte level, because I know it will probably never be used on petabyte files. But it does work on gigabyte files. Can you say that?

For those adventurous souls who wish to test their forensic suites or stand alone software, I have placed here two zip files of a 1 GB image. They contain files with long filenames (> 255 characters) and files which contain Alternate Data Streams (ADS). You can download them; they are about 3.5 MB each (compressed). Unzip to the image and mount with your suite. Then see if you can export the ADS and LFNs, hash the files, etc. Then compare and confirm your results with what is really available.
Here are the zip files: 1Gig DD image, 1Gig FTK image.
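For the "compare and confirm" step, a hash manifest taken before and after the trip is one simple check. The sketch below is an illustration under assumptions: Python's zipfile module stands in for whatever zip or container tool you are actually testing, SHA-256 is my choice of hash, and only the main data of each file is hashed, so alternate data streams and timestamps need their own checks.

```python
import hashlib
import os
import tempfile
import zipfile


def hash_tree(root):
    """Return {relative_path: sha256 hex digest} for every file under root."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            digest = hashlib.sha256()
            with open(full, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[rel] = digest.hexdigest()
    return manifest


def compare_manifests(source, restored):
    """Report files that went missing, appeared, or changed content."""
    return {
        "missing": sorted(set(source) - set(restored)),
        "extra": sorted(set(restored) - set(source)),
        "changed": sorted(p for p in set(source) & set(restored)
                          if source[p] != restored[p]),
    }


def zip_round_trip(src_root, work_dir):
    """Zip src_root, unzip into work_dir/restored, return the restored path."""
    archive = os.path.join(work_dir, "evidence.zip")
    restored = os.path.join(work_dir, "restored")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, _dirs, files in os.walk(src_root):
            for name in files:
                full = os.path.join(folder, name)
                zf.write(full, os.path.relpath(full, src_root))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(restored)
    return restored
```

Hash the source tree, run the round trip with the tool you are testing (keep work_dir outside the source folder), hash the restored tree, and hand both manifests to compare_manifests. Three empty lists mean the file contents survived. Anything listed under "missing" is evidence your tool left behind.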

Check out some of my software on my homepage for forensic cataloging (file listing), hashing, copy, email header processing, secure file deletion, string searching and others.

DMARES.COM

For questions or answers (no flames please) regarding the hashing software or the NIST data records on my site, contact work007 (at) dmares.com.

I would appreciate any comment or input you have regarding this article. Thank you. dan at dmares dot com.