OK – JUST ZIP IT

The truth is out there
So make sure your software can find it.

  

Read this article and raise your forensic intelligence level a few points. 😄

First authored May 2019.
However, by the time you read this article, a lot of time may have passed, and the software that was tested may have been updated and might now pass the tests. You should conduct tests of your own to see whether the current version passes your tests and meets your needs.



CAVEAT
You know what that is: "a modifying or cautionary detail to be considered when evaluating".

The cautionary detail is that my testing took place during the 2019-2022 time frame, and I suspect that some of the versions of the software being tested have been updated, modified, fixed. Not to say the problems I found have been fixed, just that there are newer versions out there, and maybe, just maybe, the shortcomings I found have been fixed. Just saying.


RANT OF THE DAY

Recently (10/2022) I offered to provide some test data to persons who wished to test out their zipping programs for evidentiary reliability and accuracy. From the testers, I received a response which all should consider: "this has been a very eye opening exercise". Let's just hope that your opposition didn't have the same response. Because if they did, and followed through, your next evidentiary presentation may go out the window.

One thing that really upsets me is that during my testing of the zip programs, and discussion with some other forensicators who use various zipping tools I found that some of them have the idea that if/when they zip up the evidence, whether it be the original evidence from the server, or their work product result to provide to the court/reviewer/whatever, they really think that if they can explain what their "provided/final" evidence is, that is enough. And they don't really care that during their process of zipping, unzipping, zipping, unzipping, you get the picture, that they may have missed something, or lost something in the translation, so to say. They only argue that their final evidence proves XYZ and have no feelings that they may have either corrupted, lost, or just missed evidence during the process. They only feel showing the evidence they provide in their final report is what proves the case, and any lost, or otherwise not processed is of no consequence. Believe it or not, not in these exact words, but this is what I have heard.


Before you get into this article, you might read this associated sequence of articles.

Here are a few articles you might like to read in the order listed. But before reading them, think about this small difference: the difference between "processing the evidence", and "conducting the forensic investigation". I think these articles are more targeted to the processing of the evidence rather than the direction you use to conduct the investigation. They may be very similar but no cigar.

Start here:

Inventory/Catalog files  Creating an inventory of evidentiary files
Forensic file copying  Article tests over 40 "forensic" file copiers
Forensic Hashing  Article tests over 30 "forensic" hash programs.
the one you are on:ZIP-IT for forensic retention  Article test a few zipping programs and
ZIP_IT_TAKE2  More tests for your zipping capabilities.
ZIP FILE/container/container  Hashing your zip container reliably
MATCH FILE HASHES  Demonstrates hash matches using Maresware.
A HASH software buffet   How-to use Maresware hash software

Zip-it: That’s what my mother used to say to me when I was bad. But that’s not what we are talking about here. However, we will talk about “bad” zipping software. Yes zipping software doesn’t always perform as expected. What a revelation.

This article and its companion, ZIP_IT_TAKE2, talk about the forensic and evidentiary uses of zipping software. I have set out some unique and simple requirements on an NTFS file system which show that when you use your zipping software to zip and store evidence, you may not be retaining all the evidentiary data you think you should retain. And when you think about it, isn't zipping from pointA to pointB another method of forensic copying? About 75% of the zipping software I tested failed one or more of my forensic/evidentiary requirements. Is this what you want? Explain its shortcomings to a defense person.


Preliminary case information which determines why I chose the items to test.

First, if you have a situation where you can seize the entire computer, or make a full bit image of the drive, then some of these test requirements will be easily met using a suite. See Suite stuff   below. However, there are situations which will be a little more restrictive, and which will cause you (or rather your software) to be more restrictive in what and how you process the evidence. That situation will be explained here, and again below, just so you get the idea behind the topics I chose to perform the tests around. I think (I know thinking is bad) that testing software under these more restrictive scenarios will show that the software can perform not only in a more restrictive environment, but also in one in which you have complete control.

So let's begin:

The tests were performed on an NTFS file system because I believe that is the most common file system used by corporations today. It also offers more items with which we will perform the tests.

So number one is the fact that the software will be able to find Unicode file names. Not necessarily display them in full Unicode format, but merely find and process those items. And more importantly, when unzipping, restore the correct name.

Then, second, because we are on an NTFS file system, we must be able to find and process long filenames: those path/filenames greater than 255 characters. You will be surprised at how many programs can't do that. In some cases, the long filenames may be missed altogether, or their names stored in the 8.3 form, so when unzipped you lose part of the real War and Peace filename.

Third, again because of NTFS, we will assume that the owner of the computer system (usually a corporation) has last access update turned on. The last access update may or may not be important to your investigations, but if it is turned on, your program should be able to NOT tamper with the evidence's last access date. Will your zip/unzip maintain all three original MAC dates and, more importantly, when unzipping, will it set the original dates? Wouldn't that process be a nice evidentiary step?

Also, you must consider, when performing the zip/unzip operation from the suspect drive to a work drive for transmittal to your office, or to/from the reviewer or prosecutor, that the zip program retains ALL original file dates so as not to corrupt or influence the analysis.
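One way to check the date requirement yourself is to record the timestamps of the source tree before zipping and again afterwards, then do the same comparison between the source and the unzipped copy. Below is a minimal Python sketch of that idea, not a tested forensic tool. The function names `snapshot_times` and `changed_times` are made up for this illustration. It assumes Windows, where Python's `st_ctime` is the creation date; on Linux that field is the metadata change time instead.

```python
import os
from pathlib import Path


def snapshot_times(root):
    """Map each file's relative path to (modified, accessed, created) times in ns."""
    root = Path(root)
    times = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # os.stat reads metadata only; it does not open the file contents.
            st = os.stat(path)
            # Assumption: on Windows st_ctime is creation time (not so on Linux).
            times[path.relative_to(root).as_posix()] = (
                st.st_mtime_ns, st.st_atime_ns, st.st_ctime_ns)
    return times


def changed_times(before, after):
    """Return {relative path: [names of the dates that differ]}."""
    labels = ("modified", "accessed", "created")
    report = {}
    for name, old in before.items():
        new = after.get(name)
        if new is None:
            report[name] = ["missing"]
            continue
        diffs = [label for label, a, b in zip(labels, old, new) if a != b]
        if diffs:
            report[name] = diffs
    return report
```

Take one snapshot of the suspect tree, run the zip, take a second snapshot, and pass both to `changed_times`. An empty result means the zip program left the source dates alone. Comparing a snapshot of the source with a snapshot of the unzipped tree shows which dates were restored.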

Fourth and final: again because of NTFS, we should be able to find, identify, and process where necessary any alternate data streams. Consider a porn investigation where the user downloads porn from various sites. Did you know that some browsers (I'm not telling you which, that's for you to find out) actually store in ADS's the original URL and other information of the download? Might be very interesting in porn or other internet investigations. Will the zipping program store and restore the ADS evidence? Have you tested it?

If you perform a bit-image of the drive using a suite, most of the items above will easily be identified and located as evidence. The bit-image copy is in fact a true and accurate copy. However, in our test scenario, we are sitting at a corporate server where we can ONLY process/examine/image/copy/zip-unzip (call it what you will) the directory tree belonging to the suspect. So this fine line refinement and restriction must be considered when testing our software. In effect, a zip/unzip process is a fancy copy procedure. Also, the zip/unzip process is often used as a long term storage process. So wouldn't it be nice if this process saved and restored all pertinent evidentiary information? I would like to think so.

Another thought. I know, thinking is not good. But for long term storage: years down the road, will you have a program which can faithfully unzip the evidence when needed? How many times have you performed a task with a program which worked years ago, and now, for whatever reason, you cannot get it to perform any longer? Meaning the original zip program is no longer available, and you don't have a program which can unzip your evidence.


Before we start:
A challenge   (6/2020) for you to test your forensic hash/copy/zip software for forensic and evidentiary reliability.

As an aside, you might want to check out:   Sullivan strickler and their tape archiving solutions. They have been in business quite a long time.

Some preliminary information: I want to remind you that all the testing I have done and reference in this and any other testing related article was done using Windows on an NTFS file system on a desktop computer. The NTFS file system was used as the test environment because I believe that a significant number of corporate and other forensic investigations take place on the NTFS file system. Also, the ability in that test environment to alter a file's last access date and to use long filenames and alternate data streams adds to the forensic and evidentiary complexity.


An alternate data stream evidentiary discussion:

Let's say you are doing a pornography investigation, and find a file with a provocative name within the suspect's pictures.
PIECE_OF_TAIL.mp4     72,788,245  01/29/2022 08:19:20:653w
Looks like an interesting piece of evidence.
You might be wondering where did that file come from?
   -    did you know that if the file was downloaded from the web, depending on the browser, the information might be right in front of you in an alternate data stream?
PIECE_OF_TAIL.mp4                       72,788,245  01/29/2022 08:18:56:678c  A.....
PIECE_OF_TAIL.mp4:Zone.Identifier               95  01/29/2022 08:18:56:678c  .adata
Let's see what's inside that alternate data stream that was created by the Firefox browser when the movie was downloaded.
I've taken some literary license by extracting the alternate data stream data to a "real live" file using the copy_ads.exe program so it could be seen.
D:\TEMP>type "PIECE_OF_TAIL.mp4[Zone.Identifier]"
[ZoneTransfer]
ZoneId=3
HostUrl=https://www.dmares.com/maresware/graphics/PIECE_OF_TAIL.mp4
   -    Don't you think knowing the information within that alternate data stream might be helpful to your investigation?
   -    Wouldn't it be nice if you saved that piece of information in the zip file when you zipped the evidence?
   -    Do you know that a number of zipping software programs WILL NOT include that piece of evidence in the final zip file???
   -    Do you know that a number of "forensic" copy programs won't copy the alternate data stream from pointA to pointB of your evidence work directory???
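The stream content shown above is plain INI-style text, so once it is extracted it is easy to pull apart. Here is a minimal Python sketch that parses it with the standard `configparser` module. The function names are made up for this illustration. `read_zone_identifier` assumes you are on Windows with the file sitting on an NTFS volume, where appending `:Zone.Identifier` to the path opens the stream.

```python
import configparser


def parse_zone_identifier(text):
    """Parse the INI-style text of a Zone.Identifier stream into a dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case (ZoneId, HostUrl) as written
    parser.read_string(text)
    if not parser.has_section("ZoneTransfer"):
        return {}
    return dict(parser["ZoneTransfer"])


def read_zone_identifier(path):
    """Read the stream straight from the file (assumes Windows and NTFS)."""
    with open(f"{path}:Zone.Identifier", encoding="utf-8", errors="replace") as ads:
        return parse_zone_identifier(ads.read())


# The stream content from the listing above.
sample = """[ZoneTransfer]
ZoneId=3
HostUrl=https://www.dmares.com/maresware/graphics/PIECE_OF_TAIL.mp4
"""
info = parse_zone_identifier(sample)
print(info["ZoneId"], info["HostUrl"])
```

Running the same check against the unzipped copy of the file tells you right away whether the zipping program carried the stream along. If the stream is gone, the open fails and you have your answer.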

LET'S GET STARTED

Even though zipping of files/data is a routine, normal occurrence, those who are conducting security or forensic exams and "save" their results or reports for delivery to management or attorneys should consider the capabilities of these zipping programs in relation to what they are zipping. Some forensic examiners zip image files, which would work fine. Others zip extracted evidence after processing with a forensic suite. And still others massage the initial extracts and get the data down to a point where they will provide it to management or legal persons for review.

Also consider that many of you will be asked to work a case where the suspect has a directory (that's a folder for you millennials) on a much larger server. When you are assigned the case, either criminal or civil, you find out that the search warrant, or the company, will not allow you to physically image the 100 Terabyte server if you are only looking at the suspect's 500G directory. So you must either find an "imaging" tool that can properly capture that 500G, find a zipping tool that can zip that 500G, or find a copy tool that can forensically copy that 500G. Whichever option you choose, don't you think it wise to make sure the tool you use will properly capture (zip) and restore (unzip) the evidence when you are ready to start your analysis? How many of you have actually tested your "imaging" or zipping software to capture all the data in a tree without leaving or missing any evidentiary crumbs?


Next

The problem(s) you should consider is that during your process, you may inadvertently create some unusual data files. The forensic suite may extract alternate data streams, or extract files and put them in a folder which has a long filename. I have seen forensic suites extract data to long filenames as a matter of course. Especially when you are asking it to recreate the original path/folder structure, and you yourself are outputting the data to a subdirectory which itself starts multi-levels down the tree. You then don't know how long the ultimate extracted folder path will be, and when you go to zip up the data for delivery, you may miss one or two important files. Or totally miss an alternate data stream which was hiding behind an important file. Last, but not least, does the zipping program maintain the source and ultimately unzipped file dates? If you have never tested your zipping program to see how/what it does in unusual situations, you probably should. Test not only your version, but the version that the opposition is using.
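If you don't know how long the extracted paths turned out to be, you can simply measure them before you zip. Below is a minimal Python sketch that walks a tree and reports every path longer than a limit. The name `find_long_paths` is made up for this illustration, and the default of 255 is just the threshold used in this article. One assumption to keep in mind: on Windows the walk itself may not reach the longest paths unless long-path support is enabled or the starting folder is given in `\\?\` form.

```python
import os


def find_long_paths(root, limit=255):
    """Yield (length, full path) for each file or folder whose path exceeds limit."""
    for folder, subdirs, files in os.walk(root):
        for name in subdirs + files:
            full = os.path.join(folder, name)
            if len(full) > limit:
                yield len(full), full
```

Printing the results for your export folder before zipping gives you a list of exactly the files you should look for again after the unzip.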

Recently brought to my attention was the question of how investigative agencies handle the storage and retention of their evidence, whether it be work product, or final reports and evidence that must be stored for future (maybe long time future) reference and restoring. It would be embarrassing if your organization stored evidence in zip formats (or what you thought was a forensic copy of the evidence: see   COPY_THAT article), and a few years down the road when it was time to make it available for court or other review, the unzipped data couldn't show any original meta-data, or the original zipping process missed some data, and the unzip process failed to restore the data. Your IT staff may have a completely different goal in mind when handling and storing YOUR evidence.

First, let me state: the tests I ran are not at all scientific. I consider them practical.

Second: I am not using names here because I do not want to point fingers. I’m just pointing out what I found in my unscientific tests. If my minimal tests show a failure, why proceed further?

Third: The test suite I used was purely arbitrary. Set up for an NTFS file system, and seeded with items which, from other tests, I knew might cause problems with zipping software, but would not be unheard of in a forensic or backup environment. NTFS was used because most forensic analysis and reports are created on computers running Windows and NTFS.

Fourth: I DID NOT test any encryption capabilities of the software. I use separate PGP encryption of the stand-alone zipped file when necessary. Finally: Run some tests yourself. Don’t take my word for it.

I began by selecting three of the most popular zipping software packages. My versions may not be the most current, because I’m CHEAP and don’t spend money needlessly. Most had both GUI and command line capability. However, for consistency, and because most people prefer the GUI interface, I only used the GUI in my tests. A very important thing to remember with the GUI versions, is that unless all the correct boxes are checked, you may not get the results you expect, and may not obtain results similar to mine. But if the program hid the option so much that I couldn't find it, I may not wish to use that program in the future.

Some required in-depth choices of operations which I would consider required, so I had to look for those options I wished to implement. Each package showed different options for the same operation, and some had no option for a needed capability. I MAY have missed an important option. If I did, it only means the option was so far buried, that a normal person might also forget to look for it, and find it. I tried as best I could to locate and check all the boxes which would cover the items I tested for (ie: Long filenames, Alternate Data Streams, Date retention).

Then I created a folder with the following parameters. First, files containing Unicode characters in their filenames (ie: CYRILLIC names). Second, some files with long filenames (path/filename > 255 characters). Third, some Alternate Data Streams (ADS) inserted in a number of the files, both those with normal length names and those with long filenames.
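A test folder along those lines can be seeded with a few lines of script. The following Python sketch is only an illustration of the idea, not the actual test data set. The function name, file names, folder names and stream name are all invented here. It assumes that on Windows long-path support is enabled (otherwise creating the deep folder will fail), and it only writes the alternate data streams when run on Windows, since they are an NTFS feature.

```python
import os
from pathlib import Path


def build_test_tree(root, levels=12):
    """Seed root with a Unicode-named file, a long-path file and (on Windows) ADS's."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)

    # 1. A file name containing Cyrillic characters.
    unicode_file = root / "тестовый_файл.txt"
    unicode_file.write_text("unicode name test\n", encoding="utf-8")

    # 2. Nested folders so that path/filename exceeds 255 characters.
    deep = root
    for level in range(levels):
        deep = deep / f"level_{level:02d}_long_folder_name"
    deep.mkdir(parents=True, exist_ok=True)
    long_file = deep / "long_path_file.txt"
    long_file.write_text("long path test\n", encoding="utf-8")

    created = [unicode_file, long_file]

    # 3. Alternate data streams: only meaningful on Windows with NTFS.
    if os.name == "nt":
        for target in created:
            with open(f"{target}:test_stream", "w", encoding="utf-8") as ads:
                ads.write("hidden stream test\n")

    return created
```

Zip the resulting folder, unzip it somewhere else, and see which of the seeded items survived the trip.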

I created those three types of files because, in a backup scenario, or in an investigation whose evidentiary output will probably be sent to an opposing party (attorney), one or all of these types of data (files) might need to be produced. Forensic suites and other forensic software operations may routinely export files with any or all of these items.

Also, I have seen discussions, where persons have been asking which methods people use to store data for "posterity". That’s a long time, not your rear end.

The common methods of storing or delivering data are in a zip format. Not only for space saving, but also for inclusion into a single file for distribution to a requesting party.

So, let's test some of the zip(ping) capability of these programs.

Again, I’m not going to name names, or identify which program failed in which area so here are the general results. If you think some of the items may belong to your processes, you might test the software yourself. What a novel idea.

First:  All zipped and unzipped all filenames correctly (UNICODE, etc).
Second: Two of three processed Long Filenames correctly. (see ADS below)
Third:  Only one processed ADS’s in normal and LFN filenames.
Fourth: Only one of them reset the last access date of the original files after the zip process.
Fifth:  Two of three properly reset the last access date of the restored file to the original access date.
Sixth:  Two of three properly reset all the MAC dates of the restored file. The third only reset the original ‘M’odified date.

Here is a quick and dirty spreadsheet showing which program did what. If you want to know the real name of the program contact me at: dm at dmares.com

Program #   Unicode   LFN    ADS    Reset Src Access   Reset Dest Access   Reset MAC Dest.
# 1         PASS      PASS   PASS   FAIL               PASS                MAC
# 2         PASS      FAIL   FAIL   FAIL               PASS                MAC
# 3         PASS      PASS   FAIL   FAIL               FAIL                M
# 4         PASS      PASS   PASS   PASS               PASS                MAC

Program #1: even though it captured the ADS's in both the normal and long filenames, the options to obtain that capture proved to be very confusing. I had to try to create the zip file more than 5 times before the ADS's were properly captured. Program #2: the GUI interface was nothing less than horrible to work with. So much so that I uninstalled it as soon as the tests were completed.

Special note on Program #4, which is WINRAR and is used in both Linux and Windows. It is quite inexpensive (actually I think it's shareware, but I paid for a license). When I first tested it, the program did not have the ability to reset the source last access date. However, with one simple request, and what I think was a reasonable evidentiary explanation, the programmers agreed to include the reset of source last access in the next version. Well, as of December 11, 2019, version 5.80 has all the capabilities which I tested, and it has passed all my tests.

I personally prefer the command line, since I have more control. Just take a look at all the available commands and options with the command line version of WinRar (called rar.exe). "Technically" there is a limitation to the length of the path/filename in WinRar. But it is normally not a concern. If you have an exceedingly long path/filename (>2047 characters) I suggest you get to reading War and Peace. (Just a joke.) The 2047 limit should be enough for most instances. In the next section I have provided a command line that seems to work very well at creating a self extracting exe which passes all my tests. You may want to check it out.

"c:\path_to_winrar\rar"  a -sfx   -r -ts+ -tsp  -os _DEMO.EXE  -zc:\"program files"\winrar\comment.txt   folders/files-to-ad  -ppassword

The content of the comment.txt file which contains routine required options is:
The comment below contains SFX script commands which will cause the extraction of the .exe to be silent (not ask for user input) and
overwrite any existing files during the extraction. The other items, which begin with a semicolon, are unnecessary and are included
for other purposes not needed at this time.


Silent=2                                                
Overwrite=1                                             
;Setup=setup.exe                                         
;SETUP=setup16.exe          not needed    
;Presetup=hello.exe                                      
;Path=C:\temp\default_unzip_path                         
;PATH=.\.      
;SavePath                                                


The subsequent extraction/unzip command line (which is easily included in a batch file) is:
C:DEMO.exe -s2 -tsp -tp+ -os -ppassword 

Even though the above process seems to work and passes all my forensic, evidentiary preservation requirements, this mention is in NO WAY an endorsement of WINRAR. Don't take my word for it, and test for yourself any zipping program you use and be comfortable with its operation. I have tested and use WINRAR when preparing all my test data and it has worked admirably.

Consider any or all of the above shortcomings when you are archiving your files, or preparing them for discovery.

In short, only one of the zip programs tested in this minimal test process passed all the tests. The tests included: Unicode FileName retention, LFN, ADS, reset ALL appropriate MAC dates (of source and restored files).
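For the name and content side of those tests, a round trip is easy to check: inventory the source tree, inventory the unzipped tree, and compare. Here is a minimal Python sketch of that comparison, with made-up function names `manifest` and `compare_trees`. It is a sketch under assumptions, not a replacement for a proper forensic hash program. It reads each file whole into memory, it does not look at alternate data streams or dates, and reading the source files to hash them may itself update their last access dates, so record the dates first or hash a working copy.

```python
import hashlib
from pathlib import Path


def manifest(root):
    """Map each file's relative path to (size, SHA-256 hex digest)."""
    root = Path(root)
    entries = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Whole-file read: fine for a small test set, not for huge files.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            entries[path.relative_to(root).as_posix()] = (path.stat().st_size, digest)
    return entries


def compare_trees(source, restored):
    """Report files missing from restored, unexpected extras, and content changes."""
    src, dst = manifest(source), manifest(restored)
    return {
        "missing": sorted(set(src) - set(dst)),
        "extra": sorted(set(dst) - set(src)),
        "changed": sorted(n for n in src.keys() & dst.keys() if src[n] != dst[n]),
    }
```

A file that was stored under its 8.3 name, or a Unicode name that came back mangled, shows up as one entry under "missing" and another under "extra".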

AND: When you actually think about it, isn't a "zipping" process a sort of copy method for retention, discovery, safe data saving? What evidence might you be missing in the zip process? Also, is next year's version of the zipping program going to be able to unzip last year's version? Or is product 'A' capable of processing the zip file of product 'B'?

A final thought, but not included in the above list. I tested a recent "free" version of PGP (v8). It compresses, and lo and behold, also had failures. However, since I don't use PGP to compress, only encrypt, I didn't include it in the statistics.

So, which zipping program are you using to store and restore your legacy data or evidentiary file data?

Associated articles and programs of interest:
Inventory/Catalog files  Creating an inventory of evidentiary files
Forensic file copying  Article tests over 40 "forensic" file copiers
Forensic Hashing  Article tests over 30 "forensic" hash programs.
ZIP_IT_TAKE2  More tests for your zipping capabilities.
MATCH FILE HASHES  Demonstrates hash matches using Maresware.
A HASH software buffet   How-to use Maresware hash software

Test data  containing about 30 files, in a self extracting executable

 

I would appreciate any comment or input you have regarding this article. Thank you. dan at dmares dot com,