SHA_VERIFY



Author: Dan Mares, dmares @ maresware . com (you will be asked for e-mail address confirmation) or dan.mares @ norcrossgroup . com
Portions Copyright 2000-2014 by Mares and Company, LLC
Phone: (770)770-242-6687 x 119
Last modified: Nov 2, 2010

PURPOSE

The SHA_VERIFY program is designed to perform a number of hashing type operations on files or groups of files.

During its operations it can perform any or all of the following hash types: CRC32, MD5 (128 bit), SHA1 (160Bit), SHA2 (256 Bit), SHA2 (384 bit), SHA2 (512 bit).

One of its capabilities is to perform the hash on a single file or group of files. Nothing new here.

Another capability is that it can emulate, in memory, the hash calculation of X sectors that all contain a single character. How many of you wipe drives with a single character and really don't have a good way to show that the drive is "TOTALLY" wiped? You could perform an md5 of the physical drive, but how do you know that the value you got for 10,000,000,000 sectors of hex 00 is correct? This program will, in a few minutes, emulate in memory the md5 calculation of that many sectors. When the md5 of the physical drive matches the emulated value, you know every sector was wiped.

The same calculation can be done if you tell the program to emulate XXX bytes of data of a specific value. So if you have a file of 10meg of hex 00's, you can determine what its hash value should be.
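
The emulation idea can be illustrated with a short Python sketch. This is not the SHA_VERIFY source, only a minimal illustration using the standard hashlib module. The names emulated_hash and emulated_sector_hash are hypothetical, and the 512-byte sector size is an assumption that the text above does not state.

```python
import hashlib

SECTOR_SIZE = 512  # assumption: 512-byte sectors; the manual does not state the size


def emulated_hash(byte_value, total_bytes, algorithm="md5", chunk_size=1 << 20):
    """Hash total_bytes copies of byte_value without holding them all in memory."""
    if not 0 <= byte_value <= 255:
        raise ValueError("byte_value must be in the range 0..255")
    hasher = hashlib.new(algorithm)
    # One reusable buffer filled with the chosen character.
    chunk = bytes([byte_value]) * chunk_size
    full_chunks, remainder = divmod(total_bytes, chunk_size)
    for _ in range(full_chunks):
        hasher.update(chunk)
    # Feed the leftover bytes that do not fill a whole chunk.
    hasher.update(chunk[:remainder])
    return hasher.hexdigest()


def emulated_sector_hash(byte_value, sector_count, algorithm="md5"):
    """Same calculation, with the length given as a number of sectors."""
    return emulated_hash(byte_value, sector_count * SECTOR_SIZE, algorithm)
```

Under those assumptions, emulated_sector_hash(0, 10000) would correspond to the -s 10000 -c 0 command line, and emulated_hash(0, 10000) to -t 10000 -c 0.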

And finally, the most important and primary use of the program is to calculate the hash total of the "combined" group of files with the same base name. The base name MUST have a traditional dot (.XXX) extension. This group of files with the same base name is a set of output files usually created while using a dd type imaging program to create images of disk files. During this operation, the SHA-1 (160 bit) value is calculated by default. So in the output file you would get two values, one for the MD5, and the second for the SHA value. If you want more than the 160-bit SHA-1, you need to add options.

Often, when using dd or a similar imaging program such as the Maresware ntimage, a group or set of output files is created with sequential index numbers as their extensions. For instance, if the output file base name were image, the extensions would be .000, .001, .002, etc., until enough files had been created to encompass the entire physical disk that was imaged.

With these separate images and the hash log created with dcfldd or ntimage, sha_verify can check the hash integrity of each image segment and compare it against the hash log file that was created during the imaging process. If a segment is not correct, it will notify the user (and place the information in an output file, if one is chosen) that image xyz is a mismatch. (See the -m and -v options.)

Why SHA_VERIFY?

Because the stage of the process that occurs between the dcfldd calculation of the md5 (in memory) and the actual write to the output file is not included in any md5 calculation. If a write error in some way corrupted the data being written to the output file, you wouldn't know. I personally have had this occur. The only way to determine if there was a write error to the output files, is to perform an MD5 on the final set. The Linux program md5sum can do this, but SHA_VERIFY produces a more usable output.

When you perform a dd of a physical disk, even if you use dcfldd (of which there is an enhanced version on the Mares and Company ftp site) you are calculating the md5 hash of the original disk on the fly. A simple process might be:

md5sum   /dev/hdX > someoutputfile
dcfldd   if=/dev/hdX hashwindow=2000M hashlogfile=logfile .... | split -b 2000m -a3 -d - output.

This will produce an md5 of the original disk (the md5sum line), and an md5 hash listing from the dcfldd program (hashlogfile) as it processes the data through memory. BUT how do you know that the output.xxx files all total to the correct hash? You could then take the output.xxx files and perform the following test:

cat output.* | md5sum

This would produce a single value abcdef111.... of the final md5. If it matches, you have a warm fuzzy feeling. What if it doesn't match? You have to go and perform the entire image again.

What if you had a way of knowing which file in the set was the one that contained the write error? If you did, then you could use dcfldd, dd, or ntimage to reimage just that part of the disk. In the least, you would know which one of the set had an error in it and make appropriate adjustments.

With SHA_VERIFY you will get an output that contains the md5 value of each file in the set and, more importantly, the total combined md5 of the entire data set. This value is the same as would be reported by the md5sum program. But you now have the individual md5 values of each file, so the one which contained the error can be easily identified. (Also see the -m and -v options.)
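
A minimal Python sketch of the per-file and combined calculation described above follows. It is not the SHA_VERIFY implementation. The function name merged_md5 is hypothetical, and the sketch assumes the caller passes the segment files already in their correct sequence.

```python
import hashlib


def merged_md5(paths, chunk_size=1 << 20):
    """Return ([(path, md5_hex), ...], combined_md5_hex) for paths read in the given order."""
    combined = hashlib.md5()
    per_file = []
    for path in paths:
        single = hashlib.md5()
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                # Each chunk feeds both the file's own hash and the running total.
                single.update(chunk)
                combined.update(chunk)
        per_file.append((path, single.hexdigest()))
    return per_file, combined.hexdigest()
```

Because every chunk is fed to both hashers, each segment is read from disk only once. The combined value equals what cat output.* | md5sum reports when the files are supplied in the same order.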

The output can be set to contain delimiters for additional processing or loading into a spreadsheet.


OPERATION

Very simply, the program takes from the user a set of command line options which tell it what operation to perform: a simple hash, an emulated hash, or a merged hash. It then does the calculation and, if asked, writes the output to an output file.

If requested to perform the merge option, it identifies the proper sequence of the extensions (.000, .001, etc.; it is not always true that the OS will sort them correctly by default) and calculates the md5 of not only the individual files, but also the md5 of the combined files as if they were one contiguous stream of data, which is the way the data came off of the hard drive.
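
To show why the sequence matters, here is a small Python sketch that orders segment files by the numeric value of their extension. How SHA_VERIFY itself determines the sequence is not documented here, so this is only one plausible approach. The function name segment_order is hypothetical, and the sketch assumes the extensions are purely numeric.

```python
import glob
import os
import re


def segment_order(base_name):
    """List base_name.NNN files sorted by the numeric value of the extension."""
    numbered = []
    for path in glob.glob(glob.escape(base_name) + ".*"):
        extension = os.path.splitext(path)[1][1:]
        # Skip anything that is not a numeric index, such as a .log file.
        if re.fullmatch(r"[0-9]+", extension):
            numbered.append((int(extension), path))
    return [path for _, path in sorted(numbered)]
```

A plain alphabetical sort would place image.1000 before image.999. Sorting on the integer value keeps the files in the order in which the data came off the drive.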

After the calculations are made, the filename, and md5 of each file is printed to the screen, or placed in an output file, and at the end, the final md5 is also provided.

If the final md5 matches the output of dcfldd, or the original md5sum of the physical drive, you know you have a good data set. If the final value is incorrect, you can proceed to correct the problem.


Program Output:

The output record is different for each operation. So no samples are provided here.

The output of the merge operation is intended to be placed in an output file for future reference, such as verification that files were not altered. This is important when certifying that file contents were not altered during forensic examination or duplication for analysis. These hash values can also be used when making copies of the images to, say, a work array/drive. Don't try to tell anyone that every time you copy a file, it copies correctly.



OPTIONS

All options should be preceded by a (-) minus sign. Some can be grouped together.

USAGE:
sha_verify  -[options]
            -[Oo] outputfilename: Write output to filename.
                  -O includes date/time & command line in output file.
            -f filename: to perform MD5 and SHA on. (only 1 filename allowed).
            -m filetype.*: to merge hash values and treat files as one.
                           use this to merge hashes of dd image files.
                           there should be a sequence to the file extensions.
            -v ntimage_hashlogfile: logfile created by ntimage. use only with -m option.
            -t amount:   of characters to perform MD5 and SHA on.
            -s amount:   symbolic number of sectors to perform MD5 and SHA on.
            -c value:    decimal value of buffer to do operations on, 'A'=65.

            -C:          calculate 32 bit CRC of items.
            -256:        do SHA2 256 bit hash.
            -384:        do SHA2 384 bit hash.
            -512:        do SHA2 512 bit hash.
            -@:          print copyright for SHA2 code.


-f + filespec:  If you want the operation on only 1 file, use this option. (-f   filename.ext)

-oO + filename:  Output file name. Write the output to the named file. With uppercase -O, the date and time of the run and the command line are also added to the output file.

-m + filename.*:  The base name, plus the * wildcard for the extension, of the files whose contents are merged when performing the calculation. Use this if you want to verify the outputs of the dd type programs.

-v + ntimage_hashlogfile:  The name of the hashlog file created with dcfldd or NTIMAGE. This option verifies the -m files against the contents of the hashlog file. The -m option is mandatory with -v, so that this option has files to process as a reference point. The hashlog file has a specific format requirement, as shown here.

The hashlog file must be pipe delimited and have the hash value as the last field. It must also start with the first hash value, associated with the first image.index value. The program reads the hashlog one line at a time and expects the image files to be in the same order. If the image files were indexed .000 or .001 etc., then there is no problem. Sample hashlog format:

0001  |  12345  |   0123456789abcdef
0002  |  12345  |   0123456789abcdee
0003  |  12345  |   0123456789abcddf
0004  |  12345  |   0123456789abddef
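
The line-by-line comparison described above might be sketched in Python as follows. This is an illustration only, not the SHA_VERIFY code. The names read_hashlog and find_mismatches are hypothetical, and raising an error when the counts differ is an assumption of this sketch, since the manual does not say how that case is handled.

```python
def read_hashlog(lines):
    """Return the last pipe-delimited field of each non-blank line, lowercased."""
    hashes = []
    for line in lines:
        if line.strip():
            # Only the final field is used; earlier fields are ignored.
            hashes.append(line.rsplit("|", 1)[-1].strip().lower())
    return hashes


def find_mismatches(segment_hashes, logged_hashes):
    """Compare hashes position by position; return the 0-based indexes that differ."""
    if len(segment_hashes) != len(logged_hashes):
        raise ValueError("hashlog and image set have different lengths")
    return [
        index
        for index, (actual, expected) in enumerate(zip(segment_hashes, logged_hashes))
        if actual.lower() != expected.lower()
    ]
```

Here segment_hashes would hold the hash of each image segment in sequence order, and every index returned identifies a segment that would need to be reimaged.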

-t + amount:  of characters to perform MD5 and SHA on.

-s + sector count:  symbolic number of sectors to emulate the calculation on.

-c + character:  decimal value of the character to use as the content of the data for the calculation. Usually -c 0, for hex 00.

-C:  add a 32 bit CRC to the calculations.

-256:  add a SHA2 256 bit calculation (SHA1 always defaults).

-384:  add a SHA2 384 bit calculation (SHA1 always defaults).

-512:  add a SHA2 512 bit calculation (SHA1 always defaults).



COMMAND LINES

c:>SHA_VERIFY -f filename.ext -o output
perform md5 and sha1 on filename.ext and place to output file

c:>SHA_VERIFY filename.* -o output
same as above but calculating all filename.* files.

c:>SHA_VERIFY -f filename.\* -o output -s
add the sha1 value to the output line.

c:>SHA_VERIFY -t 10000 -c 0
calculate 10000 characters of hex 00.

c:>SHA_VERIFY -s 10000 -c 0
calculate 10000 sectors of hex 00.

c:>SHA_VERIFY -s 10000 -c 0 -256
calculate 10000 sectors of hex 00 and add the SHA2 256 calculation.

c:>SHA_VERIFY -m filename.* -o output
find the files filename.* and "merge" the hash values as if they were a single physical drive value. Use this to verify the dd output.

c:>SHA_VERIFY -m filename.* -v \tree\hashlogfile -o output
find the files filename.* and open the hashlog file. Read each line of the hashlog file and compare the hash with each sequential index of the filename.* image files.

NOTE: the -o output can be used with all the options, and is suggested with the -m option.

RELATED PROGRAMS

MD5_VERIFY Linux version of sha_verify.

MD5

HASH or the Linux version hashl