Diskcat




Author: Dan Mares, dmares @ maresware . com
Portions Copyright © 1998-2017 by Dan Mares, and Mares and Company, LLC
Phone: 678-427-3275
Last update: October 2016
(10-22-2016: enhanced the -88 option to include a column of the 8~3 filename)
(12-05-2013: fixed date display when windows reports a date of 00-00-0000, see -T option)
(11-26-2013: added (.) dot to -T option)
(07-13-2012: fixed -I option)
(07-13-2012: added --sequence #)
(12-20-2011: added --levels options)
(12-2011: fixed bug in processing ADS's and file extensions)

GET diskcat.exe

(64 Bit version is available in beta form. Please call if interested.)
(16 Bit version no longer supported, and may be available as a free download.)

NOTE: This is a command line program. It should (in practice, must) be run as administrator. It will process long path/filenames > 255 characters. However, even as administrator, WIN7 and later may not allow access to certain protected directories. You should test the access as needed.

NOTE: These command line programs (hash, diskcat, and upcopy) WILL process files with long filenames ( > 255 characters), which are seen more and more in modern file systems. If you are using other hashing software, you should test its capability to process long filenames. I have tested a number of popular command line and GUI hashing and forensic copy programs, and a significant number have not been updated sufficiently to handle long filenames. Some cannot process long filenames at all. Others can only find and process a single file at a time, which is not very useful in forensics. Still others may be able to find a file through the GUI, but can't do a recursion. So I urge anyone who is planning on using a hashing program on a current filesystem to check the capability of that program on the filesystem it will be used on.

I have a file containing approximately 82 files with long filenames. You can extract these files and then test your software to see if it finds them. After you download the file, you must unzip the .zip file, then use 7-zip to extract the long filename files from the .7z file inside. The 7z file was zipped to allow for automatic download, as most browsers don't know what to do with a .7z file extension, and the 7z file was created because I have had little to no luck using winzip or pkzip to properly store a long filename path.

After you successfully un-7zip the file structure, you can use the 64 bit diskcat.exe program to confirm that there are long filenames in the structure. Use the --showlong option; this should produce a listing of about 82 files with path/filenames greater than 255 characters.



Purpose

Diskcat, in its basic operation, will traverse the entire directory structure of your hard disk and create a listing (disk CATalogue) of all files and/or directories on a hard, USB or floppy disk.

It is designed to be used for investigative/forensic purposes by creating a catalog of files on hard or floppy disks. The output is a fixed length record which lends itself to importation into a database for further analysis or sorting.

The newer versions (after April 2009), which show "32 bit unicode" in the banner, will process path/filenames longer than the 255 character limit previously set in Windows. The new upper limit is the *IX compatibility limit of 32000 characters. For this to work, the banner must have the unicode signature in it.

In addition to creating a catalog listing, it has many options which can be used to increase its forensic usefulness. For instance, it can create a CRC for each file for security and file validation. It can check the header of each file to determine the file type or detect a mislabeled file. If the +H (upper case H) option is used, it will place in the output file ONLY those files which match the types of headers the user provided in the “header.fil”. As of August 2006, the user can also designate a "Category" to place each header in.

It can also search for: specific file types; files of specific dates or sizes; and can, in effect, be programmed to search for files meeting specific criteria. This operation can turn Diskcat into a “findfile” program.

When “cataloging” files from multiple/different disks it can “tag” each output record with the specific label indicating the disk that contained that file. This leads to easier location of files at a later date when searching for them on floppy disks.

When run on an NTFS file system, the 32 bit version also has the capability of showing files with associated Multiple Data Streams (-s), and of showing the owner of the file (-u or -U username).

For each file listed it can execute a specific program or DOS command on that file. For instance, if the user asked Diskcat to locate all *.zip files it would list all those files with a .zip extension. Then the user could ask it to execute the PKZIP -v command on all the .zip files found. This would effectively produce a listing of all the files contained in all .zip files found on the disk being examined.

Other programs could be run for personal directory maintenance. Diskcat could locate files over XX days old, and then run the MS-DOS del command on those files to clean out the disk.

A user-designed batch file could also be run on files selected by Diskcat, thus allowing the user to accomplish almost any operation on the file.

As of April 2009, a newer (beta) version of diskcat is available. It is not the version supplied by default, but exists for those who wish to try it. The main difference is that this new version can process filenames longer than the 255 character limit. Generally speaking, it can handle true unicode filename lengths up to 32000 characters.



MD5, SHA1, CRC options

Another very useful feature is the -C option for CRC, or cyclic redundancy check, and the -5 option for MD5 hash, or the --SHA option for the SHA-1 calculation.

The -5 option generates an MD5 hash of every file it encounters. Because this means opening every file, you should ensure that you have turned off the update of the last access date in the registry, are using WIN7, or use the -R option.

The --SHA option (that's a minus minus option) generates the SHA-1 value of the file.

These options are quite helpful when using the program to check for a corrupted file or program. Each option generates its value and places it at the END of the record.
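As a rough illustration of what these three values are, the sketch below computes a CRC, an MD5 and a SHA-1 for one file in Python. It is not Diskcat's code: the function name file_digests is hypothetical, and the use of zlib's CRC-32 is an assumption, since this page does not say which CRC variant Diskcat calculates.

```python
import hashlib
import zlib


def file_digests(path, chunk_size=65536):
    """Return (crc32_hex, md5_hex, sha1_hex) for the file at path."""
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # assumed CRC variant (CRC-32)
            md5.update(chunk)
            sha1.update(chunk)
    return format(crc & 0xFFFFFFFF, "08X"), md5.hexdigest(), sha1.hexdigest()
```

Note that reading a file this way opens it, which is exactly the kind of access that can change the last access date discussed above.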



Operation

Diskcat's default is to recurse the entire directory tree from its default directory (if you were at root, this means the entire drive), and to list every file to the screen. It produces an output listing which is normally in a three column format. The first column is the filename including the path (defaulted to 60 characters); the second column is the filesize; and the third column is the file attributes. If an output is selected, the disk serial number of the source drive is added as a default fourth column, as seen below (at Sample Output).
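The basic idea of a recursive catalog with a fixed width name column can be sketched in a few lines of Python. This is only an illustration of the concept, not how Diskcat is written: the function name catalog and the 12 character size column are assumptions, and file attributes and the disk serial number are left out because they are Windows specific.

```python
import os


def catalog(root, name_width=60):
    """Yield one fixed-length record (path, size) per file under root."""
    for folder, _subdirs, files in os.walk(root):
        for name in sorted(files):
            full = os.path.join(folder, name)
            size = os.path.getsize(full)
            # Pad or truncate the path to name_width so every record
            # has the same length and the columns line up.
            yield f"{full:<{name_width}.{name_width}} {size:>12}"


if __name__ == "__main__":
    for record in catalog("."):
        print(record)
```

Because the path is truncated to the column width, a wider name_width is needed for long paths, which is the same trade-off the -w option deals with.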

Columns such as CRC, date, time, disk “Label”, file type, file owner are added as the various options are chosen.

With the 32 bit version, long filenames are handled effortlessly. For extremely long filenames the -w option should be considered.

A lot of programs inadvertently adhere to the old Windows limit of 255 characters for the total path/filename. Diskcat handles long filenames ( > 255 characters) with no problem. See the --showlong option, and download the sample long filename files.


Sample Output

Below is a sample of the normal output. (Filename length and spaces have been truncated in order to fit on the page.) Also notice the alternate data stream identifier for the junk file.

Disk serial number is provided if the Label (-i) option is used. If the -I (Label ) option is used, then the serial number is replaced by that user-supplied label.


Path/filename                      filesize  attrib  disk_serial_no.

D:\WORK\VER20\WINREL\DISKLABL             4  A....E  24F9-7921
D:\WORK\VER20\WINREL\DISKCAT.INI        350  A.....  24F9-7921
D:\WORK\VER20\WINREL\BASE.obj         28267  A.....  24F9-7921
D:\WORK\VER20\WINREL\OPTIONS.obj      29454  A.....  24F9-7921
D:\WORK\VER20\WINREL\FIXNAME.obj       2519  A.....  24F9-7921
D:\WORK\VER20\WINREL\sortfile.obj      3535  A.....  24F9-7921
D:\WORK\VER20\WINREL\diskcat.obj      36419  A.....  24F9-7921
D:\WORK\VER20\WINREL\EBC_ASC.obj       3727  A.....  24F9-7921
D:\WORK\VER20\WINREL\INIT_LIB.obj      3501  A.....  24F9-7921
D:\WORK\VER20\WINREL\DMP_FUN.obj       3248  A.....  24F9-7921
D:\WORK\VER20\WINREL\DISKCAT.exe     156872  A.....  24F9-7921
D:\WORK\VER20\WINREL\junk                10  A.....  24F9-7921
D:\WORK\VER20\WINREL\junk:alt.txt        15  ADATA.  24F9-7921



Disk Labels

There is also an option (-I), for Identifier or insert, which provides for a literal tag or disk label to be added to the record. This option allows for an automatic labeling or a manual label input by the user. The automatic labeling is suggested when cataloging multiple disks in a forensic setting.

This labeling option is only allowed if you are using it in conjunction with the -aO output option, which creates/appends an output file containing the results of the program. The disk label is normally used when creating disk catalogs of numerous disks. You can provide a unique label (up to 9 characters long) for each disk. If the disk label ([-I label] option) is used, then the disk serial number is replaced by that label. It is suggested that all the disk labels, if used, be of the same length so that when the file is printed the disk labels all line up properly. (REMEMBER: if you are using this program to catalog many floppy disks, always use the -[aO] (append) options to cause the output file to be appended.)

An alternative to keying in a separate disk label each time with the -I option is to use the lower case -i option. The lower case option is an automatic number incrementing option. The program must be run from a default hard disk directory. It then looks for a file called DISKLABL in that directory. If it doesn’t find one, it will create it. It picks up the 10 character ascii contents of the file DISKLABL; if none is there, it starts the label numbers at 1001. This 1001 is used as the label to add to each record of the output file, just as if you had keyed in -I 1001. The program then places the 1001 in the DISKLABL file and closes it.

Then, when the next disk is catalogued and the program finds 1001 (the last label number used) as the contents of the DISKLABL file it takes that 1001 and adds 1 to it to make it 1002. This 1002 then becomes the label to add to the output file records. It also replaces the DISKLABL contents with 1002 so the next time the program is run it will find 1002, and increase it to 1003 etc.

If, however, you want to start the numbering at a specific place, such as a case number, a search site number, or an alphanumeric label, you should first create the DISKLABL file and place in it the ascii contents of the number you wish to start at, less 1. For example, if you want it to start at MAR1001, the initial contents of DISKLABL should be MAR1000. The program will subsequently take care of incrementing the numbers. No provisions are made for incrementing the alpha section of the label, and the number part MUST be at the end.
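The incrementing described above can be sketched as follows. This is a minimal Python illustration, not Diskcat's actual code: the function names next_label and advance_label_file are hypothetical, and keeping the same number of digits (so 0099 becomes 0100) is an assumption the page does not state.

```python
import re


def next_label(current, start="1001"):
    """Return the label that follows current, or start if there is none yet."""
    if not current:
        return start
    match = re.search(r"(\d+)$", current)
    if match is None:
        raise ValueError("the numeric part of the label must be at the end")
    prefix = current[:match.start()]   # alpha section, never incremented
    digits = match.group(1)
    # Assumption: keep the digit count so labels stay the same width.
    return prefix + str(int(digits) + 1).zfill(len(digits))


def advance_label_file(path="DISKLABL"):
    """Read the last label from path, store the next one there, and return it."""
    try:
        with open(path, "r", encoding="ascii") as handle:
            current = handle.read().strip()
    except FileNotFoundError:
        current = ""
    label = next_label(current)
    with open(path, "w", encoding="ascii") as handle:
        handle.write(label)
    return label
```

With a DISKLABL file containing MAR1000, the first call to advance_label_file returns MAR1001 and the second returns MAR1002.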

The default disk label is the disk serial number (if no other label was chosen).



16 and 32 Bit Versions

The 16 bit version is no longer supported.

There are certain differences between 16 and 32 bit versions:

FILE ACCESS TIME: Using any version of Diskcat with any of the following options: (-h, -z, +h, or -c) will alter the last access date on an NTFS or WIN95 file system. This may cause an evidentiary problem for some investigations. (See It’s About Time in the hash.exe documentation for a full explanation.)

The 32 bit NT version can be set to replace the original last access time of the file if the -R option is used. (This can also be accomplished with an environment variable of RESET.) When running the program without one of the options that “OPENS” a file the last access date is not altered. (You can verify this for yourself before using it on evidence. The command <mdir.exe> can be used to verify last access times of files on NTFS.)



Options

Diskcat is INI capable. INI keywords here are in [BOLD, ALL CAPS].

All options should be preceded by a (-) minus sign (with the exception of two of the +hH options). Some can be grouped together, and others MUST be grouped without a space (they will be specified as to which style to use). The options are grouped where appropriate.

Some options are only active in the 32 bit version running on an appropriate file system because they deal with specific 32 bit items like MDS (Multiple Data Streams) or file times.

-p + path(s)    If more than one directory is to be looked at, then add the paths here as appropriate. (-p c:\windows d:\work) [PATH]=path

Some options may conflict with one another, and be mutually exclusive. I have made every effort to notify the user when conflicts occur, or they are mutually exclusive. But when using convoluted mixtures of options, please test the results.

--path=single_path_to_traverse;  Only a single path is used in this -- option.

--showlong   If there are files in the current path with path/filenames longer than 255 characters, a lot of programs fail to find and display them. Diskcat has no problem finding and displaying them. To make it easy to confirm whether such files exist, the --showlong option will display ONLY those whose path/filename is greater than the 255 character limit. When testing software, it might be advisable to confirm that it works on long filenames. If you are unsure that you have test files, you can download a sample 7-zip file here. Unzip the 7z file, then unzip the contents maintaining full path. You should end up with about 80+ files with paths longer than 255 characters.

--levels=xx;    (12/2011) The --levels option limits recursion of the path/directories to xx levels. So if xx was a 2, the program will only recurse two directory levels from the top or starting location dictated by the -p option. So if you are starting X levels down based on the -p option, this will add to the number of total levels reflected in the output. This option ONLY produces a listing of directories, and the -AD option is also needed. If the -AD option is not included, results are unpredictable: it produces trash or may not run.

-f + filespec    If more than one file type is needed, add them here. (-f *.c *.obj *.dll) [FILES]=filetype

If the above options are used, the program builds a matrix of paths and file types. It searches all the requested directories for all the requested file types, thus producing a total of all the files in all the paths requested. These options are added to any default command line provided.
(C:>diskcat c:\work\*.c -f *.dll -p d:\windows)
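The "matrix" of paths and file types can be pictured with a short Python sketch: every requested directory is searched for every requested file pattern. The function name matching_files is hypothetical, and this is only an illustration of the idea, not Diskcat's implementation.

```python
import fnmatch
import os


def matching_files(paths, patterns):
    """Return every file under any of paths whose name matches any pattern."""
    found = []
    for root in paths:
        for folder, _subdirs, files in os.walk(root):
            for name in files:
                # A file is kept if it matches at least one requested type.
                if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                    found.append(os.path.join(folder, name))
    return sorted(found)
```

For example, matching_files([r"c:\work", r"d:\windows"], ["*.c", "*.dll"]) would collect both file types from both trees into one list.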

--filename=single_file_type    Only a single filetype is used in this -- option.

-x + filespec    E(x)clude these file types from listing (same format as -f option) (-x thesefiles.txt) [EXCLUDE]=filetype

--exclude=single_file_type_to_exclude    Only a single filetype is used in this -- option.

-oO + filename    Output file name: place the output to a filename. If uppercase ‘O’ then existing output is appended to. The special output option -ostdout should be used if you wish to redirect the output to another file or directly to a printer. This option (-ostdout) may not work with some other options.
Or if you are mouse reliant, you can redirect output using redirection:    2> outputfilename
INI file syntax: [OUTPUT]=filename

--output=outputfilename    Same as above except output is always appended to.

-oO + YY[YYMMDDhhmmss[=:]literal_text]:     Causes the output file to be named with today's date based on the mask used. A .txt extension is added unless the user includes an extension in the mask name. If literal text is included as part of the output name, you must use either the "=" or ":" delimiter in the mask, or else it is ignored. At a minimum, YY must be the first item. You can then add additional modifiers to refine the output name. This option is especially helpful when you are creating the catalogs with batch scripts run periodically: depending on the mask used, the output filename will reference the date and time of the run. The modifiers are case dependent, and add the following:

YY = two digit year, 12
YYYY = four digit year, 2012
MM = two digit month, 07
DD = two digit day, 31
hh = two digit hour, 22
mm = two digit minute, 30
ss = two digit seconds, 15
[=:]literal text2add (adds the literal to the filename)

So a format of YYMMDDhhmmss:filename or YYMMDDhhmmss=filename would result in,
20120731_223015filename.txt
Notice that after the date, there is an embedded underscore before the time. Sorry, this is the way it is.
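To make the mask rules concrete, here is a small Python sketch that expands a mask into a filename. It is an approximation based only on the description above: the function name output_name is hypothetical, treating a dot in the literal as "the user supplied an extension" is an assumption, and the YYYY token is used in the usage note because the sample result above shows a four digit year.

```python
from datetime import datetime

# Mask tokens from the list above, longest first so YYYY wins over YY.
DATE_TOKENS = (("YYYY", "%Y"), ("YY", "%y"), ("MM", "%m"), ("DD", "%d"))
TIME_TOKENS = (("hh", "%H"), ("mm", "%M"), ("ss", "%S"))


def output_name(mask, now):
    """Expand a date mask such as 'YYYYMMDDhhmmss:case' into a filename."""
    literal = ""
    for index, char in enumerate(mask):
        if char in "=:":
            # Everything after the first delimiter is literal text.
            mask, literal = mask[:index], mask[index + 1:]
            break
    if not mask.startswith("YY"):
        raise ValueError("the mask must begin with YY")

    date_part = time_part = ""
    position = 0
    while position < len(mask):
        for token, directive in DATE_TOKENS + TIME_TOKENS:
            if mask.startswith(token, position):  # case sensitive match
                text = now.strftime(directive)
                if (token, directive) in TIME_TOKENS:
                    time_part += text
                else:
                    date_part += text
                position += len(token)
                break
        else:
            raise ValueError(f"unknown mask text at position {position}")

    # The underscore between date and time mirrors the sample name above.
    name = date_part + ("_" + time_part if time_part else "") + literal
    # Assumption: a dot in the literal means the user supplied an extension.
    return name if "." in literal else name + ".txt"
```

Called as output_name("YYYYMMDDhhmmss:filename", datetime(2012, 7, 31, 22, 30, 15)), the sketch returns 20120731_223015filename.txt.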

-V    Output records are variable length, with the full pathname remaining as the 1st item on the line. This guarantees that the full path is included. Also inserts pipe delimiters by default. Mutually exclusive with the -w(idth) xx option. A -w or a --nameafter option will disengage any -V option.

-w + #   Change the default width of the filename from a default of about 50 to any other specific value. If you have long filenames, this may be necessary to accommodate the entire name. With current path lengths, it is often advisable to use about 160 - 250 as a length. If a filename longer than 50 is used, the screen output tends to be more than one line long.

The -w option also has a special modifier, -w0. If -w0 is used as an option, then the filename itself is a 50 character first field, and the full path/filename is moved to the last field. It turns the record into a variable length record with the full path at the end, and also adds pipe (|) delimiters. Any -w option also turns off any -V option. A -w0 option is probably the best if you are going to import the output to a spreadsheet, as this gets you a filename as the first field and a full path as the last field of a delimited record.

If the --NAMEAFTER option (see below) is used, it will generally override the -w options. It causes the full path to be the last item in the record, currently set as a fixed value of 50 characters for the full pathname. Some combinations of -V, -w and --nameafter are mutually exclusive, and it takes some practice to get the proper mix of -w0 and --NAMEAFTER. When using these varied options, consider using a delimiter to show exactly where the fields end. So experiment freely. (-w 50) [WIDTH]=50

--NAMEAFTER[=nn]:  (This is similar to the -w0 option, except it makes the final field fixed at 50 characters). If you want the fullpath name to be moved from the first field to the end of the record, then use this --NAMEAFTER option. It takes the path/filename and makes it the last item on the record instead of first (the filename itself is still left as the first fixed length item). The default length of this fixed item is usually approximately 50 characters. However, if you add the (=nn) value where nn is a number to expand or contract the path, then it is sized accordingly. This option still results in a fixed length output, but the name is still the last item on the record instead of the first. If you want the last field to be truly a variable, and contain the entire full path/name just use the -w0 option.

-a      Append output to filename provided in -o option. Serves same purpose as using an upper case O. (-a) [APPEND]=[ON|OFF]

--sequence[=nnnn]  Add a "sequence" or record number to the beginning of each record. The width of the field is ALWAYS 6 characters with leading zeros. If the =nnnn is replaced with a numeric value, then the sequencing (record numbering) begins at that value. This will allow the user to start the record numbering at any pre-determined value. The equal = sign must be included. --sequence=1000, will start at 1000. The order of preference in the output record is: sequence no, -I label option, -C comment option.

-C + "comment"  Add a "comment" to the beginning of every record. This is very useful when ultimately merging many outputs from different locations or for different cases. The comment can uniquely identify the sources of the hash values. Example, (-C SUSPECT_CPU#1). The resulting output records would look something like this: "SUSPECT_CPU#1 C:\WINNT\....\filename etc."

-C + COMPUTERNAMExx  A special version of the -C option. If the literal COMPUTERNAME (all uppercase) is used, then the program will find the name of the computer and insert it there. This is kind of like a wildcard substitution: the user can let the system decide what to put there. This can then uniquely identify the source computer of the hash values. Example, (-C COMPUTERNAME). The resulting output records would look something like this: "CPU-2_ATLANTA C:\WINNT\....\filename etc.". If the xx is replaced by a numeric value, then the computer name field is made this many characters wide. (-C COMPUTERNAME20) becomes: "CPU-2_ATLANTA        C:\WINNT\....\filename etc."

-v;  No 'V'erbose. Do not print headers/footers to output file. (ini: Verbose=on)

-1 + filename   (That's a one, not an ell). The filename here is a file which will contain accounting/log information about the run. It is always appended to, and contains the command line plus statistics about how many files and time of run. The file can later be used as a batch file for duplicating the runs.
If you are running DISKCAT on a WIN7 or newer OS, then there will be reparse points. If this option is used, the reparse directories will be listed in the logfile. If the logfile is not initiated, then no reparse listings will be shown.
The ACCT environment variable can also be set. (SET ACCT=logfilename). Or use the .INI option [ACCT=filename]. The order of priority is: Environment, INI file, Command Line option. To explicitly turn it off use a +1.

--memo    Causes an interactive dialog with user which allows user to input up to 2000 characters of "memo" information. This information will be appended to the -1 logfile name.

--memo=memofilename    Creates/Appends a file called memofilename, and causes an interactive dialog with user which allows user to input up to 2000 characters of "memo" information. Difference is this version DOESN'T add to the -1 logfile.

-s    Do Not list Alternate Data Streams. (NTFS only). [STREAM]=[ON|OFF]

-u    NTFS only. Display owner name of the file.

-U ownername;  NTFS only. Display only files with this ownername.

-g + #    Where the # is replaced by a number indicating: list all files ‘g’reater than # days old. You can use a -gl pair to bracket file ages. [OLDER]=50

-l + #    (ell, not one) Where the # is replaced by a number indicating: list all files ‘l’ess than # days old. You can use a -gl pair to bracket file ages. To get today's files, use (-l 1) [NEWER]=10

-g + mm-dd-yyyy (greater than, older)
-l + mm-dd-yyyy[acw] (less than, younger) (that's an ell, not a one)
--younger=mm-dd-yyyy (obvious)
--older=mm-dd-yyyy (obvious)
Process only those files (g)reater than (older than) or (l)ess than (newer than) this mm-dd-yyyy date. The date MUST be in the form mm-dd-yyyy. It MUST have a two digit month and day (leading 0 if necessary), and it MUST have a 4 digit year. The date given is NOT included in the calculation. I.e., if today was 01-10-2003 and you entered -l 01-09-2003, you would only process today's files. If you wanted to include those on 01-09, you should have entered -l 01-08-2003. If you use the keyword "younger" or "older", you must also format the date as mm-dd-yyyy.

The [acw] literals choose which time to base the mm-dd-yyyy test on. Any or all of [acw] can be used. If none is used, the default is w.

examples: -l 10-20-2005a, -g 12-05-2005w, -l 12-20-2005a -g 10-01-2005c
examples: -l 10-20-2005acw, -g 12-05-2005wc
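The rule that the given date itself is NOT included can be shown with a small Python sketch. The names parse_cutoff and passes_date_filter are hypothetical and this is not Diskcat's code; it only demonstrates the strict (exclusive) comparison described above.

```python
import re
from datetime import date, datetime


def parse_cutoff(text):
    """Parse a strict mm-dd-yyyy date (two digit month and day, 4 digit year)."""
    if not re.fullmatch(r"\d{2}-\d{2}-\d{4}", text):
        raise ValueError("date must be in the form mm-dd-yyyy")
    return datetime.strptime(text, "%m-%d-%Y").date()


def passes_date_filter(file_date, older_than=None, younger_than=None):
    """Apply -g (older_than) and -l (younger_than); the cutoff day is excluded."""
    if older_than is not None and file_date >= older_than:
        return False
    if younger_than is not None and file_date <= younger_than:
        return False
    return True


cutoff = parse_cutoff("01-09-2003")
print(passes_date_filter(date(2003, 1, 10), younger_than=cutoff))  # True
print(passes_date_filter(date(2003, 1, 9), younger_than=cutoff))   # False
```

Passing both older_than and younger_than brackets a range of dates, in the same way a -g and -l pair does.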

-L + #    Where the # is replaced by a number indicating: list all files less than # bytes in size. (-L 100000) [LESSTHAN]=100000

-G + #    Where the # is replaced by a number indicating: list all files greater than # bytes in size. You can use a -GL pair to bracket file sizes. (-G 10000) (-G 10000 -L 100000) [GREATER]=10000

-P     Pause after every 20 lines. [PAUSE]=ON

-d + delimiter    Replace “delimiter” with a delimiter (typically a pipe ‘|’) within double quotes with which to delimit fields. If the delimiter is not printable, use its decimal ascii value but don’t place it in quotes. (-d “|”) [DELIMITER]=|

-D + begin,count    (only in versions available after 12/2010) The -D option is used when processing the files for CRC's or MD5 hashes. If you want to process a segment/section of the file, use -D #,# to set the starting byte value and the number of bytes to process. The starting byte number is always counted from 1, not 0. The 2nd part of the option is the actual number of bytes to process. The comma (,) delimiter between the two values is only required if the 2nd value is used. If the 2nd value (number to process) is left off, then the entire rest of the file is processed, beginning at the begin value. (Do not include the comma in this case.) Sample: -D 100,1000 (start at byte 100, and process 1000) or -D 100 (start at byte 100 and process the rest of the file).
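The 1-based begin value is easy to get wrong by one byte, so here is a short Python sketch of hashing a segment the way the -D description reads. The function name hash_segment is hypothetical and MD5 is used only as the example hash; it is an illustration, not Diskcat's code.

```python
import hashlib


def hash_segment(path, begin, count=None):
    """MD5 of count bytes starting at 1-based byte begin (rest of file if None)."""
    if begin < 1:
        raise ValueError("begin is counted from 1, not 0")
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        handle.seek(begin - 1)  # byte 1 is file offset 0
        # For simplicity the segment is read in one call; very large
        # segments would be better read in chunks.
        data = handle.read() if count is None else handle.read(count)
        md5.update(data)
    return md5.hexdigest()
```

So hash_segment(path, 100, 1000) covers file offsets 99 through 1098, and hash_segment(path, 100) covers everything from offset 99 to the end of the file.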

-[Tt][AaCcWw3]     Show the file time as last ‘a’ccessed; last ‘w’ritten; ‘c’reated; or show all ‘3’. If the A, C or W is uppercase, then the milliseconds are added to the filetime. No spaces between the -t and the modifier. ( -tc or -TC or -t3 ) Default is the ‘w’rite time, which is identical to what DIR or Explorer displays. If the T is upper case, then the date, MM/DD/YYYY, is reversed to read YYYY/MM/DD. If the option -T3 is ended with a period (.), (-T3.), then the item is prefaced with a single quote ('), ('YYYY/MM/DD), '2013/01/01. This single quote keeps Excel from interpreting the item as a date and reversing it to MM/DD/YYYY. It eliminates the Excel import step of choosing this field as a text string.

Some software (e.g., X-ways and others) exports/extracts files (usually child objects) without setting a date. Windows then responds with a "blank" date, or a date of 0000-00-00. When attempting to use other software to view these blank dates, it becomes difficult to sort or even find items with a blank "date" field. To fix this, when diskcat finds a blank date field it displays the date 01-01-1601, which is the Windows XP/7 start/epoch date. When this date appears in diskcat output, it means that Windows didn't know or have a date to display, and it is "seeded" here just to have something to refer to. [TIME]=[A|C|W|3], [ALLTIMES]=[ON|OFF]

-z    If using the 32 bit version, display time in ‘Z’ULU GMT format. The letters GMT will be at the end of the output line indicating such. Use GMT to get relative references, especially when dealing with 2 or more time zones. (-z) [ZULU]=[ON|OFF]
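Why a blank date maps naturally to 01-01-1601 can be seen from how Windows stores file times: a FILETIME value counts 100-nanosecond intervals since January 1, 1601 (UTC), so a stored value of zero is that epoch date. The Python sketch below shows the conversion; the function name filetime_to_datetime is hypothetical, and whether Diskcat arrives at the date this way internally is an assumption.

```python
from datetime import datetime, timedelta

# Start of the Windows FILETIME clock.
FILETIME_EPOCH = datetime(1601, 1, 1)


def filetime_to_datetime(filetime):
    """Convert a Windows FILETIME (100 ns ticks since 1601) to a datetime."""
    # Ten ticks per microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


print(filetime_to_datetime(0))                   # 1601-01-01 00:00:00
print(filetime_to_datetime(116444736000000000))  # 1970-01-01 00:00:00
```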

-m     Do not show any file dates or times. This significantly reduces the size of the output record. (-m) [MILITARY]=[ON|OFF]

-A[ehrsmdD]    Show only files with the following attributes: h=Hidden files, r=Readonly, s=system, d=directories only, m=modified, e=encrypted filesystem (NTFS 2K). The [hrsdm] must be entered immediately after the -A without any spaces. The -A is case sensitive. [HIDDEN|READONLY|SYSTEM|ARCHIVE|DIR_ONLY|ENCRYPTED]=[ON|OFF].

The differences between the -d and -D are that if the upper case -D is used, then ONLY directories are listed in the output. If the lower case -d is used, then directories are added to the output file and the -r (recurse) option MUST be used. (This is somewhat different than the way the Mdir program uses the -AD or -Ad options.)

-Y[nsydm]    (32 bit version only) Sort output on ‘n’ame (default), file ‘s’ize, file ‘y’ear, file ‘m’onth, file ‘d’ay. If month sort is chosen, then day is secondary sort by default. Only one sort field can be specified with certainty. Some combinations are possible, but not guaranteed. If the nysdm is upper case, then the order is reversed. [SORT]=[n|s|m|d|y[-]]

-R    RESET the last access time to the original time. This reset is attempted after using an option that opens a file for reading. All files except those LOCKED by the operating system are reset. This same effect can be achieved if an environment variable RESET is set. (set RESET=1). This option is only available on the 32 bit version.

-eE “command %”    See EXEC -e option description below.

NOTE: the file containing the headers has a filesize limitation of 50000 bytes or 500 LINES (including comments), whichever limit is met. This limitation was imposed because occasionally the header files being provided were corrupted and would cause the program to incorrectly execute. The limitation is designed as a safety factor in case the user provides a file which is not compatible with the program.

 

The 'H' options, outlined below, can be very confusing, and can produce somewhat unexpected results. Please check your logic before putting them into production. See the Headers section for some examples and further definitions.

+h + header_filename     Compares items in filename with headers of every file on disk. See description of “file headers” below. Shows the file extension of ALL (EVERY) file on the disk as the program believes the file to be, based on information in the header file provided. This option produces a list of every file on the disk. Download the sample header file. (Note: this operation alters last access time on files.)

The ini setting of CATEGORY=ON can be used to refine the output record to include the user defined CATEGORY of the file. See the format of the header file below.

+H + header_filename   Compares items in filename with headers of every file on disk. See description of “file headers” below. If the file type matches one of the header types (i.e., is a file of that type) then the program outputs that file's information. This option outputs ONLY those files whose headers match those you supplied in the reference file. Use this option to selectively find specific file types for additional processing. (Note: this operation alters last access time on files.)

-h + header_filename   Similar to +h option. The program attempts to determine the file type of each file. It outputs a record for every file, but fills the file type field ONLY if the extension does not match those in the list supplied. All files whose extension match the file type are listed with a blank in this extension field. To find mismatched files, simply look in the extension field for data. (Note: this operation alters last access time on files.)

The header_file should contain as many headers as the user has available. The more headers provided, the better the chance of determining the file type. Contact Mares and Company for file headers. The program can only identify those headers that the user has supplied. So be careful and make your list as accurate as possible. Different header files can be used depending on the type of files searched for.

-H + header_filename    This is probably the hardest to understand and design for. The file types are checked against the header file list. ONLY those files whose extension is mismatched are output. Use this to select ONLY those mismatched files. This should give the smallest output if the header file is complete and accurate.
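Since the four header options are easy to mix up, the Python sketch below restates them as one decision function. It is only one reading of the descriptions above, not Diskcat's code: the names SIGNATURES, detect_type and header_mode_output are hypothetical, the two signatures are well known magic numbers used purely as examples (the real header file format is described in the Headers section), and the handling of files with an unrecognized header is an assumption.

```python
# Example signature table (hypothetical stand-in for the user's header file).
SIGNATURES = {b"PK\x03\x04": "zip", b"%PDF": "pdf"}


def detect_type(first_bytes, signatures=SIGNATURES):
    """Return the type whose signature starts first_bytes, or None."""
    for magic, file_type in signatures.items():
        if first_bytes.startswith(magic):
            return file_type
    return None


def header_mode_output(mode, detected_type, extension):
    """Return (listed, type_field) for one file under the given option."""
    matched = detected_type is not None
    mismatch = matched and detected_type.lower() != extension.lower()
    if mode == "+h":   # every file, with the type the header suggests
        return True, detected_type or ""
    if mode == "+H":   # only files whose header matches a supplied header
        return matched, detected_type or ""
    if mode == "-h":   # every file, type field filled only on a mismatch
        return True, detected_type if mismatch else ""
    if mode == "-H":   # only files whose extension does not match
        return mismatch, detected_type if mismatch else ""
    raise ValueError(f"unknown header option: {mode}")
```

For a zip archive renamed to report.doc, header_mode_output("-H", "zip", "doc") returns (True, "zip"), while the same archive named report.zip is not listed under -H at all.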

 

-i    Use the automatic label numbering procedures, and create/modify the file called DISKLABL. The numbering is designed to start at 1000. If you want it to start at 1001, then initialize the file DISKLABL to 1000.

-I + label    The disk_label can be up to 8 characters which will be prepended to the path.

--sequence[=nnn]    Number each output record with a unique sequence number. If the =nnn is used, then the output sequence begins at this number. This is a good way to uniquely number each record for future identification. The sequence number is the first field of the record. It is ALWAYS a 6 character field.

-8:  Add the DOS 8.3 filename to the end of the record. The name is translated to the 8~3 format if needed.

-88:  Add the uppercase Long File Name to the end of the record. This option strips the LFN from the path listing of the first field, and places only the LFN at the end of the record. The default length is a 16 character field. (Note: the -8 and -88 options are mutually exclusive. Use one or the other).

-88xx[Ee]:  Replace the xx with a value. This value determines how wide the Long File Name field will be. Use this to reduce the size from 75 to some other value. The default length for -88 is 16 characters. If either the 'E' or 'e' is added, then a 6 character field with the extension is added.

Use of the -88..: format also adds an additional field of the true 8~3 filename.

Use of the upper case 'E' will cause the filename field to contain only the filename up to and NOT including the dot extension. This is to be compatible with some of the extracts from FTK and X-Ways when the filename field is extracted. (ie: MYFILE.DOT is listed as MYFILE) The 6 character extension field is still included.

Use of the lower case 'e' will cause the filename field to contain the full filename which includes the extension. This is to be compatible with some of the extracts from FTK and X-Ways when the filename field is extracted. (ie: MYFILE.DOT is listed as MYFILE.DOT) The 6 character extension field is still included.

--ext:  If the -88xxE is not used to include the extension field, then use this if ONLY the extension field is needed. (This is mutually exclusive with the -88xxE option)

--driveletter=X:  (12/2009) When a remote drive is mapped, the drive letter is often assigned a high drive letter, say H: I: J:, etc. This is the drive letter that shows up in the output file in the pathname field. However, this mounted drive is really the C: or D: drive of another computer. So, in order not to confuse a reviewer as to the drive letter the file actually resides on, the user may force this drive letter designation to any drive letter with this option. Replace the X in the syntax: --driveletter=X with the appropriate correct drive letter, C,D,E, etc., and the output record will properly reflect this correction. Remember that you have done this.

--ziplog:  When a zip file is encountered, check its internal directory/contents and add these records to the output file listing. The zip files are identified by the PK header. Because the files must be opened to read the contents and check to see if they are zip files, the -R (reset) option is always set with this option, and can't be turned off. The directory contents of the zip file are included amongst the normal output records. Since a significant amount of the normal file processing may not be conducted on the contents (zipped files) of the zip file, many of the output fields with this option are left empty. For instance, zip file contents do not maintain create or access dates, so those columns are left blank. Hashing and CRC are not done, and header checks on the contents are not allowed. (09/2007) [ZIPLOG]

--ziplog=ziplogfilename:  Same as --ziplog except that if the =ziplogfilename is added, the contents of the zip files are placed in a separate ziplogfilename file, and not intermixed with the normal output. (09/2007) [ZIPLOG=ziplogfilename]
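
As a rough illustration of what a --ziplog style listing involves (this is not Diskcat's own code), the Python sketch below checks for the PK signature and then reads the zip's internal directory with the standard zipfile module. The function names looks_like_zip and zip_directory are made up for this example. The zip directory exposes a single timestamp per entry, which fits with the create and access columns being left blank.

```python
import io
import zipfile

def looks_like_zip(data: bytes) -> bool:
    """Return True if the data begins with the 'PK' signature."""
    return data[:2] == b"PK"

def zip_directory(data: bytes):
    """Return (name, uncompressed size, modified time) for each zip entry."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [(info.filename, info.file_size, info.date_time)
                for info in archive.infolist()]

# Build a small zip in memory so the sketch is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docs/readme.txt", "hello")
    archive.writestr("run.bat", "@echo off")
sample = buffer.getvalue()

if looks_like_zip(sample):
    for name, size, modified in zip_directory(sample):
        print(name, size, modified)
```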

-5:  Add an MD5 hash field. Same as the INI file setting HASH=ON.

--SHA:  Add a SHA-1 hash field.

-c    Create a CRC32 checksum for each file and append it at the end of the record.

--NOCHILD: Diskcat64 version only. When using the X-Ways "recover copy" option to extract files, you may inadvertently check the "copy child objects" box. This adds a folder with an unusual character in the folder name, and places the child objects of the file within that folder. You may later realize that the child objects are not needed, but you can't manually remove thousands of child objects. This option finds the folder containing the children and removes it. If you change the X-Ways include child objects option to "_childobjects", the directory name will contain "_childobjects", and this program will find those directories by name and remove them.
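
If you want to cross-check the MD5 (-5), SHA-1 (--SHA), or CRC32 (-c) value reported for a file, the following Python sketch computes all three in one pass. It is an independent illustration using Python's standard library, not Diskcat's implementation, and the function name digests and the 8 digit uppercase CRC formatting are assumptions; Diskcat's own field layout may differ.

```python
import hashlib
import io
import zlib

def digests(stream, chunk_size=65536):
    """Return (md5 hex, sha1 hex, crc32 hex) for a binary stream."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    # Read in chunks so large files do not have to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
        sha1.update(chunk)
        crc = zlib.crc32(chunk, crc)
    return md5.hexdigest(), sha1.hexdigest(), format(crc & 0xFFFFFFFF, "08X")

# For a real file use: with open(path, "rb") as handle: digests(handle)
print(digests(io.BytesIO(b"abc")))
```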


INI Settings:

Following is a sample diskcat.ini file with most, if not all, the appropriate keywords that diskcat will recognize.

The INI settings that can only be set from the ini file are:
CATEGORY=ON    ; This installs the category column from the header file
SPLIT=xxx      ; Set output file record counts to xxx maximum records per file. (ie: SPLIT=30000) Use this when intending to import the output to a spreadsheet with a maximum record limit.
HASH=ON        ; Turns md5 hashing on for each file. MD5 value is placed before time fields.

The file is shown as all comments, so you can cut and paste from here.

CATEGORY=ON
;CATEGORY is only available in the .ini file.
SPLIT=xxx
;SPLIT is only available in the .ini file.
HASH=ON ;Turn on md5 hashing
RECURSE=OFF
files=*.exe
paths=d:\work
output=d:\tmp\junk
older=15
younger=180
lessthan=10000
greater=1000
width=45
delimeter=|
military=on
time=c
alltimes=ON
zulu=ON
stream=OFF
archive=ON
readonly=on
hidden=ON
system=ON
DIR_ONLY=on
directory=on
CRC=on
FIXED=ON
label=labelname
OWNER=ON
SORT=s
verbose=[on|off] (turns on the -v option)
ziplog
ziplog=ziplogfile.txt
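
To check an ini file before a long run, a few lines of Python can read it back. This is only a sketch under stated assumptions: that each line is keyword=value, that a semicolon starts a comment, that keywords are not case sensitive (the sample mixes upper and lower case), and that a bare keyword such as ziplog simply means "present". The function name parse_ini is made up, and a value that itself contains a semicolon would be cut short by this simple approach.

```python
def parse_ini(text):
    """Parse keyword=value lines into a dict with lowercased keywords."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line:
            continue
        key, sep, value = line.partition("=")
        # A bare keyword (no '=') is stored with an empty value.
        settings[key.strip().lower()] = value.strip() if sep else ""
    return settings

sample = r"""CATEGORY=ON
;CATEGORY is only available in the .ini file.
HASH=ON ;Turn on md5 hashing
files=*.exe
paths=d:\work
delimeter=|
ziplog
"""

for keyword, value in parse_ini(sample).items():
    print(keyword, "=", value)
```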



File Headers

The [+-][hH] + filename option allows you to provide, in an external text file, a list of standard extensions of files (exs., exe, wp, dbf, gif, etc) and the string of characters that should be found in the header of the target file, if, in fact, that target file is of the type referenced by the extension.

For instance: a program .exe file should have as its first two characters in the file an MZ; a pkzipped file should have a PK as part of the file header.

Setting up the reference.fle

The text file containing the reference extensions and headers will be referred to here as "reference.fle." This file should be set up in the following manner:

Use one line for each file type indicated. The entries are case dependent.

The reference.fle should be created with an ascii text editor. No word processor formats are recognized. AFTER THE LAST LINE, AT LEAST ONE BLANK LINE SHOULD BE ENTERED. Maximum of 100 lines/file types to test for.

The lines consist of 3 or 4 parts. Each must be in the correct format and location for the program to work.

part 1: the category you wish to place this header in. ie: it could be DOCUMENT, PROGRAM, GRAPHIC, SPREADSHEET, or any category word you wish. This is strictly user defined. This text will be placed in the output record, if the CATEGORY=ON trigger is included in a diskcat.ini file.

part 1A: a comma , follows each part.

part 2: The "TRUE" expected extension you expect to see on the file (ex., exe wp gif). No leading period is allowed.

Part 2A: (optional section) a colon (:) followed by a number. (SEE NOTE BELOW).

part 3: a comma (,). This will separate part 2 from part 4.

part 4: header string.

If the first character of the line is a # (pound sign) or a ; (semi colon) this line is completely ignored and is considered a comment.

#exe,MZ    This is a comment line.

NOTE: If the expected header signature (ex., Pklite) is located at some position other than the 1st position of the file, then add a colon (:) followed by the byte location (displacement) into the file where the header signature is expected to be found. An example for a 16 bit self extracting PKZIP file would be (zip:66). The same self extracting zip file created under WINZIP32 commercial version would be (zip:136)

COMPRESSED,ZIP:136,XD39128360000000000000000E0000E010B01041400  

This is the signature for that WINZIP32 bit self extracting executable.

The header string consists of the string of characters that should be looked for to determine if the file in question is the type of file referenced in part 2. Since this string is taken as a literal, it should not have any spaces anywhere within it except those spaces that should be considered as an actual part of the file header.

If you wish, this header string can be a hex value. In this case it must begin with an ‘X’, and the hex values must be each 2 characters wide. Use this if you cannot easily input the values with an ascii editor. Ascii header strings, and hex headers strings can be used on different lines in the same file.
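
The line format described above can be sketched as a small Python parser. This is an illustration only, not the program's own logic. The names Signature and parse_reference_line are made up, and the rule used here to tell a hex header from a literal one (a leading X followed by an even number of hex digits) is an assumption. The number after the colon is kept exactly as written, without deciding whether it counts from 0 or 1. A line with fewer than three comma separated parts raises ValueError.

```python
import string
from typing import NamedTuple, Optional

class Signature(NamedTuple):
    category: str
    extension: str
    offset: Optional[int]  # number after ':' exactly as written, or None
    header: bytes

def parse_reference_line(line: str) -> Optional[Signature]:
    """Parse 'CATEGORY,ext[:offset],header'; return None for comments/blanks."""
    line = line.rstrip("\r\n")
    if not line.strip() or line[0] in "#;":
        return None
    category, ext_part, header_text = line.split(",", 2)
    extension, _, offset_text = ext_part.partition(":")
    digits = header_text[1:]
    is_hex = (header_text.startswith("X") and bool(digits)
              and len(digits) % 2 == 0
              and all(ch in string.hexdigits for ch in digits))
    header = bytes.fromhex(digits) if is_hex else header_text.encode("ascii")
    offset = int(offset_text) if offset_text else None
    return Signature(category, extension, offset, header)

for text in ["PROGRAM,exe,X4D5A", "PROGRAM,bat,@echo",
             "GRAPHIC,gif,X47494638", "#exe,MZ    This is a comment line."]:
    print(parse_reference_line(text))
```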

Below is a sample header file. Notice that the first line is in a different format (as described above).

Sample reference file:

COMPRESSED,ZIP:136,XD39128360000000000000000E0000E010B01041400
zip,X504B
PROGRAM,exe,X4D5A
ENCRYPTION,pgp,X84
PROGRAM,com,XE8
PROGRAM,bat,@echo
PROGRAM,bat,set
PROGRAM,bat,SET
GRAPHIC,gif,X47494638
GRAPHIC,jpg,XFFD8FFE0
GRAPHIC,pcx,X0A050101

Notice that the compressed zip header (1st line) was placed before the exe header. This is because, had the exe header come first, the program would have indicated an exe file and would have never gotten to the self extracting zip header. Notice also that the category for that file was COMPRESSED, rather than PROGRAM. This is so that in the output, it will be evident that it is a zip file, not a true executable.

Because the header list is checked in the order it is found in the header file, you should place the most restrictive file types first in the header file. An EXE file should have an MZ as its header. Let's take a case where another type of file had a header of MZH. If the EXE,MZ line came first in the header file, then the MZH file would produce an incorrect output. So put the MZH line first in the header file. This becomes important with files containing possible database headers like DB or DBASE.
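
The effect of the ordering can be seen in a short Python sketch of a first match wins lookup. It is not Diskcat's code. The "mzh" entry is the hypothetical MZH type from the paragraph above, the names SIGNATURES and identify are made up, and offsets are assumed to count from 0.

```python
# (category, extension, offset, header bytes); most restrictive first.
SIGNATURES = [
    ("OTHER", "mzh", 0, b"MZH"),   # hypothetical type from the MZH example
    ("PROGRAM", "exe", 0, b"MZ"),
    ("COMPRESSED", "zip", 0, b"PK"),
]

def identify(data, signatures):
    """Return (category, extension) of the first signature that matches."""
    for category, extension, offset, header in signatures:
        if data[offset:offset + len(header)] == header:
            return category, extension
    return None

mzh_file = b"MZH...rest of file"
print(identify(mzh_file, SIGNATURES))           # MZH line is tested first

# With the exe line first, the same file is reported as an exe.
wrong_order = [SIGNATURES[1], SIGNATURES[0], SIGNATURES[2]]
print(identify(mzh_file, wrong_order))
```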

If it is not a correct extension, the program prints as the 1st three characters of the output the reference extension found in the reference.fle thus indicating what the extension SHOULD have been.

Here is an output without using any of the header options. It just shows what files are there. The .uni files are true Microsoft unicode files. All others are true as shown, except the .exz file, which is really a misnamed executable.

D:\TMP\junk.uni             2000 ..R..
D:\TMP\lesson.uni             30 ..R..
D:\TMP\COKE_ALL.jpg        20130 A....
D:\TMP\COKE_2.jpg          15203 A....
D:\TMP\CLEANUP.BAT            98 A....
D:\TMP\DISKCAT.EXE        135368 A....
D:\TMP\HEADERS.HEX           128 A....
D:\TMP\OUTPUT                  0 A....
D:\TMP\diskcat.exz        135368 A....

SAMPLE OUTPUTS for reference file above:

(1) Same output using this command line: diskcat -h headers.hex (list EVERY file, but only SHOW true headers of those with mismatched names). Notice that the .uni and .hex extensions are reported as unknown (UNK) because they are not listed in the reference header file.

D:\TMP\junk.uni             2000 ..R..  UNK
D:\TMP\lesson.uni             30 ..R..  UNK
D:\TMP\COKE_ALL.jpg        20130 A....
D:\TMP\COKE_2.jpg          15203 A....
D:\TMP\CLEANUP.BAT            98 A....
D:\TMP\DISKCAT.EXE        135368 A....
D:\TMP\HEADERS.HEX           128 A....  UNK
D:\TMP\diskcat.exz        135368 A....  exe

(2) Same run but with the command line: diskcat -H headers.hex (ONLY MISmatches are output.) This run is based solely on the list in the header reference file. So, since the .uni and .hex files are not even listed as a valid header, they are not checked. However, the exe header is listed in the reference file, and a misnamed file was found, so it was listed.

D:\TMP\diskcat.exz        135368 A....  exe  24F9-7921

(3) Same run with the +h headers.hex. Show extensions of EVERY file. If the header is not listed, it is displayed as an UNK(nown). This would probably be the default run for any catalog list. Then sort on the field containing the type of file so you have a neat list sorted by file type.

D:\TMP\junk.uni             2000 ..R..  ASC
D:\TMP\lesson.uni             30 ..R..  UNK
D:\TMP\COKE_ALL.jpg        20130 A....  jpg
D:\TMP\COKE_2.jpg          15203 A....  jpg
D:\TMP\CLEANUP.BAT            98 A....  bat
D:\TMP\DISKCAT.EXE        135368 A....  exe
D:\TMP\HEADERS.HEX           128 A....  ASC
D:\TMP\diskcat.exz        135368 A....  exe

(4) Same run using the final +H option: diskcat +H headers.hex (ONLY show those files whose header is matched in the list.) This option is good to identify specific file types on the drive. You might have a header list of only graphic headers, so the list will only show graphic files. Notice that only those file types for which there was a known signature in the header file were output. NONE of the UNKnown types were listed.

D:\TMP\COKE_ALL.jpg        20130 A....  jpg
D:\TMP\COKE_2.jpg          15203 A....  jpg
D:\TMP\CLEANUP.BAT            98 A....  bat
D:\TMP\DISKCAT.EXE        135368 A....  exe
D:\TMP\diskcat.exz        135368 A....  exe

          *** ALL -hH options alter last access time. ***

IF the CATEGORY=ON is found in the diskcat.ini file, then the additional category field is included.

D:\TMP\junk.uni             2000 ..R..  ASC      TEXT
D:\TMP\lesson.uni             30 ..R..  UNK      UNKNOWN
D:\TMP\COKE_ALL.jpg        20130 A....  jpg      GRAPHIC
D:\TMP\COKE_2.jpg          15203 A....  jpg      GRAPHIC
D:\TMP\CLEANUP.BAT            98 A....  bat      PROGRAM
D:\TMP\DISKCAT.EXE        135368 A....  exe      PROGRAM
D:\TMP\HEADERS.HEX           128 A....  ASC      TEXT
D:\TMP\diskcat.exz        135368 A....  exe      PROGRAM

EXEC (-e) OPTION

The exec enhancement uses a command line option to execute either a DOS internal command (exs., copy, del, dir) or a program. The term 'command' will be used in the following discussion to mean both program and DOS command.

The -e or exec option is most effective when used in conjunction with options that can identify certain selected files to perform the command on. It works in a similar fashion to the -f option. As described above, the -f option locates, on the disk, those files which meet certain filename criteria (ex., *.bat).

When a file is located (under whatever option is ultimately used) the filename is passed to the command requested by the exec option. An example would be to use the type command to look at all the *.bat files on the entire disk. Or to do a dir on all the directories located in a specific path or a dir on all the files over a certain number of days old.

The format of the exec option is as follows:

-e  "command %"

The -e (e)xec option is used to execute "command" on the file(s) '%' found.

The actual syntax is:

-e  "command  [arguments] % [arguments]"

Where:

    the -e is the actual option. If a lower case -e is used, then the entire filename including path is substituted for the %. If an uppercase -E is used, then only the filename is substituted for the %. The quotes around the rest of the option syntax are mandatory. This is so DOS will hold the entire item and pass it as one string to the program.

   command is actually replaced by the command you wish to run.

   the arguments:  are any additional filenames or options needed for the command chosen, and the % is positionally placed at the location where you want the program to place the name of the file it finds. The % is positionally sensitive and should be placed in the exact location where the selected file would have been placed in the chosen command.

For example: A command to do 'dir' on all '.bat' files in 'c:\sample' path would look like this:

diskcat  -f  *.bat  -p  c:\sample  -e  "dir %"

Notice the retention of the quotes (").

For example: A command to zip and add to an output.zip file all *.bat files and maintain their appropriate path would be:

diskcat  -f *.bat  -e  "pkzip  -ap output.zip %"
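
The % substitution can be pictured with a few lines of Python. This is a sketch of the idea only, not how Diskcat builds the command internally. The function name build_command is made up, and replacing every % in the string (rather than only the first) is an assumption.

```python
import ntpath

def build_command(template, found_path, name_only=False):
    """Substitute the found file for '%' in the command template.

    name_only=False mimics lower case -e (full path and filename);
    name_only=True mimics upper case -E (filename only).
    """
    target = ntpath.basename(found_path) if name_only else found_path
    return template.replace("%", target)

found = r"c:\sample\cleanup.bat"
print(build_command("dir %", found))
print(build_command("pkzip  -ap output.zip %", found, name_only=True))
```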

NOTE: If the command used is NOT a DOS internal command and is instead a program, the program SHOULD be a .exe executable and reside on a subst drive letter of x: This is because Diskcat normally ONLY looks on drive x: for .exe programs to run. If it cannot find the program there, it assumes the command is a DOS internal and attempts to run it as one. In some instances it will run programs located in the DOS path. If you are attempting to run one of these, try it first to see if it will operate correctly. You might also try entering the program name as a complete path and name with the proper extension (.com .bat). Completely pathing the program name may provide more reliable results. (ie.:  diskcat -f *.bat -e "d:\work\run.bat  %")

SPECIAL ZIP CAPABILITY

This section deals with a special implementation of the -e execute command when you have zip files located in directories, and wish to extract ALL the files located in the zip files in the correct locations. The zip files could have been placed there by the upcopy command, or FTK, or any other program to move zip files to a specific location.

The user MUST have access to a command line version of pkzip. The current version I have identified as pkzip32, indicating it is a full 32 bit long file name version.

The addition to the -e command is an upper case P directly after the -e, indicating that a PATH is to be inserted somewhere in the command line. This is needed for PKZIP to know where to run the command from.

After the -eP, you use a similar syntax to the basic -e option, except you add a cd command, for change directory. And you put a placeholder -PATH in the command line where you want the program to insert the path to use. This is sort of a wildcard replacement.

The last item is to provide the correct command line syntax for the OS to change to the -PATH directory, && (and) execute the pkzip program. The full command is below, and the syntax should be followed exactly. You can modify the specific pkzip options, but those listed should extract all contents, in appropriate folders.

This is the command line:
C:>diskcat -f *.zip -eP "cd -PATH && pkzip32 -extract -directories -recurse % -overwrite"

The -eP says we are going to use a path to change to.
The cd -PATH is the trigger to tell the program to perform that cd operation.
The % is the usual replacement of the filename, which will be a zip filename.
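
As a rough picture of the two replacements, the Python sketch below fills in -PATH and % for one zip file. It is not the program's actual code. The function name build_path_command is made up, and it assumes that -PATH becomes the directory holding the zip file and that % becomes the full path and filename, as with the lower case -e.

```python
import ntpath
import re

def build_path_command(template, zip_path):
    """Replace -PATH with the zip's directory and % with the zip file."""
    directory = ntpath.dirname(zip_path)

    def substitute(match):
        return directory if match.group(0) == "-PATH" else zip_path

    # One pass over the template, so inserted text is never re-scanned.
    return re.sub(r"-PATH|%", substitute, template)

template = "cd -PATH && pkzip32 -extract -directories -recurse % -overwrite"
print(build_path_command(template, r"d:\work\case1\files.zip"))
```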



Command Lines

diskcat
/*lists all files on default drive to screen*/

diskcat -?
/* obtain help screen */

diskcat  -o outputfile
/*lists all files to output file called outputfile */

diskcat  -a  -o outputfile
/*append output to existing output file */

diskcat  -O outputfile
/*append output to existing output file */

diskcat  -O outputfile -w0
/*add a variable full path to the end of the record, and filename at beginning*/

diskcat  -O YYYYMMDD:filename.txt
/* cause output file to be named with current year, month, and day, and filename.txt as a name */

diskcat  -O outputfile --nameafter
/*move the path/filename field to the end of the record. This will be inserted before any -w0 paths. */

diskcat  -p d:\work\
/* start search at this directory */

diskcat  -p d:\work\ -v -o outputfile.txt
/* start search at this directory, create output named outputfile.txt without (-v)erbose header/footer */

diskcat  -o output -p a:\ -I 1001
/* create a label of 1001 and place the output to output */

diskcat  -O output -p a:\ -I 1001
/* this will append */

diskcat  -i -p a:\ -O d:junk
/* create automatic label of a: with automatic append */

diskcat -p a: +h headers.hex
/* check drive a:, and compare headers in headers.hex */

diskcat -p c:\work --levels=2 -AD
/* Catalog ONLY directories within the c:\work directory, and recurse only two levels down from the starting location c:\work. So c:\work\one\two will list, but c:\work\one\two\three will not list.*/
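
The effect of --levels=2 combined with a directories only listing can be sketched in Python as a depth limited walk. This is an illustration of the behavior described in the comment above, not Diskcat's code. The function name list_directories is made up, and leaving the starting directory itself out of the list is an assumption.

```python
import os
import tempfile

def list_directories(start, levels):
    """Return directories found at most `levels` levels below `start`."""
    found = []

    def visit(path, depth):
        if depth >= levels:
            return  # do not look any deeper than the limit
        with os.scandir(path) as entries:
            for entry in sorted(entries, key=lambda item: item.name):
                if entry.is_dir(follow_symlinks=False):
                    found.append(entry.path)
                    visit(entry.path, depth + 1)

    visit(start, 0)
    return found

# Build work\one\two\three in a temporary folder and list two levels.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "one", "two", "three"))
    listed = [os.path.relpath(path, root) for path in list_directories(root, 2)]

print(listed)  # 'one' and 'one/two' (or 'one\two') appear; 'three' does not
```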


Script to total file extensions using diskcat.

Often it becomes necessary to find out how many files of XXX extension there are. People want to know how many docs, xls, etc. If you have the complete Maresware suite and an external program called otsort, you may be able to use this script with some small modifications.

Coming soon.
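
In the meantime, the same kind of total can be produced with a few lines of Python once you have the path/filename column from a diskcat output. This is NOT the Maresware/otsort script mentioned above, only an independent sketch. The function name count_extensions is made up, and it assumes you have already pulled the paths out of the output records into a list.

```python
from collections import Counter
import ntpath

def count_extensions(paths):
    """Count files per extension (case ignored); '(none)' if no extension."""
    counts = Counter()
    for path in paths:
        extension = ntpath.splitext(path)[1].lstrip(".").lower()
        counts[extension or "(none)"] += 1
    return counts

paths = [
    r"D:\TMP\junk.uni", r"D:\TMP\lesson.uni", r"D:\TMP\COKE_ALL.jpg",
    r"D:\TMP\COKE_2.jpg", r"D:\TMP\CLEANUP.BAT", r"D:\TMP\DISKCAT.EXE",
    r"D:\TMP\HEADERS.HEX", r"D:\TMP\OUTPUT", r"D:\TMP\diskcat.exz",
]

for extension, total in sorted(count_extensions(paths).items()):
    print(extension, total)
```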


Related Programs

Crckit

Hash

Hashcmp
