Bates_no



Author: Dan Mares, dmares @ maresware . com
This program also available for Linux (intel) platforms.
Portions Copyright © (2001, 2014) Dan Mares, and Mares and Company, LLC,
Phone/fax: (770)237-8870 X 119
Last update: 20140221

This is a command line program.
MUST be run within a command window as administrator.
DO NOT attempt to run this from the RUN menu.
There is also a Linux version with minimal options.
ALSO: remember that paths containing spaces "need to be quoted" when you enter them.



Purpose

The Bates_no program allows you to implement a file re-naming process similar to the Bates numbering system to assist attorneys and those who need to create unique filenames to associate with documents relating to a particular case. The program uses the idea of the Bates numbering system to fulfill this requirement.

This program allows the user to efficiently create a Bates numbering system within the logical file system on a computer. It will modify filenames or filename extensions to contain a unique Bates number.

Bates_no can also copy a number of files from a suspect location to a work directory. (This copy operation is similar to that performed by the Upcopy program except that here we are adding the Bates number.) At the same time it performs this copy operation, it renames (and numbers) the files to correspond to a Bates numbering system. This renaming procedure also keeps duplicate files from overwriting each other since the Bates numbers cause a unique name to be generated.

Once the files are renamed, a catalog can be compiled of the new file names (or with the use of the -o option, this catalog is automated) and be printed or used in the legal discovery process.

There are a number of options which will allow many different copy/rename operations. Check out the options.



Operation

The program can accomplish one of two tasks: Bates rename, or copy and Bates rename.

The user supplies, at a minimum, a starting drive or directory (folder) with which to begin, and the Bates number template for the numbers which are to be used in the operation. Other options are available to fine tune the selection criteria for the files which are to be targeted. Since the program defaults to renaming ALL files it finds, it is suggested that only those files to be renamed exist in the path provided (-p option) as a starting point.

Format of the AnAnANNNN Alphanumeric Bates number template/mask.

The template is of the alphanumeric format: AnAnANNNN, where the AnAnA is the alpha root of the Bates numbers, and the NNNN is a starting number. The size (length) of the alpha or numeric section is arbitrary, and can be up to 16 characters. However, there can be NO spaces in the template. A sample template might be: DJM0000. It is suggested that the Alpha part of the number be unique enough so as not to conflict with any existing filename extensions or prior numbering passes. If files are found with extensions or names that are identical to the alpha section of the Bates number, the file may not be processed.

Suppose we have the mask ABC32D000. The alpha portion (ABC32D) of the template can contain both alpha and numeric characters. However, the following requirements exist: the alpha portion must begin with an alpha character and must end with an alpha character (e.g., ABC32D). If numbers (32) are to be used within the alpha part of the mask, they must be surrounded by alpha characters. The reason for this is that the sequencing of the number section of the Bates number is triggered by the last alpha character of the mask. When the last alpha character (D) is found, that determines the start of the sequencing (000).

The "minimum" sequence number length is determined by the width of the number part of the mask. If 000 is used, then the minimum number width is 3, so all sequence numbers will be 00X, etc. If the mask was only 0, then the sequence numbers in the Bates number would increase in width as the numbers grew to 10 and 100. For sorting and other fixed width uses, it is suggested that a fixed width of at least 3 (000) is always used.

The number found in the mask (000, etc.) is the starting number that will be used. So if you have previously used 000-100, then start the next set at 101 (ABC101).
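The mask rules above can be illustrated with a short Python sketch. This is not the program's own code: the function names split_template and bates_number are hypothetical, and the assumption that the first number issued equals the number in the mask follows the sample outputs shown elsewhere in this document.

```python
import re

def split_template(mask):
    """Split a Bates mask into (alpha_root, start_number, min_width)."""
    if " " in mask:
        raise ValueError("the template may not contain spaces")
    # Everything up to the last non-digit is treated as the root; the
    # trailing run of digits is the starting sequence number.
    match = re.fullmatch(r"(.*\D)(\d+)", mask)
    if match is None or not match.group(1)[0].isalpha():
        raise ValueError(f"not a usable Bates template: {mask!r}")
    root, digits = match.groups()
    return root, int(digits), len(digits)

def bates_number(root, number, width):
    """Append the number to the root, zero-padded to at least `width` digits."""
    return f"{root}{number:0{width}d}"

root, start, width = split_template("ABC32D000")
for n in range(start, start + 3):
    print(bates_number(root, n, width))  # ABC32D000, ABC32D001, ABC32D002
```

With a one-digit mask such as X0, the same formatting yields X9, X10, X100 as the count grows, which is the widening behavior the text warns about.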

The program begins searching the designated source location (drive or subdirectory/folder) for files. As it locates files it either renames or copies each file by first incrementing the numeric portion of the template and then inserting the template within the new filename.

Under the RENAME operation

The template is inserted before the extension of each file. (If the file has no extension to begin with, then the Bates number becomes the extension.) The Bates number is "pre-pended" (put before) the name if the -P option is used. (See the -P option.) Because the numeric portion is incremented each time, a unique number results for each file. Coupled with the fact that the file now has a new name, this adds to the uniqueness. A file with the name:
D:\first_dir\second_dir\etc\filename.ext
would be renamed to:
D:\first_dir\second_dir\etc\filename.DJM000.ext

This results in totally unique filenames, identifiable by their Bates number.
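As an illustration of the naming rule, here is a minimal Python sketch. It is an approximation written for this document, not the program's implementation; the function name bates_name and its prepend parameter are hypothetical, and the prepended form follows the example given for the -P option under Options.

```python
import ntpath  # Windows-style path handling, usable on any platform

def bates_name(path, bates, prepend=False):
    """Return the path with the Bates number worked into the filename."""
    directory, name = ntpath.split(path)
    stem, ext = ntpath.splitext(name)
    if prepend:
        new_name = f"{bates}.{name}"        # BATES100.FILENAME.EXT
    elif ext:
        new_name = f"{stem}.{bates}{ext}"   # FILENAME.BATES100.EXT
    else:
        new_name = f"{name}.{bates}"        # no extension: number becomes it
    return ntpath.join(directory, new_name)

print(bates_name(r"D:\first_dir\second_dir\etc\filename.ext", "DJM000"))
```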

The only restriction to the numbering template is that if the program is run more than once on a file system, subsequent templates must have a unique (different) alpha root. This is because of internal checks which have to be made to guarantee that a file doesn't get renamed more than once per session.

When the process is finished, a catalog can be run of the entire file system to obtain the new file names. Then put this into a searchable file. Or, you can use the -o option which will create on the fly a catalog of all the new file names.

Successful tests have been conducted on a tree of 85,000 files. The program took about 30 minutes to find and rename all the files.

Under the COPY and RENAME operation

The user would provide a destination path on the command line using the -d (destination) option. This destination path is used as a copy destination of all the files found. Be careful. If you point the program at a root of a drive for the source location, your destination path will be a single directory containing copies of every file on the source drive. This might choke some programs and operating systems. This copy operation is similar to the Upcopy program operation in every way, except this program includes the Bates number in the new filename.

Before the copy operation takes place, the template is inserted between the filename and the extension. The same type of Bates filename is produced as the renumber operation mentioned earlier. Because the numeric portion is incremented each time, a unique number is produced for each file even if the source drive contained similar file names in different directories (which is not unusual). The destination file now has a new name in the destination directory (provided by the user) and a filename consisting of the original filename, coupled with the Bates number inserted.
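The copy-and-rename behavior can be approximated in a few lines of Python. This is only a sketch of the idea under stated assumptions: the function name copy_with_bates, its defaults, and the sorted traversal order are all hypothetical, and the real program may visit files in a different order.

```python
import os
import shutil

def copy_with_bates(source_dir, dest_dir, root="BATES", start=0, width=4):
    """Copy every file under source_dir into the single directory dest_dir,
    inserting a sequential number before each extension."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    number = start
    # dest_dir should sit outside source_dir, or the copies get walked too.
    for folder, subdirs, files in os.walk(source_dir):
        subdirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            stem, ext = os.path.splitext(name)
            bates = f"{root}{number:0{width}d}"
            target = os.path.join(dest_dir, f"{stem}.{bates}{ext}")
            shutil.copy2(os.path.join(folder, name), target)
            copied.append((bates, target))
            number += 1
    return copied
```

Two files named report.doc in different source folders end up side by side in the destination as report.BATES0000.doc and report.BATES0001.doc, so neither overwrites the other.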

NOTE: the two operations are mutually exclusive. You can't renumber the files in place AND copy them (-d option). If you wanted to rename them in place, AND copy them, use the Bates_no program first to rename them in place, then use the Upcopy program to copy them.

Spreadsheets

Once the Bates numbering has been completed most users perform some sort of catalog listing of the newly created subdirectory (folder). The catalog can be created by using the Diskcat command, or by using the -o option (see below). This helps to confirm the new names and generates a well formatted output which can be easily imported into a spreadsheet program for further manipulation. Below is a sample file created when using the -o output option.


BATES0000|A:\TIMEREST.BATES0000.obj
BATES0001|A:\PROGRAM.BATES0001.obj
BATES0002|A:\OPTIONS.BATES0002.obj
BATES0003|A:\INIT_LIB.BATES0003.obj
BATES0004|A:\FIXNAME.BATES0004.obj
BATES0005|A:\EBC_ASC.BATES0005.obj
BATES0006|A:\BASE.BATES0006.obj
BATES0007|A:\ACCTING.BATES0007.obj

Imagine importing this into a spreadsheet and using the dots (.) as a field delimiter. You would get a spreadsheet looking like the one below (imagine that the columns are at the spaces). Then you could easily sort on the Bates_no column, and/or create different spreadsheet outputs as you need to.


BATES0000|A:\TIMEREST.    BATES0000.    obj
BATES0001|A:\PROGRAM.     BATES0001.    obj
BATES0002|A:\OPTIONS.     BATES0002.    obj
BATES0003|A:\INIT_LIB.    BATES0003.    obj
BATES0004|A:\FIXNAME.     BATES0004.    obj
BATES0005|A:\EBC_ASC.     BATES0005.    obj
BATES0006|A:\BASE.        BATES0006.    obj
BATES0007|A:\ACCTING.     BATES0007.    obj

However, a problem occurs when a file that is being renamed doesn't have an extension. For example: FILENAME. When this file gets renamed, its Bates name looks like: FILENAME.BATES0000. Notice there is no extension or dot (.) after the Bates number. This could cause a problem for some spreadsheets in properly "columnizing" the outputs. So, if you think you will have files without extensions, and are anticipating moving the catalog to a spreadsheet, you might want to use the upper case -B BATESXXX option when numbering. The -B option will add extensions to those files that do not have them. The extension that is added is the unique .__! sequence of characters. (a dot, underscore, underscore, exclamation). This sequence is easy to find, and identify, and it also makes parsing the filename a lot easier when using spreadsheets.

BATES0008|A:\ACCTING. BATES0008. __!
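If you would rather process the catalog with a script than a spreadsheet, the records can be split in a similar way. The following Python sketch assumes the pipe-delimited layout shown above (with an optional third field for the source path); the function name parse_record and the dictionary keys are hypothetical. It does not handle names where the Bates number was prepended.

```python
import ntpath  # Windows-style path handling, usable on any platform

def parse_record(line):
    """Split one catalog record into its Bates number and name parts."""
    fields = [field.strip() for field in line.split("|")]
    bates, new_path = fields[0], fields[1]
    source = fields[2] if len(fields) > 2 else None
    name = ntpath.basename(new_path)
    # Split the filename around ".<bates>" to recover the original pieces.
    stem, _, ext = name.partition("." + bates)
    return {"bates": bates, "path": new_path, "stem": stem,
            "ext": ext.lstrip("."), "source": source}

records = [parse_record(r"BATES0001|A:\PROGRAM.BATES0001.obj"),
           parse_record(r"BATES0000|A:\TIMEREST.BATES0000.obj")]
records.sort(key=lambda rec: rec["bates"])  # sort on the Bates number
```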

HTML OUTPUT

The program can also create an index.htm with links to the newly named files. This often makes it easier to review the file content.

FTK MODIFICATIONS

FTK can export files. However, with versions prior to 1.8x they were only placed in single folders and did not maintain original path structures. Also, they carried as part of their filename the index number generated by FTK. When the files are extracted, these index numbers are not in any kind of sequence and may cause some persons concern or confusion when deciphering the index numbers. BATES_NO can rename the exported files in such a way as to generate a sequential number listing while removing the non-sequential FTK index listings. (The --NOFTK and --FTKFILE options do this.)

X-WAYS MODIFICATIONS

When X-ways exports its files, it places them in correct folder structure. You can use BATES_NO to rename the exported X-WAYS files in a logical fashion.



Command Lines

C:>Bates_no [source_directory] [-[options]]

C:>Bates_no C:\tmp -b BATES_alpha_ROOT_Number

C:>Bates_no -p C:\tmp -b BATE_ROOT_Number
same as the previous one, except this one makes use of the -p option

C:>Bates_no -p "c:\tmp\space   sub_directory" -b BATE_ROOT_Number -f *.doc
perform operations only on the *.doc files. Remember to "quote spaces" in directory names.

C:>Bates_no -p c:\tmp -b BATE_ROOT_Number -f *.doc -w 120
perform operations only on the *.doc files and make the name section of the output 120 characters wide.

C:>Bates_no a: -d c:\temp_bates_dir -b BATES0000
copy the files from the A: drive to the c: directory identified by -d option.

C:>Bates_no -p "d:\ftk_exports\space   sub_directory" -b  BATES0000  --NOFTK  --FTKFILE=filename



Options

-p + source_dir    Use this directory as the source (starting point). Remember to "quote paths with spaces".

-[bB] + bates_number_template   This is the template for the Bates number. It is alphanumeric with no spaces. AAAANNNNN. The number part is used as a starting point for the sequencing. It is suggested that the number contain at least 4 or 5 digits with leading zeros as place holders so the final format is a nicely formatted fixed length Bates number. The actual length of this template is not restricted, but one of less than 10 characters is recommended.

The NNNNN portion of the mask is also used to determine the starting number. If the NNNNN portion is anything other than 00000 then the NNNNN value is taken to mean use this numeric value to start the numbering at. (ie. if the template was: DJM_012, then the file renaming would start at filename.DJM_012.ext instead of the default filename.DJM_000.ext)

Occasionally, files without extensions are found. (ex., FILENAME.) When these files are renamed, the new name is FILENAME.BATESXXX. There is no ending .EXT and this may cause some problems when importing into a spreadsheet. If you want an extension added to assist in spreadsheet importation, use the uppercase B, -B. This will add the extension of .__! It can later be easily identified, and it helps spreadsheet formatting. See Spreadsheets above.

-[uU]    Undo or Remove the bates number from files. The -b option must also be included so the program knows which template to check filenames against. All other options remain in effect. If the -o option is used, a list is provided of all old and newly renamed file names.
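To show what the undo amounts to for names in the default form, here is a rough Python sketch. It is an assumption-laden illustration, not the program's logic: the function name strip_bates is hypothetical, and it does not handle prepended numbers or the .__! extension added by -B.

```python
import re

def strip_bates(name, template):
    """Remove a '.<root><digits>' Bates component from a filename."""
    root = template.rstrip("0123456789")  # alpha root of the -b template
    # Match the Bates component only when it is a complete dotted segment.
    pattern = re.compile(r"\." + re.escape(root) + r"\d+(?=\.|$)")
    return pattern.sub("", name, count=1)

print(strip_bates("filename.DJM000.ext", "DJM000"))  # filename.ext
```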

-P    'P'repend the Bates number to the filename. The default is to place the Bates number prior to the extension. (FILENAME.BATES100.EXT). This option allows you to prepend the Bates number to the filename. (BATES100.FILENAME.EXT). This option may be useful if you are later sorting the filenames. Since the Bates number will be at the front, they will sort more easily.

-d + path   : Path is a destination path (directory) to which EVERY file found will be renumbered and copied. Use extreme caution. This could create an extremely large SINGLE directory. No trees/paths are created under the destination path.

-f + filetype(s)    Rename only those files meeting this file type. Additional file types (max of 10) can be added by separating each one by a space. (ex,. -f *.c *.doc *.tmp *.ppt )

-x + filetype(s)   eXclude those files meeting this file type. Additional file types (max of 10) can be added by separating each one by a space. (ex,. -x myfile*.c )

-[Oo] + filename    An output file to contain a listing of all the files which are renamed. The record format for the output file is: (BATES_NO0001|C:\PATH\FILENAME.BATES_NO0001.EXT). In most cases the output record is fixed in size (see note below). But the delimiter is there for compatibility. This effectively creates a catalog of all the newly named files.

To get an output file in the LINUX version, simply redirect the screen output > output file

-O   If the upper case O (-O) is used, the source filename, including the path, is added to the output record after another pipe (|) delimiter.
(BATES_NO0001|A:\PATH\FILENAME.BATES_NO0001.EXT|C:\SOURCE\FILENAME.EXT)

Note: the -w option can affect how many characters of the renamed file are printed here. If you want the entire full name printed, either don't use a -w option, or make it large enough to cover all the possibilities in the tree you are pointing to.

--NOEXT:
-E (LINUX version):   The --NOEXTension or -E option is used to remove any file extensions from the resulting renamed file. This is due to the fact that some LINUX users prefer to remove any extensions from the filenames. Notice the Windows and the Linux versions use a different option syntax.

--NOMIC:
-M (LINUX version):   The --MICrosoft or -M option is used to change any special/restricted filename characters to underscores. LINUX is much more permissive about which characters can be in filenames. For instance, the colon (:), redirection (<>) and other special characters are illegal in the Microsoft file system naming convention, but perfectly legal in LINUX. So if you are running the LINUX version and wish to remove these special characters for later use on a Microsoft file system, use the -M option. It will replace the special characters with underscores in the new output name. The current characters that are replaced in the LINUX version are: quote ", pipe |, left and right arrow < >, and colon :. The Windows version adds $ and ?.
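The character replacement described above is easy to picture in Python. This sketch only mirrors the character lists given in the text; the names LINUX_SET, WINDOWS_EXTRA and sanitize are hypothetical, and the program may treat other characters as well.

```python
LINUX_SET = '"|<>:'      # characters listed for the LINUX version
WINDOWS_EXTRA = "$?"     # additional characters listed for Windows

def sanitize(name, windows=False):
    """Replace each restricted character with an underscore."""
    chars = LINUX_SET + (WINDOWS_EXTRA if windows else "")
    return name.translate(str.maketrans(chars, "_" * len(chars)))

print(sanitize("notes:draft|v2.txt"))  # notes_draft_v2.txt
```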

-v    No Verbose output. This eliminates the headers and footers in any output file generated. The output records are fixed length at this point, and the data file can be used as input to other programs.

-w + #   Replace # with a number indicating the width you wish the output record to be. This is the width of the path/filename in the -o output file. It is suggested that you also use the -v option to eliminate headers. Use this option to get a fixed length output that is compatible with most other Maresware programs.

-r    DO NOT recurse through the source directory for files. The default is that the source directory is recursed and ALL subsequent files and directories are processed.

-i    Proceed Immediately. Without this option the source tree is first scanned and files are counted so the user knows how many files are involved.

--NOFTK: use in conjunction with --FTKFILE=filename, to rename FTK export files without the non sequential indexes. This option renames the FTK exports and removes the [index] number and replaces it with the traditional bates_no sequences.

-g + #
-l + #   
Rename only those files (g)reater than or (l)ess than # days old. Replace the # with a valid number of days. And don't include the +.

-g + mm-dd-yyyy
-l + mm-dd-yyyy
(That's an ell, not a one.) Rename only those files (g)reater than (older than) or (l)ess than (newer than) this mm-dd-yyyy date. The date MUST be in the form mm-dd-yyyy. It MUST have a two digit month and day (leading 0 if necessary), and it MUST have a 4 digit year. The date given is NOT included in the calculation. I.e., if today was 01-10-2003 and you entered -l 01-09-2003 you would only process today's files. If you wanted to include those from 01-09, you should have entered -l 01-08-2003.
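Because the given date is excluded from the comparison, the cutoff is easy to get off by one day. The Python sketch below restates the rule as strict comparisons; it is an interpretation of the text above, and the names parse_mmddyyyy, selected, older_than and newer_than are hypothetical.

```python
import re
from datetime import date, datetime

def parse_mmddyyyy(text):
    """Parse a strict mm-dd-yyyy date (two-digit month and day required)."""
    if not re.fullmatch(r"\d{2}-\d{2}-\d{4}", text):
        raise ValueError("date must be in the form mm-dd-yyyy")
    return datetime.strptime(text, "%m-%d-%Y").date()

def selected(file_date, older_than=None, newer_than=None):
    """older_than plays the role of -g, newer_than the role of -l.
    Both comparisons are strict, so the given date itself is excluded."""
    if older_than is not None and not file_date < older_than:
        return False
    if newer_than is not None and not file_date > newer_than:
        return False
    return True

cutoff = parse_mmddyyyy("01-09-2003")
print(selected(date(2003, 1, 10), newer_than=cutoff))  # True
print(selected(date(2003, 1, 9), newer_than=cutoff))   # False
```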

-1 + logfilename   file to contain accounting information.

-h + filename  create an HTML file. (If no filename is given, then INDEX.HTM is used as a default.) The file will have links to all the files which were renamed with appropriate bates numbers. Use caution if the named file is located in the path which is being processed by the program. If the filename meets the command line requirements, it too will be renamed and included in the output. It is suggested this file be in a location other than the processing path.

-H + filename  create an HTML file. (If no filename is given, then INDEX.HTM is used as a default.) The file will have links to all the files identified using the appropriate command line options. However, this option DOES NOT rename the files with the bates number mask. It merely creates the html reference "index" file. For both the -h and -H options, the output htm file has html page breaks inserted at 45 lines so that it is easy to print the list using a browser.

-t[acw]    Specify which time type to use in the calculations: a = access, c = create, w = last write/modify time. Don't forget, in WIN9X, there is no access time.

-G + #
-L + #   
Rename only those files (G)reater than or (L)ess than # bytes in size. Replace the # with a valid file size.

-R   Because the files are opened and read, on WINNT and WIN9X the access date is modified. This option attempts to reset the source file date back to its original.

--XWaysfile=filename   This renames (in place) the exported X-Ways files. The filename provided is the X-Ways list of the exported files, which has at a minimum the following two fields: PATH and NAME. The program finds the file named NAME in the appropriate PATH and renames it with a correct bates_no. This option ONLY renames files in place and is mutually exclusive with other copy/rename options.


Related Programs

Diskcat

Upcopy (to simply copy files while maintaining tree structure).
