WHAT IS MARESWARE

 


HISTORY

    Records / Record formats / Parameter Files / Miscellaneous programs


Before we start.
Let me advise that I am constantly updating and enhancing the software, so I do not regularly post current MD5 values of the programs on the site. If you are unsure of the integrity of a version you have, send me an email at dm @ dmares.com and I will send you the current version and its correct MD5 value.

A little history about when and how Maresware got started.

In the 1975-1980 time frame I was a criminal investigator, investigating various data maintained in computer data files. These files were on mainframe computers. (you know what those are?)

The data was stored in many different formats, which we called record formats. This meant that each file in a mainframe data set could hold different information from every other file. Although all the records in any one file shared the same format, there were hundreds of different record formats maintained on computer tapes. (you know what those are, don't you?)

The maintenance of thousands of tapes resulted in the storage of millions of records.

We needed to be able to analyse or process different fixed length records containing different information. So I took on the challenge of writing some software that might help other investigators complete their investigations.

Since the analysis was being done on mainframe computers, I had the choice of three languages: Assembly, 'C', or Cobol. Assembly was too hard to learn and Cobol was not targeted at what needed to be done. So I chose 'C', which is as close to assembly as you can get. That is why maresware programs are so fast.


INPUTS and OUTPUTS

By this I mean a basic understanding of the different types of operations maresware performs.

As an investigator or analyst you first have to decide where you are going to obtain and process the data. (let's call it evidence from here on). Then once you obtain the evidence you naturally have to process it to come to a conclusion. So you, the investigator, have to come up with a plan or sequence of steps to accomplish your goal. This is your responsibility and as such you must know the process and the programs to use. Each type of case will present different evidence, which means you may have to take different approaches. A nifty idea.

So, as an investigator you have to decide where to find the evidence, then what process you will use to obtain it, then how you will process it, and finally, possibly generate a report or data to provide to a prosecutor or manager to take action. Hopefully you will personally be a little more engaged in this process than clicking or filling in a few boxes on a Windows screen.

A few steps where maresware might be of some assistance.

First I'll explain that maresware programs in general perform two tasks. One task (run either by maresware or by other software) is to run a program on the file system (let's say the root of the X: drive) and obtain initial output file(s). Programs like diskcat, hash, and strsrch operate on the file system and produce such output. The second task is to take the data produced in the first step (whether from maresware or other software you have) and re-process the data over and over until you come to a conclusion.

Maresware generally produces or processes two types of data. One type is variable length records, and the second type is fixed length records. Both will be discussed later. But here are examples of the two types of maresware software and their input and output. Remember that even though I might say here that a program produces a certain type of output, that same program might also take input and process it for a specific task. So the caveat is that whatever is said here describes the main operation, and most likely has exceptions. So don't get bogged down.

The first step is that you might want an initial catalog or list of the evidence you are seizing. This might come from running any of the following programs:
These programs generally work on the file system, and create fixed length output records. Other output formats are available, but fixed width is the default. Below is a short list of the programs which generally work on the file system files and produce an output.

DISKCAT   Create a catalog of files from a designated tree (i.e. X:\). Output defaults to fixed.
MDIR      Create a catalog of a directory. Output defaults to fixed.
HASH      Hash files, generating by default fixed length records that show a hash inventory of the suspect tree/drive. Output defaults to fixed, but there is an option for variable output.
DISKSORT  Sort fixed length files. A primary process and mainstay. This takes a fixed length record as input, and sorts on the key field.
STRSRCH   Search for text strings in files for possible leads. Output is fixed.

These initial steps, and others, will provide you with the basic output files, and you will then have to decide what to do with them. The default output of all the above is generally a fixed length record, which you will then re-process in the next step(s).

Then once you have these fixed length output files, or any data that you receive from other sources, you may wish to (re-)process the data to perform various analysis tasks as you see fit. This is where the strength of the data file processing programs comes in. They take fixed length records (specifically formatted) from other program outputs and can re-process the data as you may need. The process may involve something like:

All of the programs below (and others) require input of fixed length records in order to perform their task.

SEARCH      for specific hash values, or directory names within the initial catalog, or search for any item in the data file produced by other means.
BSEARCH   perform a binary (sorted) search on the data. Hundreds of times faster than a linear search.
COMPARE   two sorted files on the same key field to see which hashes show up on both suspect drives, and which do not.
HASHDUP   see if there are any duplicate files in the listing, so you don't miss anything.
FILBREAK   break up or split up larger records into smaller, more manageable ones for the next steps.
TOTAL        or count occurrences of, say, IP addresses, or count how many items occurred on a single day.

As mentioned, all the above take fixed length records as their input, and produce fixed length records as output.

Then you have the other programs below, which work on files in the system in specific ways to accomplish security and forensic type results.

If you do nothing else, read the manual on upcopy for not only forensic copy capability, but for personal backup and file sync on a day to day basis.

UPCOPY   A primary maresware forensic tool. Performs two tasks.
The main one is to forensically copy suspect files and evidence from pointA to pointB in a forensically sound manner.
The second is once you have identified (by other means) any and all files which strike your interest, forensically copy them from the source/suspect/work drive to your smaller work drive environment for additional processing.
And finally, a personal use is to maintain or update your own work drives to maintain up-to-date copies of your work. UPCOPY is one of the most important maresware forensic programs there is. So study it well.

REMOVE   The RM and RMD programs. Depending on the case, you may wish to permanently remove the illegal file from somewhere. Or simply clean up a work directory of unnecessary files. These two programs have a lot of capability.

There are about 40 different maresware programs, each of which performs a simple, specialized single task, and provides you with an output or operation that you can then take to the next step. For instance, there are programs to split up larger files into more manageable smaller ones to process, or to merge all the small files into a larger single file. FTFM. It is up to you to figure out how/what/where/when to proceed. It's up to you to determine the processes you wish to accomplish.

All of the above steps you will have to plan, and then execute. And then possibly re-plan and re-execute as more evidence shows up or goes away. It is not always as simple as pushing a few buttons or filling in a few boxes. These programs/steps, and other programs you have, will generate intermediate data files, and you will then have to think about how to process (re-process) the data for the next step. These steps to process, re-process, and process again are what you were hired to do and paid the big bucks for. And once you determine the format of the data, you can use maresware to help in the step to step process to perform your task.

Next I will discuss the various record formats that maresware can create, and which ones MUST be available for which program to properly process the evidence data you submit.

When thinking about using maresware, one thought should always come to you: there are many steps you must take to get from initial evidence capture to the final output files ready for the report. The process to get there is what you determine. The actual process is how you put or merge the maresware programs together, taking the output of one program and using it as the input of the next, and so on until you have the evidence you expect. Maresware can only provide the tools. You must provide the genius.


FIXED LENGTH RECORD

A mainstay requirement for using maresware to process the data.

Mainframe data at the time I wrote the initial maresware software generally consisted of fixed length records. They were fixed length because it was easier for the tape blocks which stored the data to hold a fixed "block" of data rather than variable sized/length blocks. Although every file contained different data, the records within any one file always had the same record format. For example, the name might be 50 characters beginning at the first character, the address next with 30 characters, and so on until all the fields of the record were spoken for. We called this listing of the fields the core record layout. Why? Because it lays out the format/content of the records. DAH! Which might look like:

Last Name:   20
First Name:  10
MI:           2
ADDRESS:     20
CITY:        10
STATE:        2
DOB:          6
etc etc etc. ==
final length 70  (if my addition is correct)

So every record in the data file would have the same format. Spaces would pad a field where necessary, and values would be truncated where necessary. So if you were looking for an address, you would tell a program to look at position 33 for 20 characters.
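
As a rough illustration, the sample layout can be turned into field positions and used to pull one field out of a record. This is a sketch in Python (my choice for illustration, not something maresware uses), and the names LAYOUT, field_span, get_field and make_record, as well as the sample person, are made up for the example.

```python
# Hypothetical sketch: field names and widths come from the sample layout above.
LAYOUT = [
    ("LAST_NAME", 20),
    ("FIRST_NAME", 10),
    ("MI", 2),
    ("ADDRESS", 20),
    ("CITY", 10),
    ("STATE", 2),
    ("DOB", 6),
]

RECORD_LENGTH = sum(width for _, width in LAYOUT)


def field_span(name):
    """Return (start, width) of a field, with start counted from 0."""
    start = 0
    for field, width in LAYOUT:
        if field == name:
            return start, width
        start += width
    raise KeyError(name)


def get_field(record, name):
    """Slice one field out of a fixed length record and strip the padding."""
    start, width = field_span(name)
    return record[start:start + width].rstrip()


def make_record(values):
    """Pad or truncate each value to its field width and join them."""
    return "".join(values[field][:width].ljust(width) for field, width in LAYOUT)


# Made-up sample data, for illustration only.
record = make_record({
    "LAST_NAME": "DOE", "FIRST_NAME": "JOHN", "MI": "Q",
    "ADDRESS": "123 MAIN STREET", "CITY": "ANYTOWN", "STATE": "GA",
    "DOB": "010170",
})
print(len(record), field_span("ADDRESS"), get_field(record, "ADDRESS"))
```

Python counts from 0, so the address that a person would call "position 33" starts at index 32 in the sketch.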

Since maresware was originally written to process fixed length records, the main programs either create a fixed length record (diskcat, hash) or require a fixed length record to process (search, bsearch, compare, filbreak, hashdup, total, and others). There are exceptions, like the strsrch and url_srch programs, but for the most part, the output of the programs is fixed, and the input (in general) MUST be fixed length.

Unfortunately, most current programs generate variable length records, such as csv files. Although maresware doesn't normally process these types of records, we provide a program called PIPEFIX, which will convert or "fix" a pipe delimited file into fixed length records, so that you can run the next step using maresware.

Let's take a look at the next steps and define what I call record formats.


RECORD FORMATS

For most of the maresware programs, as just mentioned, the records must be fixed length records. That means that every record is the same number of characters. And that each field starts and stops at the same location.

Some of the programs, like STRSRCH, URL_SRCH, PIPEFIX and others, do not require input records to be fixed length. Others, like DISKCAT, MDIR and HASH, create an output of fixed length records so that their output can be re-processed by the next step. (you know, the next step is what you designed).

Then there are those which require a fixed length record in order to process the data, such as SEARCH, BSEARCH, HASHCMP, COMPARE, TOTAL and others. These and other data processing programs make up the bulk of the data analysis library.

Let's take a look at some record formats.

The data within each record of a fixed length record might contain information like: (Notice each field is specific width, even if it means adding padded spaces. That is what makes up a fixed width/length record.)
shutdown_command.docx   16,879  12/02/2018 06:27:21:699c  08/04/2024 15:25:03:147w  10/31/2025 21:18:22:889a EST A.....
shutdown_help.txt        4,125  04/01/2019 12:00:00:000c  08/04/2024 15:28:06:850w  10/31/2025 21:18:22:920a EST A.....
shutdown_now.bat            19  01/21/2019 07:20:12:830c  01/21/2019 07:20:12:830w  10/31/2025 21:18:22:920a EST ......
or
setup.bat               564ACB6605F971C51483C6FE8F70CBD9      174 06/14/2019 08:58:54:400c
setup.bat:ads_hash.txt  D7573C51A8E1EEDD53A16B5C05657086     1412 06/14/2019 08:58:54:400c
setup2.bat              726E92DD43BA2E288D9286D99432A703      393 12/01/2015 12:25:57:461c
shutdown_now.bat        8BBB36EBF0CE6C14AD9FC66B756E72F7       19 01/21/2019 07:20:12:830c
Again, the above samples are called fixed length or fixed width records. Guess why? Because each record has the same number of fields, and each field is the same width, with spaces added where appropriate to make each record the exact same width, let's say 80 characters. It adds to the final file size, but is very easy to read and interpret, and at the time I started, it was easy to program analysis of this type of record. The layout for these records was called the core record layout, because it laid out the record format. There is not a real program alive that can't process fixed length records. Even a simple text editor can open such a file. It may be large on the screen, but it is possible. It is a simple matter to import them into Excel or a database. But why do that, when you are going to need X more steps to get to the final process and provide a reasonable answer?



Today, programs create output data that might look like the records below. (You know it as a pipe delimited record. Or in some cases, it,might,be,a,csv.) Each record is a different width, with fields separated by the pipe ( | ) symbol or commas. It saves space but is not easy to read. And in some cases, csv files may cause input headaches if one record has malformed commas.
shutdown_command.docx|16,879|12/02/2018|06:27:21:699c|08/04/2024|15:25:03:147w|10/31/2025|21:18:22:889a|EST|A.....
shutdown_help.txt|4,125|04/01/2019|12:00:00:000c|08/04/2024|15:28:06:850w|10/31/2025|21:18:22:920a|EST|A.....
shutdown_now.bat|19|01/21/2019|07:20:12:830c|01/21/2019|07:20:12:830w|10/31/2025|21:18:22:920a|EST|......


When I started, back then, what the user needed to know was what kind of analysis was needed for each file. Same as you might need today.
You might ask, what is the date of the file listed for each record, or what file has a specific MD5 value, or where does a file live in the tree, or find files based on name, date, size, etc. Depending on the record layout (content) of each record, different items could be searched for and/or analysed. It's up to you, the "expert", to figure out the next and the next step in the analysis.

So, the basis of Maresware is that you provide the fixed length record file, figure out what part (field) of the record you want it to look at, and ask it to perform the analysis. Or:
If looking at the file system, you ask for an inventory of all the files, or a hash of all the files, or search all the files for specific string content, or whatever you might need to look for/at in the tree structure.

Then take that output and send it to the next step. By now, I hope you are getting the picture, that there is more than one step involved when using maresware. Each step is specific, targeted, and simple. But put them all in a sequence and maybe you can find some worthy evidence.

The problem is that you, the user, have to have an idea of what you need to analyse. Let's say it is an inventory of all the hash values of a system, or you might want to search each record in a pcap file (provided by another program) for specific IP addresses, or you might want to match two hash files to see which hashes are common to both and which are different. Or match the entire hash listing against the (180 million plus) NSRL list of known hashes to eliminate 95% of the files.

The idea is that you have to know what the data you have on hand is made of. Then you can ask maresware to run some analysis on the data files. The main caveat is that the programs only perform a single task at a time. Then, you have to consider what to do with that output. It may need to go into another step (which you have determined is needed) to provide another analysis. Maybe you want to sort on date, or hash, or filename. Then once you have the sort, maybe count or total the items in a specific field. I can't read your mind. But the opposition will try.

Then you take the output of each run, decide what information you have, and send it to the next program which performs the next step.


Now, let's take another look.

Let's say you perform a hash of the directory using something other than maresware hash. And you end up with a pipe delimited data file. (not real pretty)
setup.bat|564ACB6605F971C51483C6FE8F70CBD9|174|06/14/2019|08:58:54:400c
setup.bat:ads_hash.txt|D7573C51A8E1EEDD53A16B5C05657086|1412|06/14/2019|08:58:54:400c
setup2.bat|726E92DD43BA2E288D9286D99432A703|393|12/01/2015|12:25:57:461c
shutdown_now.bat|8BBB36EBF0CE6C14AD9FC66B756E72F7|19|01/21/2019|07:20:12:830c
If using Maresware, you will come to learn that maresware for the most part ONLY works on fixed length records, as the mainframe data files were. So you pass the file thru the pipefix program, telling pipefix how big to make each field in the record, and you end up with:
setup.bat               564ACB6605F971C51483C6FE8F70CBD9      174 06/14/2019 08:58:54:400c
setup.bat:ads_hash.txt  D7573C51A8E1EEDD53A16B5C05657086     1412 06/14/2019 08:58:54:400c
setup2.bat              726E92DD43BA2E288D9286D99432A703      393 12/01/2015 12:25:57:461c
shutdown_now.bat        8BBB36EBF0CE6C14AD9FC66B756E72F7       19 01/21/2019 07:20:12:830c
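
The idea behind that conversion can be sketched in a few lines of Python. To be clear, this is not pipefix and says nothing about its options. The function name pipe_to_fixed, the FIELDS table, and the widths and alignments in it are assumptions picked to resemble the sample records.

```python
# Hypothetical sketch of a pipe-to-fixed conversion. NOT the pipefix program;
# the widths and alignment choices below are assumptions for illustration.
FIELDS = [
    (24, "left"),    # file name
    (32, "left"),    # MD5
    (9, "right"),    # file size
    (11, "right"),   # date
    (14, "right"),   # time
]


def pipe_to_fixed(line, fields=FIELDS):
    """Convert one pipe delimited line into one fixed length record."""
    values = line.rstrip("\r\n").split("|")
    if len(values) != len(fields):
        raise ValueError("expected %d fields, got %d" % (len(fields), len(values)))
    out = []
    for value, (width, align) in zip(values, fields):
        value = value[:width]  # truncate anything too long for its field
        out.append(value.ljust(width) if align == "left" else value.rjust(width))
    return "".join(out)


lines = [
    "setup.bat|564ACB6605F971C51483C6FE8F70CBD9|174|06/14/2019|08:58:54:400c",
    "setup2.bat|726E92DD43BA2E288D9286D99432A703|393|12/01/2015|12:25:57:461c",
]
for line in lines:
    print(pipe_to_fixed(line))
```

Every record that comes out is the same length (the sum of the widths), which is the whole point.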
Then, if you have hundreds or thousands of records in the file (or over 180 million NSRL hashes) you might want to sort on the hash value.
So you run the fixed record file thru DISKSORT to sort on the MD5, and end up with: (notice it's sorted on the MD5 field)
setup.bat               564ACB6605F971C51483C6FE8F70CBD9      174 06/14/2019 08:58:54:400c
setup2.bat              726E92DD43BA2E288D9286D99432A703      393 12/01/2015 12:25:57:461c
shutdown_now.bat        8BBB36EBF0CE6C14AD9FC66B756E72F7       19 01/21/2019 07:20:12:830c
setup.bat:ads_hash.txt  D7573C51A8E1EEDD53A16B5C05657086     1412 06/14/2019 08:58:54:400c
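
Sorting on a key field comes down to telling the sort where the key starts and how long it is. Here is a minimal Python sketch of that idea. It is not how disksort is implemented or invoked, sort_records is a made-up name, and the records are cut down to just a 24 character name and the 32 character MD5.

```python
def sort_records(records, displacement, key_length):
    """Sort fixed length records on the key found at a 0-based displacement."""
    return sorted(records, key=lambda r: r[displacement:displacement + key_length])


rows = [
    ("setup.bat", "564ACB6605F971C51483C6FE8F70CBD9"),
    ("setup.bat:ads_hash.txt", "D7573C51A8E1EEDD53A16B5C05657086"),
    ("setup2.bat", "726E92DD43BA2E288D9286D99432A703"),
    ("shutdown_now.bat", "8BBB36EBF0CE6C14AD9FC66B756E72F7"),
]
# Build simplified fixed length records: 24 character name + 32 character MD5.
records = [name.ljust(24) + md5 for name, md5 in rows]

sorted_records = sort_records(records, displacement=24, key_length=32)
for record in sorted_records:
    print(record)
```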
Now what is the next step?
Let's say this file is a few thousand hash values, and you know that one or more hash values are of interest to you. So you ask the search program to search the file on the hash field for the one or many hash values you are looking for. And you end up with:
setup2.bat              726E92DD43BA2E288D9286D99432A703      393 12/01/2015 12:25:57:461c
Now, this example is small, and short. Only three or four steps. But you, as the investigator/analyst have to figure what you need to accomplish. Set out the individual steps, and implement each step.
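
The search step in that small example can be pictured as "look at the same field in every record and keep the records that match". A hedged Python sketch follows. The name search_records and the simplified 56 character records are assumptions, and the real search program has its own parameter file and options.

```python
def search_records(records, displacement, key_length, wanted):
    """Return the records whose key field matches any of the wanted values."""
    wanted = set(wanted)
    return [r for r in records if r[displacement:displacement + key_length] in wanted]


rows = [
    ("setup.bat", "564ACB6605F971C51483C6FE8F70CBD9"),
    ("setup2.bat", "726E92DD43BA2E288D9286D99432A703"),
    ("shutdown_now.bat", "8BBB36EBF0CE6C14AD9FC66B756E72F7"),
    ("setup.bat:ads_hash.txt", "D7573C51A8E1EEDD53A16B5C05657086"),
]
# Simplified fixed length records: 24 character name + 32 character MD5.
records = [name.ljust(24) + md5 for name, md5 in rows]

hits = search_records(records, 24, 32, ["726E92DD43BA2E288D9286D99432A703"])
for record in hits:
    print(record)
```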

Here is maybe a more complicated example:

Step 1: Hash the file system. (maybe use hash, or your own file hashing program). Also capture all MAC dates.
Step 2: Make each record fixed length, with the date, time, and hash value in each record. (if you used hash, it's already done)
Step 3: Sort on any field, the hash, or any of the file dates. It depends on what you will need to do next. Let's say the hash at this point.
Step 4: You hash another file system. Maybe a co-conspirator's machine.
Step 5: Perform Steps 2 & 3 on this file.
Step 6: You want to see what MD5 values show up on both systems. So you run the Maresware compare program. (both files DO NOT need to be the same format, all they need is to be sorted on the key field, the MD5).
Step 7: Once you have the suspect hash records in hand, maybe read the file date(s) from the matched files, and ask the search program to go back to either one or both of the files to see what other files may have the same questionable dates.

Maybe somewhere above, COMPARE   the hashes to the NSRL data set to eliminate 95% of the data. So the runs aren't so long or large. It's your analysis, use it wisely.
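
Why do both files only need to be sorted on the key? Because two sorted lists can be compared in a single pass. The Python sketch below shows that idea only. It is not the compare program, compare_sorted is a made-up name, it assumes no key is repeated within a list, and the all-zero hash is a made-up placeholder.

```python
def compare_sorted(keys_a, keys_b):
    """Walk two sorted key lists once; return (common, only_a, only_b)."""
    common, only_a, only_b = [], [], []
    i = j = 0
    while i < len(keys_a) and j < len(keys_b):
        if keys_a[i] == keys_b[j]:
            common.append(keys_a[i])
            i += 1
            j += 1
        elif keys_a[i] < keys_b[j]:
            only_a.append(keys_a[i])
            i += 1
        else:
            only_b.append(keys_b[j])
            j += 1
    only_a.extend(keys_a[i:])   # whatever is left exists only in A
    only_b.extend(keys_b[j:])   # whatever is left exists only in B
    return common, only_a, only_b


machine_a = sorted([
    "564ACB6605F971C51483C6FE8F70CBD9",
    "D7573C51A8E1EEDD53A16B5C05657086",
    "726E92DD43BA2E288D9286D99432A703",
    "8BBB36EBF0CE6C14AD9FC66B756E72F7",
])
machine_b = sorted([
    "726E92DD43BA2E288D9286D99432A703",
    "0" * 32,  # made-up placeholder hash
    "D7573C51A8E1EEDD53A16B5C05657086",
])

common, only_a, only_b = compare_sorted(machine_a, machine_b)
print("on both:", common)
```

The same single pass works whether the second list is a co-conspirator's machine or a sorted list of known hashes you want to eliminate.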


Hopefully, as you can see, the process is developed by you depending on your needs. And the files are processed one at a time by a specific program designed to perform that task. Which, guess what, can then be repeated over and over in a batch file. You know what that is. Let's try again:

Step 1: You have a PCAP file containing internet stuff.
Step 2: You use URL_SRCH to search for all IP addresses in the file
Step 3: You sort on the IP output field.
Step 4: You total the count for each IP address.
     at this point, you see anomalous counts for IP addresses that maybe strike your fancy. You do have a fancy, don't you?
Step 5: You might use the strsrch program to go back to the PCAP file and search out those anomalies, or use other programs to analyze the PCAP data you have.
     You are the one doing the investigation. Once you have the IP count, you decide how to proceed.
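
Steps 3 and 4 (sort on the IP field, then total the count for each address) can be sketched like this in Python. It is only an illustration of the counting idea. The 15 character IP field, the name total_by_field and the addresses themselves are assumptions, not the output format of URL_SRCH or the behaviour of TOTAL.

```python
from collections import Counter

IP_WIDTH = 15  # assumed field width, long enough for any dotted IPv4 address


def total_by_field(records, displacement, length):
    """Count how many records carry each value of one fixed field."""
    return Counter(r[displacement:displacement + length].rstrip() for r in records)


# Made-up addresses from the ranges reserved for documentation.
found = ["192.0.2.10", "192.0.2.10", "198.51.100.7", "192.0.2.10", "203.0.113.5"]
records = sorted(ip.ljust(IP_WIDTH) for ip in found)

totals = total_by_field(records, displacement=0, length=IP_WIDTH)
for ip, count in totals.most_common():
    print(ip.ljust(IP_WIDTH), count)
```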

You are in charge. Figure out what the data in each field of the record is. Make sure it is a fixed length record.
Then perform whatever next operation your process might need. Sort, Search, Compare, Total, Search again. And repeat.

In summary:

You are the investigator. You obtain the initial data records/files. You then re-format them (in general) to a fixed length record. Then you decide what the next step(s) are in your procedure. Find the maresware software which will accomplish each succeeding step. And finally come up with the evidence you need to prove your case. Simple: YES/NO???

Next:
How does maresware know what the input data file looks like?? Meaning, what does each field contain and where in the record does it live? If it was intelligent it could read your mind. So you have to tell it what the input data file looks like. That is the next section.


PARAMETER FILES

As mentioned before, the data file processing programs of maresware are designed to be able to analyse the items (fields) contained within fixed length record formats. Like:

Last Name:   20
First Name:  10
MI:           2
ADDRESS:     20
CITY:        10
STATE:        2
DOB:          6
etc etc etc. ==
final length 70  (if my addition is correct)
Obviously, the fields which make up the records differ from file to file, and so do the fields you want to search on, and the programs can't read your mind. If they could, they would be real intelligent not artificially intelligent. So, whether you are searching a record for a filename field, a specific date, or a hash value (you get the picture), you have to tell the program what data (fields) are in each record, and which of these data fields you wish to search on for this run. Also, you have to tell the program how many items you are searching for.

In a test, I ran the bsearch program against 178 million NSRL hash values searching for 1300+ hash values. On a reasonable desktop, it took all of 30 seconds. But I had to tell the bsearch program, how big each record of the NSRL file was, where within the record the hash value was, and then provide a file containing the hash values to search. These items, or pieces of information, are contained in what I call parameter files. Because they provide the "parameters" about the file being searched and what you are searching for.

Mainframe data at the time I wrote the initial maresware software generally consisted of fixed length records, with core record layouts available for each file. They were fixed length because it was easier for the tape blocks which stored the data to hold a fixed "block" of data rather than variable sized/length blocks. Although every file contained different data, the records within any one file always had the same record format. So the first two lines of most maresware parameter files contain the block size and record length. These first two lines are consistent for almost all maresware data processing programs. Carriage return/line feeds need to be counted if present.
34000   (block size for demo only)
34      (record length, for MD5, 32 bytes MD5 + 2 bytes for CR/LF)
Then, depending on the location within the record of the item to process (ie: hash, name, date, whatever), you need to provide its location. The catch is that in mainframe language the location of a field is given as its "displacement" from the start of the record. So the first character is displacement 0, since we haven't been displaced anywhere, the second character is displacement 1, and so on. This displacement is needed so the program knows where to look for the field/item to process.

Next is how big the data field we are searching for is. It could be a 9 digit SSN, a 10 digit phone number, or a 32 character MD5 value. You get the picture. So the next item is a value of how many characters we will search.
34000   (block size, for demo only)
34      (record length, including CR/LF)
0       (displacement 0, meaning first character)
32      (how big is the field we are searching)
In the above instance it is obvious we will be searching for MD5 values in a large file which contains ONLY MD5 values. (32 character MD5 + 2 CR/LF == 34 character record length)
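
Those same numbers (record length 34, displacement 0, field length 32) are all a binary search needs to jump straight to any record in a sorted file. The Python sketch below shows the technique on a tiny in-memory "file". It is not the bsearch program, and the name bsearch_file is made up.

```python
import io

RECORD_LENGTH = 34   # 32 character MD5 + CR/LF
DISPLACEMENT = 0     # key starts at the first character
KEY_LENGTH = 32


def bsearch_file(handle, wanted, record_length=RECORD_LENGTH,
                 displacement=DISPLACEMENT, key_length=KEY_LENGTH):
    """Binary search a file of sorted fixed length records for one key."""
    handle.seek(0, io.SEEK_END)
    count = handle.tell() // record_length      # number of whole records
    low, high = 0, count - 1
    while low <= high:
        mid = (low + high) // 2
        handle.seek(mid * record_length)        # jump straight to record "mid"
        record = handle.read(record_length)
        key = record[displacement:displacement + key_length]
        if key == wanted:
            return record
        if key < wanted:
            low = mid + 1
        else:
            high = mid - 1
    return None


hashes = sorted([
    b"564ACB6605F971C51483C6FE8F70CBD9",
    b"D7573C51A8E1EEDD53A16B5C05657086",
    b"726E92DD43BA2E288D9286D99432A703",
    b"8BBB36EBF0CE6C14AD9FC66B756E72F7",
])
data = io.BytesIO(b"".join(h + b"\r\n" for h in hashes))

found = bsearch_file(data, b"726E92DD43BA2E288D9286D99432A703")
missing = bsearch_file(data, b"0" * 32)
print(found, missing)
```

Each probe halves the remaining range, so even a file of many millions of records needs only a few dozen reads per key. That only works because every record is the same length and the file is sorted on the key.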

Now the only thing missing is what to search for. For this, in most instances just include them line by line after the first 4 lines.
Items in (parens) are not actually part of the parameter file. Used here for explanation only.
34000                               (block size, for demo only)
34                                  (record length, including CR/LF)
0                                   (displacement 0, meaning first character)
32                                  (how big is the actual field we are searching)
564ACB6605F971C51483C6FE8F70CBD9    (these are what we are searching for)
D7573C51A8E1EEDD53A16B5C05657086    (in my bsearch test, I had about 1300 MD5 values)
726E92DD43BA2E288D9286D99432A703
8BBB36EBF0CE6C14AD9FC66B756E72F7
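
To make the layout concrete, here is how a file shaped like that one (without the explanations in parens) could be read in Python. This sketch reflects only the four-lines-then-values layout shown here. The function name parse_search_parameters and the sanity check are my own assumptions, and the manual for each program remains the authority on what its parameter file really accepts.

```python
PARAMETER_TEXT = """\
34000
34
0
32
564ACB6605F971C51483C6FE8F70CBD9
D7573C51A8E1EEDD53A16B5C05657086
726E92DD43BA2E288D9286D99432A703
8BBB36EBF0CE6C14AD9FC66B756E72F7
"""


def parse_search_parameters(text):
    """Split a parameter file into its four numbers and the search values."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 4:
        raise ValueError("need block size, record length, displacement, field length")
    block_size, record_length, displacement, field_length = (int(v) for v in lines[:4])
    # Assumed sanity check: the field has to fit inside the record.
    if displacement + field_length > record_length:
        raise ValueError("field runs past the end of the record")
    return {
        "block_size": block_size,
        "record_length": record_length,
        "displacement": displacement,
        "field_length": field_length,
        "values": lines[4:],
    }


params = parse_search_parameters(PARAMETER_TEXT)
print(params["record_length"], params["displacement"], len(params["values"]))
```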
Depending on the program being run, and its requirements for knowing how to process the record, each parameter file might have a slightly different format. For example, the filbreak parameter file tells the FILBREAK program which fields it needs to keep, and their final size, and which fields can be eliminated. Thus producing a new final record of pieces or (broken) parts from the original record.
34000       (block size of 34000, which is 100 records of 340 characters, including CR/LF)
340         (record length of 340 characters, including CR/LF)
0000=002    (need 2 characters from displacement 0, which is the first character in the record)
0010=030    (need 30 characters from displacement 10, which is actually the 11th character)
0060=006    (need 6 characters from displacement 60, which is actually character 61, remember the displacement 0 count)
00000000    (no more needed, the 8 zeros tell filbreak, that's all folks)
So, as you can see, if you have a fixed length record, and a program designed to process that record in its designed fashion, the programs, via a parameter file, can know where and how many items to process. Fun, yes/no?
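
The displacement=length lines are easy to picture as slices. The Python sketch below reads specs written that way and builds the shorter record. It only illustrates the idea. It is not FILBREAK, the names parse_field_specs and break_record are made up, the test record is filler, and what the real program does with the CR/LF on the output record is left out.

```python
SPEC_LINES = ["0000=002", "0010=030", "0060=006", "00000000"]


def parse_field_specs(lines):
    """Turn 'displacement=length' lines into (displacement, length) pairs."""
    specs = []
    for line in lines:
        line = line.strip()
        if line == "00000000":      # eight zeros mark the end of the list
            break
        displacement, length = line.split("=")
        specs.append((int(displacement), int(length)))
    return specs


def break_record(record, specs):
    """Keep only the requested pieces of one fixed length record."""
    return "".join(record[d:d + n] for d, n in specs)


specs = parse_field_specs(SPEC_LINES)

# Filler record: 338 letters plus CR/LF gives the 340 character record length.
record = "".join(chr(65 + i % 26) for i in range(338)) + "\r\n"
short = break_record(record, specs)
print(specs, len(short))
```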

Special Note: Because disksort is so special, take note of this explanation.
Unlike the other programs, disksort does not use a parameter file. Instead it needs a command line providing the record length, the location of the key field to sort on, and the length of the sort key, plus a few other items.

By reading the manual instructions for each program, you will learn what is needed on the command line, and in the parameter file to get proper results.


MISCELLANEOUS PROCESSES

Of the 40 or so maresware programs, here are some of the other processes which you can use as intermediate steps to help your overall procedure.

The BATES_NO program is for when you have your final files set aside and ready to deliver. You can "bates_no" or rename the files to provide unique filenames for the reviewer/attorney to look at. This file renaming is similar to the Bates numbering of pages in a legal document. It helps make each file unique so the user can more easily identify which file is referenced.

Then there are the SPLIT   and FILSPLIT   programs, which each in their own way will split up a large file into smaller pieces. These smaller pieces may provide you with just a small sample with which to develop your process before attacking the large data file.

COLLATE   will, guess what, COLLATE two or more identical sorted files into a single sorted file. Or there is easy MERGing of multiple files of the same format. One keeps the sort order, the other just combines.

And finally, RMoving files, permanently or not, that are no longer needed, wanted, or are illegal to possess.
See what else might be useful in your investigations.


Th Th Thats all folks
