ARtool User's Guide

The graphical user interface

If you have read the Introduction to Association Rules section, then you should be able to figure out quickly how to use the interface. In this section you will learn a little more about the finer details.

Initially you have to choose a database to work on. This is done on the Database tab. Once you have selected a database, you can start mining frequent itemsets or association rules from the other two tabs. The Database tab also gives you information about the characteristics of the database that you have selected.

ARtool breaks the mining process into two steps: mining the frequent itemsets and generating the association rules. In what follows we will refer to these steps as the first and second mining step.

To perform the first mining step, go to the Frequent Itemsets tab, select an algorithm and a minimum support value (greater than 0 and less than or equal to 1), and press Go. If the algorithm takes longer than you care to wait, press Abort to stop the mining process. Note that some time may pass between pressing Abort and the mining process actually stopping.
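To make the role of the minimum support value concrete, here is a minimal sketch of the first mining step. It is not one of ARtool's algorithms and does not use the laur.dm.ar classes; the class name FrequentItemsetsSketch, its methods, and the representation of transactions as sets of integers are assumptions made only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: hypothetical names, not ARtool's own classes or algorithms.
final class FrequentItemsetsSketch {

    /** Convenience helper that builds a sorted itemset from the given items. */
    static Set<Integer> itemset(Integer... items) {
        return new TreeSet<>(Arrays.asList(items));
    }

    /** Fraction of transactions that contain every item of the given itemset. */
    static double support(List<Set<Integer>> transactions, Set<Integer> itemset) {
        if (transactions.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (Set<Integer> t : transactions) {
            if (t.containsAll(itemset)) {
                hits++;
            }
        }
        return (double) hits / transactions.size();
    }

    /** Level-wise search for all itemsets whose support is at least minSupport. */
    static Map<Set<Integer>, Double> mine(List<Set<Integer>> transactions, double minSupport) {
        if (!(minSupport > 0.0 && minSupport <= 1.0)) {
            throw new IllegalArgumentException("minimum support must be > 0 and <= 1");
        }
        Map<Set<Integer>, Double> frequent = new LinkedHashMap<>();

        // Level 1 candidates: every distinct single item.
        Set<Set<Integer>> level = new LinkedHashSet<>();
        for (Set<Integer> t : transactions) {
            for (Integer item : t) {
                level.add(itemset(item));
            }
        }

        while (!level.isEmpty()) {
            // Keep only the candidates that reach the minimum support.
            List<Set<Integer>> survivors = new ArrayList<>();
            for (Set<Integer> candidate : level) {
                double s = support(transactions, candidate);
                if (s >= minSupport) {
                    frequent.put(candidate, s);
                    survivors.add(candidate);
                }
            }
            // Join survivors pairwise; keep unions that are exactly one item larger.
            Set<Set<Integer>> next = new LinkedHashSet<>();
            for (Set<Integer> a : survivors) {
                for (Set<Integer> b : survivors) {
                    Set<Integer> union = new TreeSet<>(a);
                    union.addAll(b);
                    if (union.size() == a.size() + 1) {
                        next.add(union);
                    }
                }
            }
            level = next;
        }
        return frequent;
    }

    public static void main(String[] args) {
        List<Set<Integer>> db = Arrays.asList(
                itemset(1, 2, 3), itemset(1, 2), itemset(1, 3), itemset(2));
        mine(db, 0.5).forEach((items, s) -> System.out.println(items + " support=" + s));
    }
}
```

A lower minimum support lets more itemsets through and makes the search take longer, which is why a long-running first step is often a sign that the threshold is too low for the database at hand.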

Since the second mining step needs the results of the first step (the frequent itemsets), these are saved in a cache file, which normally has the same name as the database but with the extension .cache. To read the contents of a previously generated cache file, select the Use Cache algorithm on the Frequent Itemsets tab. This is faster than running a mining algorithm, which always regenerates the cache.
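The naming convention for the cache file can be sketched as follows. This is only an illustration of the rule "same name as the database, extension .cache"; the class CacheNameSketch and its methods are hypothetical and are not part of ARtool, and the sketch says nothing about the actual contents of a cache file.

```java
import java.io.File;

// Illustrative only: hypothetical helper, not part of ARtool.
final class CacheNameSketch {

    /** Maps "data/retail.db" or "data/retail" to "data/retail.cache". */
    static String cacheFileFor(String databasePath) {
        String base = databasePath.endsWith(".db")
                ? databasePath.substring(0, databasePath.length() - ".db".length())
                : databasePath;
        return base + ".cache";
    }

    /** True if a non-empty cache file exists next to the database. */
    static boolean cacheExists(String databasePath) {
        File cache = new File(cacheFileFor(databasePath));
        return cache.isFile() && cache.length() > 0;
    }

    public static void main(String[] args) {
        String db = args.length > 0 ? args[0] : "data/retail.db";
        System.out.println(cacheFileFor(db) + " exists: " + cacheExists(db));
    }
}
```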

For the second mining step, you need to select the minimum support value on the Frequent Itemsets tab, and then select an algorithm and a minimum confidence value (greater than 0 and less than or equal to 1) on the Association Rules tab. The Go and Abort buttons work in the same manner as those on the Frequent Itemsets tab. The mining algorithm that you selected starts by reading the cache file built during the first mining step, so it is important that such a cache file exists.

The log window at the bottom of the ARtool frame displays status and error messages, so it is useful to keep an eye on it. You can clear the log window by selecting the Program/Clear log menu item.
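The second mining step (generating rules from the frequent itemsets under a minimum confidence) can be illustrated with a minimal sketch. It is not one of ARtool's rule generation algorithms; the class RuleGenerationSketch, its Rule type, and the in-memory map standing in for the cache file are assumptions made for this example. The confidence of a rule is the support of the whole itemset divided by the support of the antecedent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: hypothetical names, not ARtool's own classes or algorithms.
final class RuleGenerationSketch {

    /** A rule "antecedent -> consequent" with its support and confidence. */
    static final class Rule {
        final Set<Integer> antecedent;
        final Set<Integer> consequent;
        final double support;
        final double confidence;

        Rule(Set<Integer> antecedent, Set<Integer> consequent, double support, double confidence) {
            this.antecedent = antecedent;
            this.consequent = consequent;
            this.support = support;
            this.confidence = confidence;
        }

        @Override
        public String toString() {
            return antecedent + " -> " + consequent
                    + " (support=" + support + ", confidence=" + confidence + ")";
        }
    }

    static Set<Integer> itemset(Integer... items) {
        return new TreeSet<>(Arrays.asList(items));
    }

    /**
     * Builds all rules with confidence >= minConfidence. The map plays the role
     * of the cache: it holds every frequent itemset together with its support.
     * Itemsets are assumed to be small (well under 31 items) because the
     * antecedents are enumerated with an int bitmask.
     */
    static List<Rule> generate(Map<Set<Integer>, Double> frequent, double minConfidence) {
        if (!(minConfidence > 0.0 && minConfidence <= 1.0)) {
            throw new IllegalArgumentException("minimum confidence must be > 0 and <= 1");
        }
        List<Rule> rules = new ArrayList<>();
        for (Map.Entry<Set<Integer>, Double> entry : frequent.entrySet()) {
            List<Integer> items = new ArrayList<>(entry.getKey());
            if (items.size() < 2) {
                continue; // a rule needs a non-empty antecedent and consequent
            }
            // Each mask selects a non-empty proper subset as the antecedent.
            for (int mask = 1; mask < (1 << items.size()) - 1; mask++) {
                Set<Integer> antecedent = new TreeSet<>();
                Set<Integer> consequent = new TreeSet<>();
                for (int i = 0; i < items.size(); i++) {
                    if ((mask & (1 << i)) != 0) {
                        antecedent.add(items.get(i));
                    } else {
                        consequent.add(items.get(i));
                    }
                }
                Double antecedentSupport = frequent.get(antecedent);
                if (antecedentSupport == null) {
                    continue; // incomplete input: antecedent support unknown
                }
                double confidence = entry.getValue() / antecedentSupport;
                if (confidence >= minConfidence) {
                    rules.add(new Rule(antecedent, consequent, entry.getValue(), confidence));
                }
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        Map<Set<Integer>, Double> frequent = new LinkedHashMap<>();
        frequent.put(itemset(1), 0.75);
        frequent.put(itemset(2), 0.75);
        frequent.put(itemset(3), 0.5);
        frequent.put(itemset(1, 2), 0.5);
        frequent.put(itemset(1, 3), 0.5);
        for (Rule rule : generate(frequent, 0.8)) {
            System.out.println(rule);
        }
    }
}
```

The sketch also shows why the second step depends on the first: without the supports of the frequent itemsets there is nothing to compute confidences from.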

The results of the two mining steps are presented in tables, which can be sorted in ascending or descending order on any column by double-clicking the column header. Columns containing itemsets are sorted according to the size of the itemset. You can also double-click a table row to display its contents in the log window, which is useful when the itemset is too large to be displayed entirely in the table.
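Sorting itemsets by size, as the tables do for itemset columns, can be sketched with a simple comparator. The class ItemsetOrderSketch is hypothetical, and how ARtool orders itemsets of equal size is not stated in this guide; the sketch simply keeps ties in their original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: hypothetical helper, not ARtool's table code.
final class ItemsetOrderSketch {

    /** Compares itemsets by their number of items only. */
    static final Comparator<Set<Integer>> BY_SIZE = Comparator.comparingInt(Set::size);

    /** Returns a sorted copy; List.sort is stable, so ties keep their input order. */
    static List<Set<Integer>> sortBySize(Collection<Set<Integer>> itemsets, boolean ascending) {
        List<Set<Integer>> sorted = new ArrayList<>(itemsets);
        sorted.sort(ascending ? BY_SIZE : BY_SIZE.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Set<Integer>> itemsets = new ArrayList<>();
        itemsets.add(new TreeSet<>(Arrays.asList(1, 2, 3)));
        itemsets.add(new TreeSet<>(Arrays.asList(4)));
        itemsets.add(new TreeSet<>(Arrays.asList(2, 5)));
        System.out.println(sortBySize(itemsets, true));
        System.out.println(sortBySize(itemsets, false));
    }
}
```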

You can evaluate the discovered rules with various measures through the Program/Compute measure menu entry.
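This guide does not list the measures that ARtool offers. As an example of what a rule measure looks like, the sketch below computes confidence, lift, and leverage, three measures commonly used for association rules. Choosing these three is an assumption of the example and the class RuleMeasuresSketch is hypothetical. Each method takes the support of the whole rule (antecedent and consequent together) and the supports of the antecedent and consequent on their own.

```java
// Illustrative only: common textbook measures, not necessarily those in ARtool.
final class RuleMeasuresSketch {

    /** Confidence of A -> C: supp(A and C) / supp(A). */
    static double confidence(double suppBoth, double suppAntecedent) {
        return suppBoth / suppAntecedent;
    }

    /** Lift of A -> C: supp(A and C) / (supp(A) * supp(C)); 1.0 means independence. */
    static double lift(double suppBoth, double suppAntecedent, double suppConsequent) {
        return suppBoth / (suppAntecedent * suppConsequent);
    }

    /** Leverage of A -> C: supp(A and C) - supp(A) * supp(C); 0.0 means independence. */
    static double leverage(double suppBoth, double suppAntecedent, double suppConsequent) {
        return suppBoth - suppAntecedent * suppConsequent;
    }

    public static void main(String[] args) {
        double suppBoth = 0.5, suppA = 0.5, suppC = 0.75;
        System.out.println("confidence = " + confidence(suppBoth, suppA));
        System.out.println("lift       = " + lift(suppBoth, suppA, suppC));
        System.out.println("leverage   = " + leverage(suppBoth, suppA, suppC));
    }
}
```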

If you want to free some memory, then you can clear the result tables by selecting either Discard itemsets or Discard rules from the Program menu.

The Generate a synthetic database menu item allows you to create a synthetic database. For more information, see reference [1a]. Note that if you generate a database with the same name as an existing database, the existing database will be overwritten.

The command line tools

The command line tools are easy to use: execute any of them with no parameters to get usage instructions.

Below is a list of the command line tools along with a brief description of each of them:

Note that all command line tools require that the names of the database files passed to them do not include the extension. The .db extension is appended automatically.

There is a tutorial for the asc2db and db2asc tools in the file 0ASC_TUTORIAL.TXT. These utilities are useful if you want to convert data to the .db format.

The Java packages

All classes from the Java packages laur.dm.ar, laur.rand, and laur.tools contain documentation comments, so you can use javadoc to generate the package documentation.

I will present here an overview of the classes from the laur.dm.ar package, which is the package containing the association rule mining algorithms: