Approximate Frequent Item Set Algorithm - AFISA

AFISA is an algorithm for generating approximate frequent item sets. It is described in detail in the paper "Clustering and Approximate Identification of Frequent Item Sets" by Selim Mimaroglu and Dan Simovici. AFISA is high-performance software implemented entirely in Java.

Figure 1: Execution time comparison of Apriori, FP-Growth, and AFISA; AFISA is superior.

Requirements:
JRE 1.5.0_09 or higher

How to Run:
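java -jar AFISA.jar
The JVM options -Xms and -Xmx set the initial and maximum heap size. They must come before -jar, for example: java -Xms256m -Xmx1024m -jar AFISA.jar (the sizes here are illustrative only).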

Download:
AFISA.jar
1K_20I_2.ssv (sample input)
1K_20I_2.asc (same sample, different format)

Note:
The compiled code is provided for research purposes only. NO COMMERCIAL USE.

Disclaimer:
The software is provided on an *as is* basis for research purposes. There is no additional support offered, nor are the author(s) or their institutions liable under any circumstances.

AFISA inputs can be in one of the two text formats shown below:
.ssv format:
0 0 1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 1 0 0
1 1 0 1 1 0 0 0 1 1 1 1 0 0 1 0 1 1 0 0
0 0 0 1 0 1 0 0 1 1 0 1 0 1 0 0 0 0 1 1
1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1
1 0 0 1 1 1 0 0 1 1 0 1 1 0 1 0 1 1 1 0
1 0 0 1 1 1 0 1 1 1 0 1 1 1 1 1 0 1 1 1
1 0 0 0 1 0 0 1 1 1 0 1 0 0 1 0 0 0 1 0
1 0 0 1 1 1 0 1 0 1 1 1 1 0 1 1 1 1 0 1
1 0 1 0 1 0 1 1 1 1 1 1 0 0 1 0 0 1 0 0
0 1 0 1 0 0 0 0 1 0 1 1 1 0 0 0 1 0 0 0

.asc format:
2 3 8 9 10 11 12 14 17
0 1 3 4 8 9 10 11 14 16 17
3 5 8 9 11 13 18 19
0 1 3 4 5 6 7 8 9 10 11 13 14 15 17 18 19
0 3 4 5 8 9 11 12 14 16 17 18
0 3 4 5 7 8 9 11 12 13 14 15 17 18 19
0 4 7 8 9 11 14 18
0 3 4 5 7 9 10 11 12 14 15 16 17 19
0 2 4 6 7 8 9 10 11 14 17
1 3 8 10 11 12 16
Figure 2: AFISA input formats
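
Judging from the sample above, each .ssv row appears to be a 0/1 vector with one column per item, and the corresponding .asc row appears to list the zero-based positions of the 1s. That reading is an inference from the sample data, not a documented specification. Under that assumption, a minimal sketch of converting one row between the two formats could look like this (the class SsvToAsc and the method toAscLine are hypothetical names and are not part of AFISA):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of AFISA: converts one .ssv row to one .asc row.
class SsvToAsc {

    // Assumption: an .ssv row is a whitespace-separated 0/1 vector, and the
    // matching .asc row lists the zero-based positions that hold a 1.
    static String toAscLine(String ssvLine) {
        String trimmed = ssvLine.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        List<String> items = new ArrayList<>();
        String[] flags = trimmed.split("\\s+");
        for (int i = 0; i < flags.length; i++) {
            if (flags[i].equals("1")) {
                items.add(Integer.toString(i));
            }
        }
        return String.join(" ", items);
    }

    public static void main(String[] args) {
        // First .ssv row of the sample shown in Figure 2.
        String row = "0 0 1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 1 0 0";
        System.out.println(toAscLine(row)); // prints "2 3 8 9 10 11 12 14 17"
    }
}
```

The sketch uses String.join and the diamond operator, so it needs a newer Java version than the JRE 1.5 that AFISA itself requires.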


Figure 3: AFISA screen shot


Note that D1 stands for the Jaccard-Tanimoto distance. The user can set the proximity; AFISA can create both overlapping frequent item sets and non-overlapping ones (clusters).
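
For reference, the Jaccard-Tanimoto distance between two finite sets A and B is commonly defined as 1 - |A ∩ B| / |A ∪ B|. The sketch below computes it for two sets of integers. Which objects AFISA actually compares with D1 is specified in the paper, so applying the distance to two sample rows here is purely illustrative, and the class and method names are hypothetical:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not AFISA's own code.
class JaccardTanimoto {

    // Returns 1 - |A ∩ B| / |A ∪ B|. Two empty sets are treated as identical
    // (distance 0), which is a convention chosen for this sketch.
    static double distance(Set<Integer> a, Set<Integer> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return 1.0 - (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // First two .asc rows of the sample input.
        Set<Integer> first = new HashSet<>(Arrays.asList(2, 3, 8, 9, 10, 11, 12, 14, 17));
        Set<Integer> second = new HashSet<>(Arrays.asList(0, 1, 3, 4, 8, 9, 10, 11, 14, 16, 17));
        // 7 shared items out of 13 distinct ones, so the distance is 1 - 7/13.
        System.out.println(distance(first, second));
    }
}
```

A distance of 0 means the two sets are identical and a distance of 1 means they share no elements.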


