Agglomerative Clustering

Selim Mimaroglu



 

 

This is a hierarchical clustering method that works bottom-up. Initially, each element is a cluster on its own. The algorithm then repeatedly merges the two closest clusters into one, until a single remaining cluster contains all the elements.
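The bottom-up procedure can be sketched in a few lines of Python. This is a minimal illustration and not the code shipped in the jar: the function name `agglomerate`, the tiny 3-element matrix, and the choice of the minimum pairwise distance as the cluster distance are all assumptions made for the example.

```python
def agglomerate(dist):
    """Repeatedly merge the two closest clusters; return the merge history.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Assumption: cluster distance is the minimum pairwise distance.
    """
    clusters = [[i] for i in range(len(dist))]  # every element starts alone
    merges = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # closest pair of elements across the two clusters
                d = min(dist[a][b] for a in clusters[x] for b in clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((sorted(clusters[x]), sorted(clusters[y]), d))
        clusters[x] = clusters[x] + clusters[y]  # merge y into x
        del clusters[y]
    return merges


# Hypothetical 3-element matrix, used only for illustration.
example = [
    [0, 1, 4],
    [1, 0, 2],
    [4, 2, 0],
]
history = agglomerate(example)
for left, right, d in history:
    print(left, right, d)
```

Each entry in the returned history records the two clusters that were merged and the distance at which the merge happened. The loop ends when one cluster holds every element.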

 

The distance between two clusters can be computed using one of the measures shown below:

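Minimum Distance (Single Link): $D_{\min}(C_i, C_j) = \min_{a \in C_i,\, b \in C_j} d(a, b)$
Maximum Distance (Complete Link): $D_{\max}(C_i, C_j) = \max_{a \in C_i,\, b \in C_j} d(a, b)$
Group Average: $D_{\mathrm{avg}}(C_i, C_j) = \frac{1}{|C_i|\,|C_j|} \sum_{a \in C_i} \sum_{b \in C_j} d(a, b)$

 

where $d$ is a distance function, $C_i$ is the ith cluster, $C_j$ is the jth cluster, $a$ is a member of $C_i$, and $b$ is a member of $C_j$.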

 

A distance matrix for 5 elements may look like the following:

 

 

        E1      E2      E3      E4      E5
E1      0       2.5     10.44   4.12    11.75
E2      2.5     0       12.5    6.4     13.93
E3      10.44   12.5    0       6.48    1.41
E4      4.12    6.4     6.48    0       7.35
E5      11.75   13.93   1.41    7.35    0

Fig 1: An example of a distance matrix.
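To see how the minimum distance, maximum distance, and group average measures differ on the matrix in Fig 1, the following sketch computes all three between the clusters {E1, E2} and {E3, E5}. The function names and the choice of these two clusters are assumptions made for illustration; they are not taken from the program itself.

```python
# Distance matrix from Fig 1; indices 0..4 stand for E1..E5.
D = [
    [0, 2.5, 10.44, 4.12, 11.75],
    [2.5, 0, 12.5, 6.4, 13.93],
    [10.44, 12.5, 0, 6.48, 1.41],
    [4.12, 6.4, 6.48, 0, 7.35],
    [11.75, 13.93, 1.41, 7.35, 0],
]


def pair_distances(dist, ci, cj):
    """All distances d(a, b) with a in cluster ci and b in cluster cj."""
    return [dist[a][b] for a in ci for b in cj]


def single_link(dist, ci, cj):
    """Minimum distance between any two members of the clusters."""
    return min(pair_distances(dist, ci, cj))


def complete_link(dist, ci, cj):
    """Maximum distance between any two members of the clusters."""
    return max(pair_distances(dist, ci, cj))


def group_average(dist, ci, cj):
    """Mean of all pairwise distances between the clusters."""
    pairs = pair_distances(dist, ci, cj)
    return sum(pairs) / len(pairs)


left = [0, 1]   # {E1, E2}
right = [2, 4]  # {E3, E5}
print(single_link(D, left, right))    # 10.44
print(complete_link(D, left, right))  # 13.93
print(group_average(D, left, right))
```

The four pairwise distances involved are 10.44, 11.75, 12.5, and 13.93, so the group average is approximately 12.155 and lies between the single-link and complete-link values.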

 

The input to the program must be a distance matrix whose values are separated by a single space. For example, the distance matrix shown above can be stored in a file distance.txt as shown below (note that there is exactly one space between consecutive values on each line):

 

0 2.5 10.44 4.12 11.75
2.5 0 12.5 6.4 13.93
10.44 12.5 0 6.48 1.41
4.12 6.4 6.48 0 7.35
11.75 13.93 1.41 7.35 0

Fig 2: A distance matrix input file which corresponds to Fig 1.
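Before handing a file to the program, it can be useful to check that it is well formed. The sketch below is a hypothetical standalone checker, not part of AgglomerativeClustering.jar: it parses text in the format of Fig 2 and verifies that the matrix is square, symmetric, and has a zero diagonal. Those three checks are assumptions based on the example matrix; the page does not state which properties the program itself enforces.

```python
# Contents of distance.txt as shown in Fig 2.
FIG2 = """0 2.5 10.44 4.12 11.75
2.5 0 12.5 6.4 13.93
10.44 12.5 0 6.48 1.41
4.12 6.4 6.48 0 7.35
11.75 13.93 1.41 7.35 0
"""


def parse_distance_matrix(text):
    """Parse single-space-separated rows into a list of lists of floats.

    Raises ValueError if a value is not a number (a doubled space yields
    an empty token, which float() rejects), or if the matrix is not
    square, not symmetric, or has a non-zero diagonal.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        rows.append([float(v) for v in line.strip().split(" ")])
    n = len(rows)
    for i, row in enumerate(rows):
        if len(row) != n:
            raise ValueError(f"row {i + 1} has {len(row)} values, expected {n}")
        if row[i] != 0:
            raise ValueError(f"diagonal entry in row {i + 1} is not 0")
    for i in range(n):
        for j in range(i + 1, n):
            if rows[i][j] != rows[j][i]:
                raise ValueError(f"entries ({i + 1},{j + 1}) and ({j + 1},{i + 1}) differ")
    return rows


matrix = parse_distance_matrix(FIG2)
print(len(matrix), "x", len(matrix[0]))
```

To check a file on disk, read it first, for example with `open("distance.txt").read()`, and pass the resulting string to `parse_distance_matrix`.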

 

 

 

 

Requirements:

JRE 1.5.0_09 or higher

 

How to Run:

java -jar AgglomerativeClustering.jar

 

Download:

AgglomerativeClustering.jar

 

Note:

Compiled code for research purposes only, NO COMMERCIAL USE

Disclaimer:

The software is provided on an *as is* basis for research purposes. There is no additional support offered, nor are the author(s) or their institutions liable under any circumstances.
