Test of Closest Pair Data Structures

Hierarchical Clustering in a 31-dimensional Fractal

Brute Force Neighbors Quadtree Conga Line MultiConga FastPair
n = 250 12.71s 0.67s 0.52s 2.05s 0.68s 0.59s
500 107.90s 3.18s 2.51s 10.79s 3.03s 2.72s
1000 926.06s 14.38s 11.18s 55.67s 13.62s 12.41s
2000   61.26s   278.97s 64.07s 56.79s
4000   244.23s   1227.56s 269.56s 233.05s
8000   1014.02s   5354.00s 1128.76s 972.92s
16000   4492.64s     4624.10s 4152.42s

Clusters are combined by unweighted medians. Points were placed uniformly at random in the 31-dimensional generalized Sierpinski tetrahedron (formed by choosing 5 random binary values, taking bitwise exclusive ors of each nonempty subset, and scaling into the range from 0 to 1). Times include only the construction of the closest pair data structure and algorithm execution (not the initial point placement) and are averages over ten runs. The quadtree data structure was only run on data sets of 1000 or fewer points due to its high storage requirements. Code was written in C++, compiled and optimized by Metrowerks Codewarrior 10, and run on a 200MHz PowerPC 603e processor (Apple Powerbook 3400c).