CS 269S, Fall 2010: Theory Seminar
22 Oct 2010:

Data Specific Analysis of String Sorting

Saumi Bandyopadhyay

We consider the complexity of sorting strings in the model that counts comparisons between symbols and not just comparisons between strings. We show that for any set of strings S the complexity of sorting S can naturally be expressed in terms of the trie induced by S. This holds not only for lower bounds but also for the running times of various algorithms. Thus this “data specific” analysis allows a direct comparison of different algorithms running on the same data. We give such “data-specific” analyses for various versions of quicksort and versions of mergesort. As a corollary we arrive at a very simple analysis of quicksorting random strings, which so far required rather sophisticated mathematical tools. As part of this we provide insights in the analysis of tries of random strings which may be interesting in their own right.

(Based on a paper by Raimund Seidel at SODA 2010.)