# ICS 269, Spring 2005: Theory Seminar

## 10 June 2005:

Location: CS 432

Speaker: Jeremy Meng

This talk will present the paper
"No sorting? Better Searching!", by Gianni Franceschini and Roberto Grossi, FOCS 2004.

Sorting is commonly meant as the task of arranging keys in increasing or
decreasing order (or small variations of this order). Given n keys
underlying a total order, the best organization in an array is maintaining
them in sorted order. Searching requires \Theta(log n) comparisons
in the worst case, which is optimal. We demonstrate that this basic fact
in data structures does not hold for the general case of multidimensional
keys, whose comparison cost is proportional to their length. In two papers
by Andersson et al. (1994) and Andersson et al. (1995) and the full
version in 2001, Andersson et al. study the complexity of searching a
sorted array of n keys, each of length k, arranged in lexicographic (or
alphabetic) order for an arbitrary, possibly unbounded, ordered alphabet.
They give sophisticated arguments for proving a tight bound in the worst
case for this basic data organization, up to a constant factor, obtaining
\Theta(((k log log n)/(log log (4 + ((k log log n)/log n)))) + k + log
n) character comparisons (or probes). Note that the bound is \Theta(log n)
when k = 1, which is the case that is well known in algorithmics.
We describe a permutation of the n keys that is different from the sorted
order, and sorting is just the starting point for describing our
preprocessing. When keys are stored according to this "unsorted" order in
the array, the complexity of searching drops to \Theta(k + log n)
character comparisons (or probes) in the worst case, which is optimal
among all possible permutations of the n keys in the array, up to a
constant factor. Again, the bound is \Theta(log n) when k = 1.
Jointly with the aforementioned result of Andersson et al., our finding
provably shows that keeping k-dimensional keys sorted in an array is not
the best data organization for searching. This fact was not observable
before by just considering k = O(1) as sorting is an optimal organization
in this case. Further implications of our result are discussed in the
paper's introduction.
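
To make the cost model in the abstract concrete, here is a minimal Python sketch of the naive baseline: plain binary search over a lexicographically sorted array of n keys of length k, counting character comparisons (probes). It is not the algorithm of Andersson et al., nor the permutation of Franceschini and Grossi. It only illustrates why comparing whole keys at every step can cost up to about k log n probes, the figure that both results improve on. The function names and the sample keys are assumptions chosen for illustration.

```python
def compare_keys(a, b, counter):
    """Compare two equal-length keys lexicographically.

    Every character inspected counts as one probe in counter[0].
    Returns -1, 0, or 1.
    """
    for x, y in zip(a, b):
        counter[0] += 1
        if x != y:
            return -1 if x < y else 1
    return 0


def binary_search(sorted_keys, target):
    """Plain binary search; returns (index or -1, number of probes)."""
    counter = [0]
    lo, hi = 0, len(sorted_keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        c = compare_keys(target, sorted_keys[mid], counter)
        if c == 0:
            return mid, counter[0]
        if c < 0:
            hi = mid - 1
        else:
            lo = mid + 1
    return -1, counter[0]


# Hypothetical adversarial input: n = 16 keys of length k = 8 that share
# a common prefix of k - 1 characters, so each key comparison scans all k.
k, n = 8, 16
keys = ["a" * (k - 1) + chr(ord("a") + i) for i in range(n)]
idx, probes = binary_search(keys, keys[-1])
print(idx, probes)
```

On this input the search for the last key performs 5 key comparisons of 8 characters each, so the probe count grows as the product of k and log n rather than as the sum k + log n that the talk's result achieves.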