 # Research in Algorithms and Theory of Computation at UC-Irvine

The goal of research in theoretical computer science is to produce results, supported by rigorous proof, about problems dealing with computers and their applications. The questions to be investigated are often motivated by practical problems, but the goal of understanding the fundamental structure of the problem is often as important as producing a solution of immediate applicability. Despite this emphasis, it turns out that results that first might appear to be only of theoretical value are sometimes of profound relevance to practical problems.

In particular, one of the major subareas of theoretical computer science, and the one pursued by the faculty and graduate students at UCI, is concrete complexity: We look at specific problems and try to determine the complexity (i.e., the amount of resources required) for obtaining a solution. Our work falls into three main areas: design of algorithms and data structures; analysis; problem complexity.

## Design of Algorithms and Data Structures

Given a problem, we try to find efficient solution methods. A data structure is a way of organizing information; sometimes the design of an appropriate data structure can be the foundation for an efficient algorithm, and we have made a number of significant contributions to the field of data structures. In addition, one of our members has written a widely used and respected text on data structures, and is presently completing a second more introductory text.

In addition to the design of new data structures, we are also interested in efficient algorithms for problems arising in a variety of fields. Often such problems can be represented in terms of trees, graphs, or strings, and we are interested in the design of efficient solutions for such problems.

The field of computational geometry investigates the complexity of problems involving two-dimensional (or higher) spaces. This is an active research area which has not only theoretical depth but also practical applications in areas such as pattern recognition, VLSI layout, statistics, and image processing. One major area of our work is the investigation of certain properties of geometric constructs which can be modeled by graphs. We have also explored how solutions to geometric problems such as linear programming or the minimum spanning tree can be made dynamic, i.e., how we can efficiently maintain the solution when the input data are subject to change.

Also of interest is the compression of data. For example, we have reduced the complexity of algorithms for compressing strings, and have also investigated the compression of structures such as quadtrees which are used for storing spatial data.

Current work in genetics provides an exciting application area for algorithms. Some work done long ago by our present faculty, on longest common subsequences and on PQ-trees, has turned out to be valuable in solving problems that arise in genetics. More recently, one of our faculty has introduced sophisticated new methods for speeding the solution of problems such as DNA sequence comparison.

Much of our work has dealt with the fast solution of problems by a single processor. The combination of declining cost of processors and the desire for fast solutions to problems has led to a great deal of interest in the use of parallelism to speed up computation. One natural question is thus: how long does it take to solve a given problem with a limited number of parallel processors? Some of us have been especially interested in solving problems on graphs very quickly without using an excessive number of processors.

## Analysis

Once a solution method has been proposed, we seek to find a rigorous statement about its efficiency; analysis of algorithms can go hand-in-hand with their design, or can be applied to known algorithms. Some of this work is motivated in part by the theory of NP-completeness, which strongly suggests that certain problems are just too hard to solve exactly and efficiently all of the time. It may be, though, that the difficult cases are relatively rare, so we attempt to investigate the behavior of problems and algorithms under assumptions about the distribution of inputs.

Our group at UCI has made major contributions in the area of probabilistic analysis. We have done work in algorithms for problems such as packing, partitioning, marking algorithms, and hashing. In particular, we have obtained a surprising result about the behavior of a well known marking algorithm, and an elegant analysis of double hashing.

Probability can provide a powerful tool even when we do not assume a probability distribution of inputs. In an approach called randomization, one can introduce randomness into the algorithm itself so that even on on worst-case input it works well with high probability. For example, for the classical List Update Problem, which involves organizing data so that we can perform searches efficiently, one of our faculty has shown how to use randomization to beat the inherent limit of a deterministic approach.

An area of considerable recent interest is on-line algorithms. Here we investigate the performance of algorithms which must provide answers based on part of the input before the remainder is available. A good example is memory paging---when swapping, the system must decide which memory pages to keep in its cache before it sees which ones will actually be used later. Earlier analysis of this problem had not been fully successful in explaining why a common heuristic performs so well. One of our faculty developed a new approach which formally models locality of reference, and thus can better explain the performance of paging algorithms.

## Problem complexity

When efficient solutions appear difficult, negative results can sometimes provide very helpful guidance. Two major types of results are possible here.

• In some cases one can actually prove that, under some model, the problem does not admit solution without a certain level of resources.
• For many problems, good bounds of the above type are not available, but the problem can be shown to be equivalent in complexity to some well-known class of problems. For example, if a problem is NP-complete it cannot be solved in polynomial time unless P=NP, which is a major open question.

Such results can save wasted effort by researchers, and in some cases might also suggest that algorithms from a different model should be considered.

Department of Computer Science
University of California, Irvine, CA 92697-3425