Given a collection of n sites, a center point is a point (not necessarily a site) such that no hyperplane through the centerpoint partitions the collection into a very small and a very large subset. Center points have been used by Teng and others as a key step in the construction of geometric separators. One can find a point with this property by choosing a random sample of the collection and applying linear programming, but the complexity of that method grows exponentially with the dimension. This paper proposes an alternate method that produces lower quality approximations (in terms of the size of the worst hyperplane partition) but takes time polynomial in both n and d.
Uses geometric optimization techniques to find, among n weighted values, the k to drop so as to maximize the weighted average of the remaining values. The feasibility test for the corresponding decision problem involves k-sets in a dual line arrangement.
This paper shows how to use my dynamic closest pair data structure from "Dynamic Euclidean minimum spanning trees" for some non-geometric problems including hierarchical clustering, greedy matching, and TSP heuristics. Experiments show variants of my data structures to be faster than previously used heuristics.
(Source code and experimental data – SODA paper – JEA home page)
We show that, for any set of n points in d dimensions, there exists a hyperplane with regression depth at least ceiling(n/(d+1)). as had been conjectured by Rousseeuw and Hubert. Dually, for any arrangement of n hyperplanes in d dimensions there exists a point that cannot escape to infinity without crossing at least ceiling(n/(d+1)) hyperplanes. We also apply our approach to related questions on the existence of partitions of the data into subsets such that a common plane has nonzero regression depth in each subset, and to the computational complexity of regression depth problems.
We generalize regression depth to k-flats. The k=0 case gives the classical notion of center points. We prove that for any set of n points in Rd there always exists a k-flat with depth at least a constant fraction of n. As a consequence, we derive a linear-time (1+epsilon)-approximation algorithm for the deepest flat. The full version of this paper also includes results from "Computing the Depth of a Flat".
We compute the regression depth of a k-flat in a set of n d-dimensional points, in time O(nd-2), an order of magnitude faster than the best known algorithms for computing the depth of a point or of a hyperplane. The results from this conference paper have been merged into the full version of "Multivariate Regression Depth".
(SODA talk slides – SODA paper)
Surveys projective duality and its uses in algorithms for robust regression. The MSRI talk used the alternative title "Computational geometry and robust statistics" but contained essentially the same content.
(DIMACS talk slides – video and slides of MSRI talk)
Defines quasiconvex programming, a form of generalized linear programming in which one seeks the point minimizing the pointwise maximum of a collection of quasiconvex functions. Surveys algorithms for solving quasiconvex programs either numerically or via generalizations of the dual simplex method from linear programming, and describe varied applications of this geometric optimization technique in meshing, scientific computation, information visualization, automated algorithm analysis, and robust statistics.
(DIMACS talk slides – ALGO talk slides)
We describe an efficient streaming-model construction of epsilon-nets and epsilon-approximations, and use it to find deterministic streaming-model approximation algorithms for iceberg range queries and for various robust statistics problems.
Discusses a paper by Mizera and Müller on depth-based methods for simultaneously fitting both a center and a radius to a set of sample points, by viewing the points as lying on the boundary of a model of a higher dimensional hyperbolic space. Reformulates their method in combinatorial terms more likely to be familiar to computational geometers, and discusses the algorithmic implications of their work.
We find efficient constant factor approximation algorithms for hierarchically clustering of a point set in any metric space, minimizing the sum of minimimum spanning tree lengths within each cluster, and in the hyperbolic or Euclidean planes, minimizing the sum of cluster perimeters. Our algorithms for the hyperbolic and Euclidean planes can also be used to provide a pants decomposition with approximately minimum total length.
Publications – David Eppstein – Theory Group – Inf. & Comp. Sci. – UC Irvine
Semi-automatically filtered from a common source file.