Evaluating probabilistic queries over imprecise data.

Appeared in ACM SIGMOD 2003 Conference.

Reynold Cheng, Dmitri V. Kalashnikov, and Sunil Prabhakar

Department of Computer Sciences
Purdue University
PLACE project (http://www.cs.purdue.edu/place/)

Abstract

Many applications employ sensors for monitoring entities such as temperature and wind speed. A centralized database tracks these entities to enable query processing. Due to continuous changes in these values and limited resources (e.g., network bandwidth and battery power), it is often infeasible to store the exact values at all times. A similar situation exists for moving object environments that track the constantly changing locations of objects. In this environment, it is possible for database queries to produce incorrect or invalid results based upon old data. However, if the degree of error (or uncertainty) between the actual value and the database value is controlled, we can place more confidence in the answers to queries. More generally, query answers can be augmented with probabilistic estimates of the validity of the answers. In this paper we study probabilistic query evaluation based upon uncertain data. A classification of queries is made based upon the nature of the result set. For each class, we develop algorithms for computing probabilistic answers. We address the important issue of measuring the quality of the answers to these queries, and provide algorithms for efficiently pulling data from relevant sensors or moving objects in order to improve the quality of the executing queries. Extensive experiments are performed to examine the effectiveness of several data update policies.
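To make the idea of a probabilistic answer concrete, here is a minimal sketch of a range query over uncertain sensor readings. It is not the paper's algorithm. It assumes each stored value is known only to lie in a closed interval and that the true value is uniformly distributed over that interval. The names UncertainValue, range_probability, and probabilistic_range_query are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainValue:
    """A reading known only to lie in [low, high] (hypothetical model)."""
    name: str
    low: float
    high: float


def range_probability(item, q_low, q_high):
    """Probability that the true value falls in [q_low, q_high].

    Assumption: the true value is uniformly distributed over
    [item.low, item.high]; other pdfs would need a real integral.
    """
    width = item.high - item.low
    if width <= 0:
        # Degenerate interval: the value is known exactly.
        return 1.0 if q_low <= item.low <= q_high else 0.0
    overlap = min(item.high, q_high) - max(item.low, q_low)
    return max(0.0, overlap) / width


def probabilistic_range_query(items, q_low, q_high):
    """Return (name, probability) pairs for items with nonzero probability."""
    answers = []
    for item in items:
        p = range_probability(item, q_low, q_high)
        if p > 0:
            answers.append((item.name, p))
    return answers


# Example data: three temperature sensors with different uncertainty intervals.
readings = [
    UncertainValue("s1", 20.0, 30.0),
    UncertainValue("s2", 28.0, 32.0),
    UncertainValue("s3", 40.0, 45.0),
]
result = probabilistic_range_query(readings, 25.0, 35.0)
```

Under these assumptions, a sensor whose interval lies entirely inside the query range is returned with probability 1, a partially overlapping sensor gets a fractional probability, and a sensor whose interval misses the range is omitted. Narrowing an interval, for example by requesting a fresh reading, moves its probability toward 0 or 1.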


Keywords:

Sensor databases, querying imprecise data, uncertainty, uncertainty region, handling uncertainty, models of uncertainty, aggregate query, entity-based query, value-based query, probabilistic queries, quality of probabilistic answer, update heuristics, location-aware computing.


Downloadable files:

Paper: SIGMOD03_dvk.pdf (close to the final version)
Slides: SIGMOD03.ppt
See also a moving-object environment solution to a similar problem.


BibTeX entry:

@inproceedings{SIGMOD03::dvk,
   author    = {Reynold Cheng and Dmitri V. Kalashnikov and Sunil Prabhakar},
   title     = {Evaluating probabilistic queries over imprecise data},
   booktitle = {Proc. of ACM SIGMOD Int'l Conf. on Management of Data (ACM SIGMOD 2003)},
   year      = {2003},
   month     = {June 9--12},
   address   = {San Diego, CA, USA}
}

