Self-tuning in Graph-based Reference Disambiguation.

Appeared in Int'l Conf. on Database Systems for Advanced Applications (DASFAA), 2007


Rabia Nuray-Turan, Dmitri V. Kalashnikov, and Sharad Mehrotra

Computer Science Department
University of California, Irvine
GDF project (http://www.ics.uci.edu/~dvk/GDF)

Abstract

Nowadays many data mining and analysis applications use graph analysis techniques for decision making. Many of these techniques are based on the importance of relationships among the interacting units. A number of models and measures that analyze relationship importance (link structure) have been proposed (e.g., centrality, importance, and PageRank). They are generally based on intuition: the analyst decides on a reasonable model that fits the underlying data. In this paper, we address the problem of learning such models directly from training data. Specifically, we study a way to calibrate a connection strength measure from training data in the context of the reference disambiguation problem. Experimental evaluation demonstrates that the proposed model surpasses the best model previously used for reference disambiguation, leading to better disambiguation quality.


Categories and Subject Descriptors:

H.2.m [Database Management]: Miscellaneous - Data cleaning;
H.2.8 [Database Management]: Database Applications - Data mining;
H.2.5 [Information Systems]: Heterogeneous Databases;
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval


Keywords:

Connection strength, data cleaning, entity resolution, graph analysis, reference disambiguation, relationship analysis, GDF, adaptiveness to data


Downloadable files:

Paper: DASFAA07_dvk.pdf
Presentation Slides: DASFAA07_dvk.ppt

BibTeX entry:

@inproceedings{DASFAA07::dvk,
   author    = {Rabia Nuray-Turan and Dmitri V.\ Kalashnikov and Sharad Mehrotra},
   title     = {Self-tuning in Graph-based Reference Disambiguation},
   booktitle = {Proc.\ of the 12th International Conference on Database Systems for 
                Advanced Applications (DASFAA 2007), {Springer LNCS}},
   year      = {2007},
   month     = {April 9--12},
   address   = {Bangkok, Thailand}
}


