September 9, 2009
Carey, Li Awarded NSF Grant to Develop Technologies for Storing and Analyzing Semi-Structured Data
Award is part of a $2.7 million grant across three University of California campuses
Michael Carey and Chen Li, professors of Computer Science at the University of California, Irvine’s Donald Bren School of Information and Computer Sciences, have been awarded a $1.8 million grant from the National Science Foundation’s Data-Intensive Computing Program. The project, entitled “ASTERIX: A Highly Scalable Parallel Platform for Semistructured Data Management and Analysis,” will research and develop new technologies for storing and analyzing semi-structured data.
Semi-structured data is a type of data that does not conform to the formal structure of tables and data models associated with databases. It does, however, contain tags or other markers that separate semantic elements and define hierarchies of records and fields within the data. The amount of semi-structured data is increasing rapidly as the Internet has made possible information sets that go beyond traditional full-text documents and databases.
“The evolution of the ‘human Web’, powered by HTML and HTTP, has revolutionized the way that people find information, buy things, communicate, and collaborate,” says Carey. “Web services and semi-structured data formats are having a similar impact on the ‘machine Web’.”
“Semi-structured data has been widely used in many popular Web services such as Google Maps and eBay,” says Li.
The funds are a part of a larger $2.7 million grant to be distributed over three years across three UC campuses. Carey and Li will be collaborating with Yannis Papakonstantinou and Alin Deutsch of UC San Diego and Vassilis Tsotras of UC Riverside. Previously, Carey and Li were awarded $132,000 in seed funding in support of ASTERIX from a UC Discovery Grant and eBay.
Carey, a Bren Professor in Information and Computer Sciences, has research interests in database systems, information integration, service-oriented computing, middleware, distributed systems, and computer system performance evaluation.
Li's research interests are in the fields of databases and information systems, including text search, data cleansing, data integration, and distributed data systems.