Chen Li

Associate Professor

Department of Computer Science, University of California, Irvine, CA 92697-3435

chenli @ ics.uci.edu, Office: 949-824-9470, Fax: 949-824-4056

http://www.ics.uci.edu/~chenli

 

Education

 

1996 – 2001

Stanford University

CA

Ph.D. Computer Science

Thesis: Query Processing and Optimization in Information-Integration Systems.

Advisor: Professor Jeffrey D. Ullman

 

1994 – 1996

Tsinghua University

Beijing, China

M.S. Computer Science

Dissertation: A Communication Interface based on SCI.

Advisor: Professor Meiming Shen.

 

1989 – 1994

Tsinghua University

Beijing, China

B.S., Computer Science. Minor: Enterprise Management

 

Research Interests

 

 

Databases, information systems, search, data integration, data warehousing, data sharing, data quality, data privacy, web-data management, XML

Experience

 

7/2007 – current

UCI, Department of Computer Science

Irvine, CA

Associate Professor

Research and teaching in the areas of databases and information systems.

 

2001 – 7/2007

UCI, Department of Computer Science

Irvine, CA

Assistant Professor

Research and teaching in the areas of databases and information systems.

 

9/2005–12/2007

Google

Santa Monica/Irvine, CA

Visiting Research Scientist        

Full time before 12/31/2005, part time after 1/1/2006.

 

1996 – 2001

Stanford University, Computer Science

CA

Graduate Student Researcher

Focused on database research, including data integration, web databases, data warehouses, and multimedia databases.

 

2000

IBM Almaden Research Center

CA

Supplemental Research Associate

Conducted research on a web-based database system.

 

2000

Common Object Inc.

San Mateo, CA

Consultant

Participated in the design and implementation of a corporate database system using data-integration technologies.

 

1998

Hewlett Packard Labs

Palo Alto, CA

Research Intern

Ported a well-known main-memory database system to a shared-memory distributed system.

 

1995

Chinese University of Hong Kong

Hong Kong

Student Visitor

Studied how to reduce communication cost using a simplified transfer protocol in parallel processing.

 

1993 – 1996

Tsinghua University

Beijing, China

Research Assistant

Conducted research on parallel and distributed processing and high-performance networking.

 

 

Publications

 

Journal Articles:

J1.          Answering Queries with Useful Bindings. Chen Li and Edward Chang. ACM Transactions on Database Systems (TODS), Volume 26, Issue 3, September 2001.

J2.          Clustering for Approximate Similarity Search in High-Dimensional Spaces. Chen Li, Edward Chang, Hector Garcia-Molina, and Gio Wiederhold. IEEE Transactions on Knowledge and Data Engineering, Volume 14, Number 4, pp. 792-808, July/August 2002.

J3.          Computing Complete Answers to Queries in the Presence of Limited Access Patterns. Chen Li. VLDB J. 12(3): 211-227 (2003).

J4.          Recent Progress on Selected Topics on Database Research -- A Report from Nine Young Chinese Researchers Working in the United States. Zhiyuan Chen, Chen Li, Jian Pei, Yufei Tao, Haixun Wang, Wei Wang, Jiong Yang, Jun Yang, and Donghui Zhang. Journal of Computer Science and Technology, Volume 18, Issue 5, Pages 538 – 552, September 2003.

J5.          Answering Queries Using Materialized Views with Minimum Size, Rada Chirkova, Chen Li, and Jia Li, VLDB Journal (2006), Volume 15, Number 3, 191-210.

J6.          Achieving Communication Efficiency through Push-Pull Partitioning of Semantic Spaces in Client-Server Architectures, Amitabha Bagchi, Amitabh Chaudhary, Michael T. Goodrich, Chen Li, and Michal Shmueli-Scheuer. TKDE, October 2006 (Vol. 18, No. 10).

J7.          Supporting Efficient Record Linkage for Large Data Sets Using Mapping Techniques, Liang Jin, Chen Li, and Sharad Mehrotra, World Wide Web Journal, Volume 9, Issue 4, 557-584, 2006.

J8.          Rewriting Queries Using Views in the Presence of Arithmetic Comparisons, Foto Afrati, Chen Li, and Prasenjit Mitra, Theoretical Computer Science, Volume 368, Numbers 1-2, pages 88-123, 2006.

J9.          Using Views to Generate Efficient Evaluation Plans for Queries, Foto Afrati, Chen Li, and Jeff Ullman, Journal of Computer and System Sciences.

J10.       SEPIA: Estimating Selectivities of Approximate String Predicates in Large Databases, Liang Jin, Chen Li, and Rares Vernica, VLDB Journal, Volume 17, Issue 5, pages 1213-1229, August 2008.

                                           

Peer-reviewed Conference Full Papers:

 

C1.        Performance Analysis of the Communication Mechanism for POE Workstation Cluster. Weiqiang Zhuang, Chen Li, Meiming Shen. Microcomputer & Micro-system, Jan, 1995.

C2.        2D BubbleUp: Managing Parallel Disks for Media Servers. Edward Chang, Hector Garcia-Molina, and Chen Li. Proc. of International Conference on Foundations of Data Organization (FODO'98), pages 221-230, Kobe, Japan, 1998.

C3.        RIME: A Replicated Image Detector for the World-Wide Web. Edward Chang, James Ze Wang, Chen Li, and Gio Wiederhold. Proceedings of SPIE Symposium of Voice, Video, and Data Communications, pages 58-67, Boston, MA, November 1998.

C4.        Searching Near-Replicas of Images via Clustering. Edward Chang, Chen Li, James Wang, Peter Mork, and Gio Wiederhold. Proc. of SPIE Symposium of Voice, Video, and Data Communications, Multimedia Storage and Archiving Systems VI, pages 281-292, Boston, MA, September, 1999.  

C5.        Optimizing Large Join Queries in Mediation Systems. Ramana Yerneni, Chen Li, Jeffrey Ullman, Hector Garcia-Molina. International Conference on Database Theory (ICDT), Jerusalem, Israel, January, 1999. (29% accepted)  

C6.        Computing Capabilities of Mediators. Ramana Yerneni, Chen Li, Hector Garcia-Molina, Jeffrey Ullman. SIGMOD'99, Philadelphia, PA, May 1999. (20% accepted)  

C7.        Query Planning with Limited Source Capabilities. Chen Li and Edward Chang. Proc. of International Conference on Data Engineering (ICDE), pages 401-412, San Diego, CA, February, 2000. (14% accepted)

C8.        On Answering Queries in the Presence of Limited Access Patterns. Chen Li and Edward Chang. Proc. of International Conference on Database Theory (ICDT), pages 219-233, London, UK, January, 2001.   (35% accepted)

C9.        Minimizing View Sets without Losing Query-Answering Power. Chen Li, Mayank Bawa, and Jeff Ullman. Proc. of International Conference on Database Theory (ICDT), pages 99-113, London, UK, January, 2001. (35% accepted)

C10.     Generating Efficient Plans for Queries Using Views. Foto Afrati, Chen Li, and Jeff Ullman. Proc. of ACM SIGMOD, pages 319 - 330, Santa Barbara, CA, May, 2001. (15% accepted) 

C11.     Answering Queries Using Views with Arithmetic Comparisons. Foto Afrati, Chen Li, and Prasenjit Mitra. In Proc. of ACM PODS, pages 209-220, June 2002, Madison, Wisconsin. (22% accepted) 

C12.     Executing SQL over Encrypted Data in the Database-Service-Provider Model. Hakan Hacigumus, Bala Iyer, Chen Li, and Sharad Mehrotra. In Proc. of ACM SIGMOD, pages 216-227, June 2002, Madison, Wisconsin. (18% accepted)

C13.     Efficient Record Linkage in Large Data Sets, Liang Jin, Chen Li, and Sharad Mehrotra, Proc. of the 8th International Conference on Database Systems for Advanced Applications (DASFAA), pages 137-146, March, 2003, Kyoto, Japan. (33% accepted)

C14.     Materializing Views with Minimal Size to Answer Queries. Rada Chirkova and Chen Li. Proc. of ACM PODS, pages 38-48, San Diego, CA, June 2003. (20% accepted).

C15.     On Containment of Conjunctive Queries with Arithmetic Comparisons. Foto Afrati, Chen Li, Prasenjit Mitra. Proc. of EDBT’04, pages 459-476, Crete, Greece, March 2004 (14% accepted).

C16.     NNH: Improving Performance of Nearest-Neighbor Searches Using Histograms. Liang Jin, Nick Koudas, Chen Li. Proc. of EDBT’04, pages 385-402, Crete, Greece, March 2004. (14% accepted).

C17.     Secure XML Publishing without Information Leakage in the Presence of Data Inference. Xiaochun Yang and Chen Li, Proc. of VLDB'04, pages 96-107, August 29 -- September 3, Toronto, Canada. (16% accepted).

C18.     Indexing Mixed Types for Approximate Retrieval, Liang Jin, Nick Koudas, Chen Li, Anthony K.H. Tung. VLDB 2005. (16% accepted).

C19.     Selectivity Estimation for Fuzzy String Predicates in Large Data Sets, Liang Jin and Chen Li. VLDB 2005. (16% accepted).

C20.     Relaxing Join and Selection Queries. Nick Koudas, Chen Li, Anthony Tung, and Rares Vernica. VLDB 2006, Seoul, Korea, 2006. (13.2% accepted)

C21.     Supporting Approximate Similarity Queries with Quality Guarantees in P2P Systems, Qi Zhong, Iosif Lazaridis, Mayur Deshpande, Chen Li, Sharad Mehrotra, Hal Stern, COMAD 2006, December 14-16, 2006, Delhi, India. (26% accepted)

C22.     Protecting Individual Information Against Inference Attacks in Data Publishing, Chen Li, Houtan  Shirani-Mehr, and Xiaochun  Yang. DASFAA 2007. (18.7% accepted)

C23.     VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams, Chen Li, Bin Wang, and Xiaochun Yang. VLDB 2007.

C24.     Data Exchange with Arithmetic Comparisons, Foto Afrati, Chen Li, and Vassia Pavlaki. EDBT 2008.

C25.     Efficient Merging and Filtering Algorithms for Approximate String Searches, Chen Li, Jiaheng Lu, Yiming Lu, ICDE 2008, 257-266.

C26.     Supporting Keyword Queries on Structured Databases with Limited Search Interfaces, Nurcan Yuruk, Xiaowei Xu, Chen Li, Jeffrey Xu Yu, DASFAA 2008.

C27.     Cost-Based Variable-Length-Gram Selection for String Collections to Support Approximate Queries Efficiently, Xiaochun Yang, Bin Wang, and Chen Li, ACM SIGMOD 2008.

C28.     Space-Constrained Gram-Based Indexing for Efficient Approximate String Search, Alexander Behm, Shengyue Ji, Chen Li, and Jiaheng Lu, ICDE 2009, to appear.

C29.     Best-Effort Top-k Query Processing Under Budgetary Constraints, Michal Shmueli-Scheuer, Chen Li, Yosi Mass, Haggai Roitman, Ralf Schenkel, and Gerhard Weikum, ICDE 2009, to appear.

 

Peer-reviewed Workshop Publications and Demos (appeared in Proceedings):

W1.       HiComm -- A New Technique for Improving Communication Performance in Workstation Cluster. Chen Li, Weiqiang Zhuang, Meiming Shen, Dingxing Wang, Weimin Zheng, APPT'95, October, 1995, Beijing, China.

W2.       Capability Based Mediation in TSIMMIS. Chen Li, Ramana Yerneni, Vasilis Vassalos, Hector Garcia-Molina, Yannis Papakonstantinou, Jeffrey Ullman, Murty Valiveti. Proc. of ACM SIGMOD, Demo track, pages 564 – 566, Seattle, WA, June, 1998.

W3.       Towards Perception-Based Image Retrieval. Edward Chang, Beitao Li, and Chen Li. Proceedings of IEEE Workshop on Content-based Access of Image and Video Libraries, pages 401-412, South Carolina, June, 2000.  

W4.       Answering Queries with Database Restrictions (Research Summary). Chen Li. Symposium on Abstraction, Reformulation and Approximation (SARA), pages 328 – 329, July 2000, Horseshoe Bay (Lake LBJ), Texas.

W5.       Describing and Utilizing Constraints to Answer Queries in Data-Integration Systems. Chen Li. Online Proc. of IJCAI 2003 workshop on Information Integration on the Web, pages 163-168, August 2003, Acapulco, Mexico.

W6.       Schema-Guided Wrapper Maintenance for Web-Data Extraction. Xiaofeng Meng, Dongdong Hu, Chen Li. Proc. of International Workshop on Web Information and Data Management (WIDM), pages 1-8, New Orleans, Louisiana, Nov. 2003. (28% accepted)

W7.       A Supervised Visual Wrapper Generator for Web-Data Extraction. Xiaofeng Meng, Haiyan Wang, Dongdong Hu, Chen Li. Proc. of the 28th Annual International Computer Software and Applications Conference (COMPSAC), pages 657 – 662, Dallas, TX, Nov. 2003.

W8.       RACCOON: A Peer-Based System for Data Integration and Sharing. Chen Li, Jia Li, Qi Zhong. Proc. of ICDE'2004, page 852, Boston, MA, March 2004, demo track. (58% accepted)

W9.       XGuard: A System for Publishing XML Documents without Information Leakage in the Presence of Data Inference. Proc. of ICDE'2005, Demo track, Tokyo, Japan, March 2005.

W10.    Answering Aggregation Queries on Hierarchical Web Sites Using Adaptive Sampling. Foto Afrati, Paraskevas Lekeas, and Chen Li. Technical Report, UCI ICS, August 2005. A short version appears in CIKM'2005, 31st October - 5th November, 2005, Bremen, Germany.

W11.    Quality-Driven Approximate Methods for GIS Data Integration. Ramaswamy Hariharan, Michal Shmueli-Scheuer, Chen Li, and Sharad Mehrotra. ACM GIS 2005, November 4-5, 2005, Bremen, Germany.

W12.    Communication-Efficient Query Answering with Quality Guarantees in Client-Server Applications. Michal Shmueli-Scheuer, Amitabh Chaudhary, Avigdor Gal, Chen Li. WebDB 2007.

W13.    Quality-Aware Retrieval of Data Objects from Autonomous Sources for Web-Based Repositories, Houtan Shirani-Mehr, Chen Li, Gang Liang, Michal Shmueli-Scheuer, to appear in ICDE 2008 as a poster.  

 

Technical Magazines:

M1.      Using Constraints to Describe Source Contents in Data Integration Systems. Chen Li, IEEE Intelligent Systems 18(5): 49-53 (2003).

 

Book Chapters:

B1.        Managing Parallel Disks for Continuous Media Data. Edward Chang, Chen Li, and Hector Garcia-Molina. Book chapter in Information Organization & Databases, pp. 107-120, Kluwer Publisher, 2000.

 

Technical Reports:

 

Other publications:

O1.        Query Processing and Optimization in Information-Integration Systems. Chen Li. Ph.D. Thesis, Computer Science Department, Stanford University, August, 2001.

O2.        Report of the Workshop on Data Mining in the Internet Age, May 1 - 2, 2000, IBM Almaden Center, San Jose, California.

 

 

 

Professional activities

 

o        Recent Program Member: VLDB 2008, WWW 2008, ICDE 2008, KDD 2007, SIGMOD 2007, ICDT 2007, WebDB 2007, KDD 2006, CIKM 2006, VLDB 2005

o        Program Member and Proceedings Chair, ACM PODS 2005.

o        Program Member of Demonstrations Track, ACM SIGMOD 2005.

o        Program Member: DASFAA 2005.

o        Program Member:  The Third International Semantic Web Conference, ISWC 2004.

o        Program Member, CIKM 2004.

o        Program Member: The Sixth Asia Pacific Web Conference, 2004.

o        Program Member: the 9th International Conference on Database Systems for Advanced Applications (DASFAA 2004), Seogwipo KAL Hotel, Jeju Island, Korea.

o        Program Member: The Fourth International Conference on Web-Age Information Management (WAIM 2003) China, 2003.

o        Program Member: the first workshop on Semantics in Peer-to-Peer and Grid Computing at the Twelfth International World Wide Web Conference, 20 May 2003, Budapest, Hungary.

o        Panelist of a review panel of the NSF Division of Information and Intelligent Systems, 2003.

o        Demo/Industry Chair, committee member, and Session Chair of The Third International Conference on Web-Age Information Management (WAIM 2002), August, Beijing, China.

o        Panelist of Advances on Web-age Database Technologies: An International Forum (WDBT'02), Beijing, China

o        Organizer of Cal-It(2) Workshop for Crisis Response, Tuesday, March 19th, 2002, Beckman Conference Center, UC Irvine.

 

 

Invited talks

o        Answering Approximate Queries Efficiently, seminars at University of Toronto, University of Waterloo, and York University, Canada.

o        Answering Approximate Queries Efficiently, seminars at SRI and Yahoo!, August 2006.

o        Supporting Approximate String Matching, Talk at Google, December, 2005.

o        Answering Queries with Fuzzy String Predicates, Seminar at University of Washington, July 2005.

o        Answering Queries with Fuzzy String Predicates, Seminar at UCSB, June 2005.

o        Seminars at Tsinghua University, Beijing, China, August 2004.

o        Seminar at Microsoft Research Asia, Beijing, China, August 2004.

o        Seminar at National University of Singapore, July 2004.

o        Seminar at Hong Kong University of Science and Technology, July 2004.

o        Seminar at UCSD, May 2004.

o        CRITO, UCI, April 2003.

o        Seminar at ISI, Southern California, January, 2003.

o        Seminar at Wayne State University, January, 2003.

o        Speaker at the Advances on Web-age Database Technologies forum (WDBT), Peking University, Beijing, China, August, 2002.

o        Seminar at University of California, Los Angeles, November, 2001.

 

 

Affiliated Ph.D. students

o        Sattam Mubark Alsubaiee, PhD student, ICS, UCI

o        Alex Behm, PhD student, ICS, UCI

o        Minh Doan, PhD student, ICS, UCI

o        Shengyue Ji, PhD student, ICS, UCI

o        Michal Shmueli-Scheuer, Ph.D. student, ICS, UCI

o        Rares Vernica, Ph.D. student, ICS, UCI

 

 

Awards and Funding

 

o        2008, Research Funding Award of the “Research Funds for Oversea Scholars” program of the National Natural Science Foundation of China.

o        2007, NSF Grant titled “SGER: Answering Approximate String Queries Using Variable-Length Grams”

o        2006, ICS Ted & Janice Smith Faculty Seed Fund

o        2006, Google Research Award, $37.5K, renewable for a 2nd year.

o        2006: Gift Fund from Microsoft Research ($7K).

o        2005: UCI Career Development Award.

o        2003 NSF CAREER Award, $400K, PI, starting 9/1/2003, for 5 years.

o        2003 NSF ITR Award 0331707, “Responding to the Unexpected,” $12.5M for 5 years, Senior Investigator, starting 10/1/2003.

o        Single Investigator Innovation Grant: CORCLR, UC Irvine, May 2003.

o        Stanford Graduate Fellowship, Stanford University, 1997 – 2001.

o        Merit-Based Scholarships, Tsinghua University, China, 1994 – 1996.

o        Outstanding Student Award, Tsinghua University, June 1994.

o        Entrance exams waived, Tsinghua University, 1994 and 1989.