XMLCB, XML Query Optimization Using Chase and Backchase

XML is becoming the principal medium for data exchange over the Web, and for information integration in general. Increasing amounts of public and private data are described in XML, while more legacy sources (e.g., relational databases) offer public XML views. Many of the applications that have emerged with the growth of XML on the Web are feasible only with new and complex query optimization techniques.

The goal of this research project is to develop a "chase & backchase" optimization method for XML queries. Based on chasing with constraints and incorporating cost-based optimization, the method brings together strategies such as use of indexes, use of materialized views, semantic optimization and join/scan minimization, allowing optimizations that depend on non-trivial interactions between these strategies. Particular attention is given to the challenges posed by XML document order and by regular path expressions in queries.
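
As a rough illustration of the "chase" half of the method, the sketch below chases a small relational conjunctive query with one constraint that describes a materialized view. It is a simplified sketch under stated assumptions, not the project's implementation: it handles only constraints without existential variables, it ignores XML document order and regular path expressions, and all names in it (`chase`, `homomorphisms`, the relations `R`, `S` and the view `V`) are hypothetical.

```python
# Minimal chase sketch for conjunctive queries (hypothetical names throughout).
# An atom is a tuple: (relation_name, term, term, ...).
# In constraints, variables are strings starting with "?".
# Terms in the query body are treated as fixed symbols.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(atom, fact, subst):
    """Extend subst so that atom maps onto fact; return None on failure."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    subst = dict(subst)
    for a, f in zip(atom[1:], fact[1:]):
        if is_var(a):
            if subst.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return subst

def homomorphisms(premise, body):
    """All substitutions mapping every premise atom onto some body atom."""
    substs = [{}]
    for atom in premise:
        substs = [s2 for s in substs for fact in body
                  if (s2 := match(atom, fact, s)) is not None]
    return substs

def chase(body, constraints):
    """Add constraint conclusions to the query body until nothing changes.

    Assumes every conclusion variable also occurs in the premise, so no new
    terms are created and the loop terminates.
    """
    body = set(body)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in constraints:
            for s in homomorphisms(premise, sorted(body)):
                for atom in conclusion:
                    new = (atom[0],) + tuple(s.get(t, t) for t in atom[1:])
                    if new not in body:
                        body.add(new)
                        changed = True
    return body

# Query body: R(x, y), S(y, z)
query_body = [("R", "x", "y"), ("S", "y", "z")]

# Constraint for a hypothetical materialized view V joining R and S:
#   R(a, b) and S(b, c)  implies  V(a, c)
constraints = [
    ([("R", "?a", "?b"), ("S", "?b", "?c")], [("V", "?a", "?c")]),
]

universal_plan = chase(query_body, constraints)
print(sorted(universal_plan))
# [('R', 'x', 'y'), ('S', 'y', 'z'), ('V', 'x', 'z')]
```

The chased query mentions both the base relations and the view. A backchase phase, not shown here, would then search the subqueries of this result for cheaper equivalent plans, for example one that scans only `V`.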

This project is expected to result in a theoretical foundation and a practical framework for defining and using indexes, materialized views and complex constraints in XML query processing systems. The practical framework will be demonstrated through a publicly available software prototype appropriate for teaching about XML query systems and for supporting related research projects.

Project Members

Val Tannen, Alin Deutsch, Arnaud Sahuguet

Publications


  • Reasoning About Functional And Key Dependencies in Hierarchically Structured Data [.pdf] 
    Excerpt from PhD Thesis Carmem Hara (2004)
    Carmem Hara   

  • Reformulation of XML Queries and Constraints [.pdf] 
    International Conference on Database Theory (ICDT) (2003)
    Alin Deutsch, Val Tannen

  • Querying XML with Mixed and Redundant Storage [.pdf] 
    Technical Report MS-CIS-02-01 (2002)
    Alin Deutsch, Val Tannen

  • ubQL: A Distributed Query Language to Program Distributed Query Systems [.pdf] 
    Excerpt from PhD Thesis Arnaud Sahuguet (2002)
    Arnaud Sahuguet   

  • Containment for Classes of XPath Expressions Under Integrity Constraints [abstract] 
    Technical Report MS-CIS-01-21 (2001)
    Alin Deutsch, Val Tannen

  • Optimization Properties for Classes of Conjunctive Regular Path Queries [.pdf] 
    International Workshop on Database Programming Languages (DBPL) (2001)
    Alin Deutsch, Val Tannen

  • Containment and Integrity Constraints for XPath Fragments [.pdf] 
    KRDB (2001)
    Alin Deutsch, Val Tannen

Levine Hall
3330 Walnut Street
Philadelphia, PA 19104

Last update: 08/02/11