Penn DB Group

Update in databases and views

Executive Summary

Access to information from heterogeneous data sources, such as the World Wide Web, public databases, and proprietary databases, plays an important role in today's information technology. Efficient materialization of data derived from these sources (views) is important, since large-scale queries against the source databases are often quite expensive.

One of the crucial issues raised by the data redundancy in such an environment is how to maintain derived data when the underlying databases change. Broadly, we can identify two classes of problems connected to updates:

  1. View Maintenance is the problem of keeping a materialized view consistent while the source database(s) are updated. Updates to the source database are either propagated to the view immediately or accumulated over time, with the view updated at regular intervals (for instance, overnight).
  2. View Update is the problem of propagating updates made on the view back to the source database. An updatable view must have certain characteristics; in particular, the view cannot be defined by an arbitrary query on the source database.

For both problems, there is a strong need for efficient transformation and execution of updates. Furthermore, issues such as concurrency and consistency need to be investigated.
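
To make the two maintenance strategies concrete, the following Python sketch keeps a small materialized view (a row count per group) in step with its source. It is an illustration only: the class `MaterializedCountView`, its method names, and the sample rows are hypothetical and are not part of the project's software.

```python
from collections import Counter


class MaterializedCountView:
    """Hypothetical materialized view: number of source rows per group key."""

    def __init__(self, source_rows, key):
        self.key = key
        # Initial materialization: one full pass over the source.
        self.counts = Counter(key(row) for row in source_rows)
        # Source updates accumulated for deferred maintenance.
        self.pending = []

    def apply_delta(self, op, row):
        """Immediate maintenance: propagate a single source update to the view."""
        k = self.key(row)
        if op == "insert":
            self.counts[k] += 1
        elif op == "delete":
            if self.counts[k] <= 0:
                raise KeyError(f"no row with key {k!r} in the view")
            self.counts[k] -= 1
            if self.counts[k] == 0:
                del self.counts[k]
        else:
            raise ValueError(f"unknown operation: {op}")

    def defer(self, op, row):
        """Deferred maintenance: only record the update; the view becomes stale."""
        self.pending.append((op, row))

    def refresh(self):
        """Apply all accumulated updates, for example in a nightly batch."""
        for op, row in self.pending:
            self.apply_delta(op, row)
        self.pending.clear()


source = [("alice", "db"), ("bob", "db"), ("carol", "ai")]
view = MaterializedCountView(source, key=lambda row: row[1])

view.apply_delta("insert", ("dave", "ai"))  # propagated immediately
view.defer("delete", ("bob", "db"))         # accumulated, not yet visible
stale = dict(view.counts)

view.refresh()
fresh = dict(view.counts)
```

In both cases the view is adjusted from the individual updates rather than recomputed from the source, which is the point of maintaining it. A count view like this one also hints at why not every view is updatable: changing a count in the view does not determine which source rows to insert or delete.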

Updates in object-oriented databases are most often specified through methods. Typically, methods are written in powerful programming languages, such as C++ (ObjectStore) or O2, and are compiled before the database is used. Once invoked, a method performs the requested updates on the database. In addition, transaction schedules are generated, and updates to other data sources can be issued. There are various problems with this approach:

  • Updates are static: only the updates defined through methods can be invoked. Any other update that the existing methods cannot express must be added to the schema as an additional method and compiled into executable code.
  • The possibilities for optimization are limited, since compiling the program code into executable code does not allow flexible treatment of updates. In particular, complex updates that involve sequences of updates or the invocation of methods within other methods cannot be rewritten into more efficient updates.
  • The specification of updates is system dependent: the database interfaces in programming languages such as C++, and the proprietary languages used to specify updates, are not standardized.

Query languages such as SQL and OQL are standardized and do not have these disadvantages. However, update languages have not been studied to the same extent.
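
The optimization problem listed above can be illustrated with a small Python sketch. When an update is described as data rather than as compiled code, a sequence of updates can be inspected and rewritten before it runs. The `SetField` type, the single rewrite rule in `simplify`, and the sample rows are assumptions made for this illustration; they are not the syntax or the rule set of any particular update language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetField:
    """Hypothetical declarative update: set `target` to `value`
    in every row where `where_field` equals `where_value`."""
    where_field: str
    where_value: object
    target: str
    value: object


def apply_update(update, rows):
    """Return a new list of rows with one update applied."""
    result = []
    for row in rows:
        if row.get(update.where_field) == update.where_value:
            row = {**row, update.target: update.value}
        result.append(row)
    return result


def run(updates, rows):
    """Execute a sequence of updates in order."""
    for update in updates:
        rows = apply_update(update, rows)
    return rows


def simplify(updates):
    """Drop an update that is overwritten by the one directly after it.

    The rule fires only when both updates select the same rows and assign
    the same field, and the selection does not read the assigned field.
    """
    simplified = []
    for update in updates:
        prev = simplified[-1] if simplified else None
        overwrites_prev = (
            prev is not None
            and (prev.where_field, prev.where_value, prev.target)
            == (update.where_field, update.where_value, update.target)
            and update.where_field != update.target
        )
        if overwrites_prev:
            simplified[-1] = update
        else:
            simplified.append(update)
    return simplified


rows = [
    {"name": "ann", "dept": "db", "grade": 1},
    {"name": "bob", "dept": "ai", "grade": 1},
]
updates = [
    SetField("dept", "db", "grade", 2),
    SetField("dept", "db", "grade", 3),  # overwrites the previous update
    SetField("dept", "ai", "grade", 5),
]
optimized = simplify(updates)
```

Here `optimized` contains two updates instead of three and produces the same rows as the original sequence. The same three steps hidden inside a compiled method would offer the database system nothing to rewrite.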

We developed a generic update language, CPL+, for updating complex value databases, that is, databases containing values composed of base values, sets, tuples, and variants. The complex value model is a generalization of the relational model. We propose various simplifications and optimizations that transform an update on a given database into a more efficient update expression.
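
As a rough illustration of the complex value model, the Python sketch below builds values from base values, sets, tuples (modelled here as labelled records), and variants, and applies an update to a nested database. This is not CPL+ syntax: `Record`, `Variant`, `update_where`, and the sample data are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Record:
    """A tuple with labelled fields, stored as (label, value) pairs."""
    fields: Tuple[Tuple[str, "Value"], ...]

    def get(self, label):
        return dict(self.fields)[label]

    def replace(self, label, value):
        """Return a copy of the record with one field changed."""
        return Record(
            tuple((l, value if l == label else v) for l, v in self.fields)
        )


@dataclass(frozen=True)
class Variant:
    """A tagged value: exactly one of several alternatives."""
    tag: str
    value: "Value"


# A complex value is a base value, a record, a variant, or a set of values.
Value = Union[int, str, Record, Variant, FrozenSet["Value"]]


def update_where(db, pred, change):
    """Apply `change` to every element of the set `db` that satisfies `pred`."""
    return frozenset(change(v) if pred(v) else v for v in db)


def person(name, contact, projects):
    return Record((
        ("name", name),
        ("contact", contact),                # a variant: email or phone
        ("projects", frozenset(projects)),   # a set nested inside a record
    ))


db = frozenset({
    person("ann", Variant("email", "ann@example.org"), {"views"}),
    person("bob", Variant("phone", "555-0100"), {"views", "updates"}),
})

# Update: switch bob's contact from the phone alternative to an email.
new_db = update_where(
    db,
    pred=lambda p: p.get("name") == "bob",
    change=lambda p: p.replace("contact", Variant("email", "bob@example.org")),
)
```

A relation in the relational model is the special case where the database is a set of records whose fields are all base values; the nested set and the variant in `person` are what the complex value model adds.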

Recently, we extended this work to the object-oriented data model. A new language, OQL+, has been developed to specify updates for such databases in the flavor of OQL and of the update primitives known from SQL. Interesting issues such as efficient execution, non-determinism of updates, and cost-based optimization are investigated in this project.
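
One way non-determinism can arise is sketched below in Python. This is an assumed example, since the text above does not say which cases the project studies: an update is applied to each element of an unordered collection, and each step reads state that earlier steps have already changed. The function names and salary data are hypothetical.

```python
def raise_above_max_in_place(salaries, order):
    """Set each salary to (current maximum + 1), visiting names in `order`.

    Each step sees the changes made by earlier steps, so the outcome
    depends on the order in which the elements are visited.
    """
    state = dict(salaries)
    for name in order:
        state[name] = max(state.values()) + 1
    return state


def raise_above_max_snapshot(salaries):
    """Same update, but every step reads the original (snapshot) state."""
    top = max(salaries.values())
    return {name: top + 1 for name in salaries}


salaries = {"ann": 10, "bob": 20}

first = raise_above_max_in_place(salaries, ["ann", "bob"])
second = raise_above_max_in_place(salaries, ["bob", "ann"])
snapshot = raise_above_max_snapshot(salaries)
```

The two in-place runs end in different states (`first` and `second` differ), whereas the snapshot version gives the same result for any visiting order. An update language over sets has to either fix such a semantics or detect updates whose result depends on element order.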

Project Members

Susan Davidson, Hartmut Liefke

Levine Hall
3330 Walnut Street
Philadelphia, PA 19104
 

Last update: 08/02/11