Data Drivers

K2 accesses external information through data drivers, which form an intermediate layer between the K2 system proper and the actual data sources. There are two kinds of drivers in K2: those that are tightly integrated with the server, and those that are more loosely connected.

Integrated Data Drivers

Two abstract Java™ classes, K2.driver.DriverA and K2.driver.DriverConnectionA, form K2's driver API. One way to connect a new data source to K2 is to extend these classes so that they can export a set of entry points to K2, connect to the data source, send queries to it and receive results from it, and package the results for use in the rest of the K2 system. This kind of driver is called an Integrated Data Driver, or IDD. The tight coupling of IDDs with the K2 system minimizes the overhead of connecting to the data source and allows K2 to perform additional optimizations. Complete instructions for building an IDD and connecting it to K2 can be found in the techdoc titled "Connecting to a Data Source Using an IDD".
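The shape of an IDD can be illustrated with a small sketch. The methods of K2.driver.DriverA and K2.driver.DriverConnectionA are not listed in this document, so the two abstract classes below (DriverSketch and DriverConnectionSketch) are hypothetical stand-ins, and the "data source" is just an in-memory map. The sketch only shows the division of labor described above: a driver that exports entry points and opens connections, and a connection that runs a query and packages the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for K2.driver.DriverA (the real API is an assumption here).
abstract class DriverSketch {
    /** Names of the entry points this driver exports. */
    abstract List<String> entryPoints();

    /** Opens a connection to the underlying data source. */
    abstract DriverConnectionSketch connect();
}

// Hypothetical stand-in for K2.driver.DriverConnectionA.
abstract class DriverConnectionSketch implements AutoCloseable {
    /** Sends a query to the data source and returns the packaged result. */
    abstract List<String> execute(String entryPoint, String argument);

    @Override
    public abstract void close();
}

// A toy driver whose data source is an in-memory map rather than a real system.
class InMemoryDriver extends DriverSketch {
    private final Map<String, List<String>> table;

    InMemoryDriver(Map<String, List<String>> table) {
        this.table = table;
    }

    @Override
    List<String> entryPoints() {
        return List.of("lookup");
    }

    @Override
    DriverConnectionSketch connect() {
        return new DriverConnectionSketch() {
            private boolean open = true;

            @Override
            List<String> execute(String entryPoint, String argument) {
                if (!open) {
                    throw new IllegalStateException("connection closed");
                }
                if (!"lookup".equals(entryPoint)) {
                    throw new IllegalArgumentException("unknown entry point: " + entryPoint);
                }
                // Copy the rows so callers cannot modify the source data.
                return new ArrayList<>(table.getOrDefault(argument, List.of()));
            }

            @Override
            public void close() {
                open = false;
            }
        };
    }
}
```

A real IDD would replace the map with calls to the actual data source and would return results in whatever form the K2 driver API expects; the techdoc named above remains the authoritative description.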

K2 comes with an IDD that can connect to any database system that implements Sun's JDBC API. Complete instructions for hooking up a database to K2 can be found in the techdoc titled "Connecting to a JDBC Database".

Another IDD provided with K2 makes use of W4F, the World Wide Web Wrapper Factory, also developed at the University of Pennsylvania. W4F is a toolkit for generating wrappers for Web sources. It consists of a retrieval language to identify Web sources, a declarative extraction language to express robust extraction rules, and a mapping interface to export the extracted information into user-defined data structures. The K2 W4F IDD uses the mapping interface to instantiate its DriverA and DriverConnectionA classes. Once you've built a W4F wrapper, you can connect it to K2 using the instructions in the techdoc titled "Connecting to a Website".

K2 can also distribute query execution through its IDD for Java™ RMI (Remote Method Invocation). This IDD makes an RMI connection to a remote K2 server and sends it part of the local query for processing.
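The general RMI pattern behind this kind of distribution can be sketched as follows. This is not K2's actual remote interface, which is not shown in this document: the RemoteQueryService interface, the EchoQueryService implementation, the registry name "k2-sketch", and the query string are all hypothetical. Only the java.rmi calls are standard. For brevity, both the "remote" and the "local" side run in a single JVM.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.List;

// Hypothetical remote interface: accepts part of a query and returns its result.
interface RemoteQueryService extends Remote {
    List<String> evaluate(String subquery) throws RemoteException;
}

// Toy implementation that only echoes the subquery it was asked to evaluate.
class EchoQueryService implements RemoteQueryService {
    @Override
    public List<String> evaluate(String subquery) {
        return List.of("evaluated: " + subquery);
    }
}

class RmiSketch {
    public static void main(String[] args) throws Exception {
        // "Remote" server side: export the service and bind it in a registry.
        EchoQueryService impl = new EchoQueryService();
        RemoteQueryService stub =
                (RemoteQueryService) UnicastRemoteObject.exportObject(impl, 0);
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.bind("k2-sketch", stub);

        // "Local" server side: look the service up and ship it a subquery.
        RemoteQueryService remote = (RemoteQueryService)
                LocateRegistry.getRegistry("localhost", 1099).lookup("k2-sketch");
        System.out.println(remote.evaluate("part of the local query"));

        // Unexport both objects so the JVM can exit.
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(registry, true);
    }
}
```

In a real deployment the two halves of main would run on different machines, and the registry host would be the remote K2 server rather than localhost.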

Decoupled Data Drivers

Sometimes it's difficult to instantiate an IDD to connect to a data source directly, due to limitations in Java™ or the data source itself. In these cases, it's often easier to write an intermediate program in some other language, like C or Perl, that can connect to the data source. This kind of driver, called a Decoupled Data Driver, or DDD, can then be connected to K2 through an IDD called the PipeDriver.

The PipeDriver executes the DDD as a separate process and communicates with it using a simple protocol. The DDD establishes a connection (whatever that entails for the data source in question), tells K2 it has "made the connection", and waits for a query to come in through its standard input stream. When the DDD receives a query, it sends it to the data source, gets the result, and prints it out in a data exchange format that resembles Kleisli's CPL. Once the result has been printed, the DDD returns to waiting for the next query.
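The loop described above can be sketched in a few lines. Several details here are assumptions, because this document does not specify them: the readiness message ("READY"), the convention of one query per line, and the result syntax are all placeholders, not the actual PipeDriver protocol or exchange format. A DDD would normally be written in a language such as C or Perl; Java is used here only to match the rest of the K2 code base.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.util.function.Function;

// Sketch of a DDD main loop; message texts and formats are placeholders.
class DddSketch {
    /**
     * Announces readiness, then answers each query read from `in` by passing
     * it to `dataSource` and printing the result to `out`.
     */
    static void serve(BufferedReader in, PrintStream out,
                      Function<String, String> dataSource) throws IOException {
        // Connecting to the real data source would happen here (omitted).
        out.println("READY"); // placeholder for "made the connection"
        out.flush();

        String query;
        // Assumption: one query per line; the loop ends when input is closed.
        while ((query = in.readLine()) != null) {
            if (query.isBlank()) {
                continue; // ignore empty lines
            }
            out.println(dataSource.apply(query.trim()));
            out.flush(); // flush so the parent process sees the result at once
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
        // Toy data source: wraps the upper-cased query in braces (placeholder format).
        serve(stdin, System.out, q -> "{\"" + q.toUpperCase() + "\"}");
    }
}
```

Flushing after every message matters in this arrangement: the parent process reads from a pipe, so a result left in an output buffer would leave both sides waiting on each other.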

This method has been used to connect K2 to drivers that were originally developed for Kleisli. After some minor modifications to their input parameters and output format, the Kleisli drivers could be used as DDDs. More information about DDDs and how they can be used with K2 can be found in the techdoc titled "Connecting to a Data Source using a DDD".

