Penn DB Group
Web Wrapper Factory

Executive Summary

W4F is a toolkit for generating wrappers for Web sources. It consists of a retrieval language to identify Web sources, a declarative extraction language (the HTML Extraction Language) to express robust extraction rules, and a mapping interface to export the extracted information into user-defined data structures. To make wrapper creation rapid and easy, the toolkit offers WYSIWYG support through wizards. Together, these components permit the fast, semi-automatic generation of ready-to-go wrappers provided as Java classes. W4F has been successfully used to generate wrappers for database systems and software agents, making the content of Web sources easily accessible to any kind of application.
W4F Web site
Old Web site
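
The three layers named in the summary (retrieval, extraction, mapping) can be illustrated with a minimal Java sketch. This is not W4F's API or generated code: the class WrapperSketch, its methods, the Link structure and the sample page are all hypothetical, and a plain regular expression stands in for the HTML Extraction Language, which is far more robust than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of a three-layer wrapper; NOT the W4F API.
public class WrapperSketch {

    /** User-defined target structure for the mapping layer (assumed). */
    public static final class Link {
        public final String href;
        public final String text;

        public Link(String href, String text) {
            this.href = href;
            this.text = text;
        }
    }

    // Retrieval layer: a real wrapper would fetch the page at `url`.
    // This stand-in ignores the URL and returns a made-up sample page.
    public static String retrieve(String url) {
        return "<html><body>"
             + "<a href=\"/people\">People</a> "
             + "<a href=\"/research\">Research</a>"
             + "</body></html>";
    }

    // Extraction layer: a regex stand-in for declarative extraction rules.
    // Each result row holds { href, anchor text }.
    public static List<String[]> extract(String html) {
        Pattern anchor = Pattern.compile(
            "<a\\s+href=\"([^\"]*)\"\\s*>([^<]*)</a>",
            Pattern.CASE_INSENSITIVE);
        Matcher m = anchor.matcher(html);
        List<String[]> rows = new ArrayList<>();
        while (m.find()) {
            rows.add(new String[] { m.group(1), m.group(2) });
        }
        return rows;
    }

    // Mapping layer: convert raw rows into the user-defined structure.
    public static List<Link> map(List<String[]> rows) {
        List<Link> links = new ArrayList<>();
        for (String[] row : rows) {
            links.add(new Link(row[0], row[1]));
        }
        return links;
    }

    // The wrapper itself: retrieval, then extraction, then mapping.
    public static List<Link> wrap(String url) {
        return map(extract(retrieve(url)));
    }

    public static void main(String[] args) {
        for (Link link : wrap("http://example.org/")) {
            System.out.println(link.text + " -> " + link.href);
        }
    }
}
```

Keeping the three steps as separate methods mirrors the separation described in the summary: the extraction rule can change without touching how pages are fetched or how results are exported to an application.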

Project Members

Arnaud Sahuguet, Fabien Azavant


Levine Hall
3330 Walnut Street
Philadelphia, PA 19104

Last update: 08/02/11