Title: Autonomic Data Integration Systems

Speaker: AnHai Doan (from UIUC)

Abstract: The world today is a vast information bazaar, with millions of sources providing data in every imaginable format and mode of interaction. Data integration systems hold the promise of acting as crucial middlemen in this chaotic market: they interact with data sources, then translate and combine the sources' data to obtain the information requested by users. However, such systems are still very hard to build and costly to operate. They must be told in tedious detail how to interact with data sources, and they must be constantly modified to deal with changes at the sources. In this talk I will describe the AIDA project, whose vision is autonomic data integration systems: systems that take only minutes to deploy (instead of the weeks or months needed today), that require only minimal human coaching to rapidly reach and maintain competence, and that continuously improve over time in both performance and capabilities. I will discuss some fundamental issues that arise, such as schema reconciliation, object matching and fusion, and source schema construction, and I will show how machine learning techniques can be extended to deal effectively with these problems. I will also describe a conceptually novel approach to building autonomic systems on the Web: mass collaboration. Finally, I will discuss how the ideas developed in this project can be applied to extracting information from text, marking up data on the Semantic Web, and constructing the "Capitalist" World-Wide Web.