Information integration on the Web: a view from AI and databases (report on IIWeb-03)

This document is a report on the workshop on Information Integration on the Web (IIWeb-03), held in Acapulco, Mexico, on August 9-10, as part of the 2003 International Joint Conference on Artificial Intelligence. The full proceedings of the workshop are available online [1]. A small sample of the papers presented at the workshop was also included in a special issue of IEEE Intelligent Systems [2]. Effective integration of heterogeneous databases and information sources has been cited as the most pressing challenge in spheres as diverse as corporate data management, homeland security, counter-terrorism and the human genome project. An important impediment to scaling up integration frameworks to large-scale applications is that the autonomous and decentralized nature of the data sources forces mediators to operate with very little information about the structure, scope, profile, quality and inter-relations of the information sources they are trying to integrate. As stated, the problem of information integration crosses the boundaries of AI and Databases, and includes research in machine learning, data mining, automated planning, constraint reasoning, databases, view integration, information extraction, the semantic web, web services, and other related areas. Not surprisingly, the problem has drawn significant interest from both the AI and Database communities. Although research on information integration is presented in a variety of forums, most of these forums are naturally seen as “belonging” to either the Artificial Intelligence (AAAI, IJCAI, ICML, etc.) or the Database (SIGMOD, VLDB, ICDE, CIKM, etc.) community. The primary purpose of this workshop was thus to bring together researchers from AI and Databases who are working on a variety of problems related to integrating information on the Web.