Virtually every enterprise, scientific domain, or health care provider will assert that information integration is their most pressing information technology need. Yet despite the fact that research in data integration has been ongoing for over 20 years, we see few success stories in the real world.
There are many reasons for this, chief among them that (1) integration encompasses a wide variety of tasks and domains, and there is a delicate balance between general solutions and domain-specific ones; and (2) general solutions typically require a combination of techniques from a range of communities, including databases, information retrieval, machine learning, and knowledge representation or the Semantic Web. For instance, integrating contact center call transcripts with structured (transaction and profile) data in real time requires efficient techniques that can work on noisy transcribed data; integrating Web data may need to deal with adversarial content providers; and integrating genetic data may require similarity matching on gene sequences.
In recent years there has been a new emphasis on best-effort systems that combine automated approaches with user refinement or feedback, on integration techniques that combine the traditional stages of integration, and on using machine learning and other techniques with database concepts to address the needs of integration. These new approaches, generally targeting certain subclasses of the information integration problem, are highly promising.
The aim of the workshop is to encourage researchers from the information integration community to present novel issues and techniques related to applying information integration in different areas (especially in the context of integrating structured and unstructured data). The workshop will serve as a confluence of new ideas that will help drive research in the area of information integration from being 'generic' to being more focused, interactive, and realistic. We invite researchers and practitioners working in information integration, data warehousing, privacy and trustworthy data systems, and related areas to submit original papers to this workshop. The main topics include, but are not limited to:
- Schema discovery and mapping
- Handling data uncertainty and reliability in integration
- Information integration issues in noisy unstructured text
- Entity discovery over poor-quality data
- Data materialization and virtualization approaches
- Record linkage issues
- Constraint-based data integration
- Real-time information integration
- Interactive and best-effort integration
- Leveraging linked open data
- Leveraging the Semantic Web
- Actionable information integration
- Context-oriented information integration
- Event-based information integration
- User-centric information integration
- Integration issues in peer-to-peer networks
- Privacy-preserving information integration
- Streaming data integration
- Enterprise-specific applications in all domains
- Information mash-up and Web 2.0
- Real-time business intelligence over integrated data
- Semantic search over integrated data
Authors are invited to submit original, unpublished research papers that are not being considered for publication in any other forum. We also encourage submissions describing work-in-progress or lessons learnt in practice. Submissions must clearly identify the nature of the paper as research, experience, or position. All submitted papers will be peer reviewed and evaluated on originality, significance, technical soundness, and clarity of expression. By submitting a paper, authors explicitly agree that at least one of them will register for the workshop and present the paper.
Submissions must be in the standard ICDE format and can be 4-6 pages in length. We also encourage early papers on novel work. Please submit your paper through the CMT site (https://cmt.research.microsoft.com/NTII2010/). Accepted papers will be published in the ICDE proceedings (CD version).
THE CAMERA READY PAPER WILL BE LIMITED TO 4 PAGES, THEREBY NOT RESTRICTING THE PUBLICATION OF THE MAIN IDEA IN OTHER CONFERENCES.
Workshop Chair: Laura Haas (IBM Almaden Research Center)
PC co-chairs: Zachary G. Ives (University of Pennsylvania), Manish A Bhide (IBM India Research Lab)
Publicity Chair: Sumit Negi (IBM India Research Lab)
Michael J Cafarella (University of Michigan, USA)
Yi Chen (Arizona State University, USA)
Kevin Chang (UIUC, USA)
Anish Das Sarma (Yahoo Research, USA)
Luna Dong (AT&T Research, USA)
Christoph Koch (Cornell University, USA)
Ullas B Nambiar (IBM Research, India)
Felix Naumann (Hasso Plattner Institute, Germany)
Michalis Petropoulos (University at Buffalo, USA)
Evaggelia Pitoura (University of Ioannina, Greece)
Prasan Roy (Aster Data Systems, USA)
Michael Schrefl (JKU Linz, Austria)
Kohichi Takeda (IBM Research, Japan)
Millist Vincent (University of South Australia, Australia)
Ji-Rong Wen (Microsoft Research Asia, China)
Xiaofang Zhou (University of Queensland, Brisbane)
Submission Deadline: 28th August 2009
Notification Deadline: 30th October 2009