
Modeling and extracting deep-web query interfaces

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

24 Scopus citations

Abstract

Interface modeling and extraction is a fundamental step in building a uniform query interface to a multitude of databases on the Web. Existing solutions assume that interfaces are flat and thus ignore their inherent structure, which seriously hampers the effectiveness of interface integration. To address this limitation, in this chapter we model an interface with a hierarchical schema (e.g., an ordered tree of attributes). We describe ExQ, a novel schema extraction system with two distinct features. First, ExQ discovers the structure of an interface from its visual representation via spatial clustering. Second, ExQ annotates the discovered schema with labels from the interface by imitating the human annotation process. ExQ has been extensively evaluated on real-world query interfaces in five different domains, and the results show that it achieves accuracy above 90% in both the structure discovery and schema annotation tasks.
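To make the idea of a hierarchical schema concrete, the following is a minimal sketch of an ordered tree of attributes and of the flat view that it generalizes. It is an illustration only and not ExQ's implementation: the `SchemaNode` class, the `leaf_paths` helper, and the flight-search labels are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class SchemaNode:
    """One node of an ordered-tree schema (hypothetical type).

    A node with children is a group of attributes; a node without
    children is a single field. Child order mirrors on-screen order.
    """
    label: str
    children: List["SchemaNode"] = field(default_factory=list)

    def is_field(self) -> bool:
        return not self.children


def leaf_paths(node: SchemaNode, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield the label path from the root to every field, in tree order."""
    path = prefix + (node.label,)
    if node.is_field():
        yield path
    else:
        for child in node.children:
            yield from leaf_paths(child, path)


# Assumed example interface: a flight-search form with two groups
# and one ungrouped field.
flight_search = SchemaNode("Flight search", [
    SchemaNode("Where", [SchemaNode("From"), SchemaNode("To")]),
    SchemaNode("When", [SchemaNode("Depart"), SchemaNode("Return")]),
    SchemaNode("Passengers"),
])

paths = list(leaf_paths(flight_search))

# A flat model keeps only the last label of each path and so loses
# the grouping that the tree records.
flat_view = [p[-1] for p in paths]
```

In this sketch the flat view retains the five field labels but no longer records that "From" and "To" belong together under "Where", which is the kind of structure the abstract says flat models ignore.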

Original language: English
Title of host publication: Advances in Information and Intelligent Systems
Editors: Zbigniew Ras, William Ribarsky
Pages: 65-90
Number of pages: 26
State: Published - 2009

Publication series

Name: Studies in Computational Intelligence
Volume: 251
