Mining approximate functional dependencies and concept similarities to answer imprecise queries

Current approaches for answering queries with imprecise constraints require users to provide distance metrics and importance measures for attributes of interest. In this paper, we focus on providing a domain- and end-user-independent solution for supporting imprecise queries over Web databases without affecting the underlying database. We propose a query processing framework that integrates techniques from information retrieval (IR) and database research to efficiently determine answers for imprecise queries. We mine approximate functional dependencies between attributes and use them to generate precise queries whose answers contain tuples relevant to the given imprecise query. We also propose an approach for automatically estimating the semantic distances between values of categorical attributes. We provide preliminary results showing the utility of our approach.
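To make the notion of an approximate functional dependency concrete, the following is a minimal sketch, not the method of this paper. It assumes the g3 error measure, a common choice in the dependency-mining literature: the smallest fraction of tuples that must be removed for a dependency to hold exactly. The function name `g3_error`, the `cars` relation, and its attribute names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs):
    """Return the g3 error of the dependency lhs -> rhs over rows.

    rows: list of dicts mapping attribute names to values (hypothetical layout).
    lhs:  list of attribute names on the left-hand side.
    rhs:  a single attribute name on the right-hand side.
    """
    if not rows:
        return 0.0
    # Group tuples by their lhs values and count each rhs value per group.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key][row[rhs]] += 1
    # In each group, keep the tuples carrying the most frequent rhs value;
    # all other tuples in the group violate the dependency.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1.0 - kept / len(rows)


# Hypothetical toy relation, for illustration only.
cars = [
    {"make": "Toyota", "model": "Camry", "body": "sedan"},
    {"make": "Toyota", "model": "Camry", "body": "sedan"},
    {"make": "Toyota", "model": "Camry", "body": "coupe"},
    {"make": "Honda", "model": "Civic", "body": "sedan"},
]

exact = g3_error(cars, ["model"], "make")    # model -> make holds exactly
approx = g3_error(cars, ["model"], "body")   # one of four tuples violates it
```

Under this measure, a dependency with error below a chosen threshold is treated as approximate. How such dependencies are then used to generate precise queries is described in the paper itself and is not reproduced in this sketch.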
