A New Approach For Top-k Flexible Queries In Large Database Using The Knowledge Discovered

In this paper, we present an approach to supporting top-k flexible queries in large databases (DBs). Current top-k query processing techniques generally focus on Boolean queries and cannot be applied to large DBs because of the huge volume of data involved. Our approach instead uses the knowledge produced by a Knowledge Discovery in Databases (KDD) algorithm. It consists of two steps: 1) extraction of knowledge by applying a new KDD approach that fuses conceptual clustering, fuzzy logic, and formal concept analysis (FCA), and 2) generation of efficient answers to top-k flexible queries using the knowledge produced in the first step. We prove that this approach is optimal, since the query is evaluated not on the initial data set, which is enormous, but on the knowledge extracted from these data; this is, in our opinion, one of the principal goals of KDD approaches.

Keywords: Top-k queries; KDD; Data mining; FCA; Fuzzy logic.
