Automatic Entity Extraction From an N-Ary Relation: Toward a General Law for Information Decomposition

Abstract: N-ary relations provide a convenient tool for database modeling. In earlier work, we showed that n-ary relations can be decomposed into simpler, more structured relations in a way that saves storage space and expedites information retrieval. In this paper, we argue that the proposed decomposition preserves the important information of the original relation, retains its discriminating power, and allows entity types to be extracted from a database instance. Experiments with this approach on actual databases, together with a preliminary validation, have given encouraging results.
