Automatic generation of domain representations using thesaurus structures

Domain analysis was first used 15 years ago as one of the most important techniques for software reuse. New techniques still appear every year, and different authors propose different structures to represent and store software components and the relationships among them. These relationships are the kernel of the domain semantics. This report presents a set of techniques and tools drawn from mathematics, statistics, and neural networks that, when linked together, allow domain representations to be built semiautomatically and stored in a thesaurus structure of software components. Thesaurus structures, widely used in information science, are presented as the key concept for domain modeling because they lend themselves to automation better than earlier structures. New metrics for evaluating the quality, consistency, and completeness of the resulting domain model are also presented.
