Clustering of LDAP directory schemas to facilitate information resources interoperability across organizations

Directories provide a well-defined, general mechanism for describing organizational resources, such as those of the Internet2 higher education research community and the Grid community. Lightweight Directory Access Protocol (LDAP) directory services enable data sharing by defining both the metadata (schema) of the information and the protocol used to access it. Interoperability of directory information between organizations is increasingly important. Three things promote interoperability of information resources in directories: improved discovery of directory schemas across organizations, better presentation of their semantic meaning, and fast definition and adoption (reuse) of existing schemas. This paper focuses on the discovery of related directory object class schemas, and in particular on clustering schemas so that relationships between them can be discovered and the schemas reused. It presents the results of experiments that explore the use of self-organizing maps (SOMs) to cluster directory object classes at a level comparable to a set of human experts. The results show that it is possible to find values for the parameters of the SOM algorithm that cluster directory metadata at a level comparable to human experts.
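The clustering idea can be illustrated with a minimal sketch of a self-organizing map in pure Python. It is not the experimental setup reported in the paper: the binary attribute encoding, the grid size, the learning-rate and radius schedules, and the helper names `encode`, `train_som`, and `best_matching_unit` are all assumptions made for illustration, and the attribute sets are simplified subsets rather than full schema definitions.

```python
import math
import random


def encode(classes):
    """Turn {class_name: set_of_attribute_names} into binary vectors (assumed encoding)."""
    vocab = sorted(set().union(*classes.values()))
    vectors = {name: [1.0 if a in attrs else 0.0 for a in vocab]
               for name, attrs in classes.items()}
    return vocab, vectors


def best_matching_unit(weights, v):
    """Return the grid coordinate whose weight vector is closest to v."""
    return min(weights,
               key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], v)))


def train_som(vectors, rows, cols, epochs=200, lr0=0.5, seed=0):
    """Train a rows x cols SOM with a Gaussian neighbourhood and linear decay."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    radius0 = max(rows, cols) / 2.0
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate shrinks over time
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighbourhood shrinks over time
        for v in rng.sample(vectors, len(vectors)):
            bmu = best_matching_unit(weights, v)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * radius * radius))
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights


# Simplified, illustrative attribute subsets (not complete schema definitions).
classes = {
    "person": {"cn", "sn", "telephoneNumber", "userPassword", "description"},
    "inetOrgPerson": {"cn", "sn", "telephoneNumber", "mail", "uid", "givenName"},
    "device": {"cn", "serialNumber", "owner", "ou", "description"},
    "groupOfNames": {"cn", "member", "owner", "ou", "description"},
}

vocab, vectors = encode(classes)
som = train_som(list(vectors.values()), rows=2, cols=2)

# Object classes that land on the same map node form one cluster.
clusters = {}
for name, v in vectors.items():
    clusters.setdefault(best_matching_unit(som, v), []).append(name)
for node, members in sorted(clusters.items()):
    print(node, members)
```

Object classes that share many attributes tend to be pulled toward the same or neighbouring map nodes, which is what makes the map usable for spotting candidate schemas for reuse. The grid size, number of epochs, and decay schedules are the kind of parameters whose values would have to be tuned to approach expert-level groupings.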
