Searching databases for sematically-related schemas

In this paper, we address the problem of searching schema databases for semantically-related schemas. We first give a method of finding semantic similarity between pair-wise schemas based on tokenization, part-of-speech tagging, word expansion, and ontology matching. We then address the problem of indexing the schema database through a semantic hash table. Matching schemas in the database are found by hashing the query attributes and recording peaks in the histogram of schema hits. Results indicated a 90% improvement in search performance while maintaining high precision and recall.