Self-Adaptive Semantic Focused Crawler for Mining Services Information Discovery

It is well recognized that the Internet has become the largest marketplace in the world, and online advertising is popular across numerous industries, including the traditional mining service industry, where mining service advertisements are effective carriers of mining service information. However, service users may encounter three major issues when searching for mining service information over the Internet: heterogeneity, ubiquity, and ambiguity. In this paper, we present the framework of a novel self-adaptive semantic focused crawler, the SASF crawler, whose purpose is to precisely and efficiently discover, format, and index mining service information over the Internet while taking these three issues into account. The framework incorporates semantic focused crawling and ontology learning in order to maintain the crawler's performance regardless of the variety in the Web environment. The innovations of this research lie in the design of an unsupervised framework for vocabulary-based ontology learning and a hybrid algorithm for matching semantically relevant concepts and metadata. A series of experiments is conducted to evaluate the performance of the crawler. The conclusion and the direction of future work are given in the final section.
