Extracting significant Website Key Objects: A Semantic Web mining approach

Web mining has traditionally been used across application domains to enhance the content that Web users access. Likewise, Website administrators are interested in new approaches for improving their Website content according to their users' preferences. Furthermore, the Semantic Web has been considered as an alternative for representing Web content in a form that intelligent techniques can use to provide the organization, meaning, and definition of that content. In this work, we define the Website Key Object Extraction problem, whose solution is based on a Semantic Web mining approach that extracts, from a given Website core ontology, new relations between objects according to Web users' interests. The methodology was applied to a real Website, and the results showed that the automatic extraction of Key Objects is highly competitive with traditional surveys applied to Web users.
