Hyperlink Classification via Structured Graph Embedding

We formally define the hyperlink classification problem in web search: assigning each hyperlink to one of three classes according to its role (navigation, suggestion, or action). We generate real-world web graph datasets for this task. We approach the problem from a structured graph embedding perspective and show that it can be solved by modifying recently proposed knowledge graph embedding techniques. The key idea of our modification is to introduce relation perturbation, whereas the original knowledge graph embedding models corrupt only entities when generating negative triplets for training. To the best of our knowledge, this is the first study to apply knowledge graph embedding to the hyperlink classification problem. We show that our model significantly outperforms the original knowledge graph embedding models in classifying hyperlinks on web graphs.
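
The negative-sampling idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `corrupt_triplet`, the mixing probability `p_relation`, and the choice to combine relation perturbation with entity corruption in a single sampler are all assumptions made for illustration. A triplet is taken to be (source page, hyperlink class, target page).

```python
import random

# The three hyperlink classes named in the abstract.
RELATIONS = ["navigation", "suggestion", "action"]


def corrupt_triplet(triplet, entities, relations=RELATIONS,
                    p_relation=0.5, rng=random):
    """Return a negative triplet by corrupting one part of (head, relation, tail).

    p_relation is a hypothetical knob (not from the paper): the probability
    of perturbing the relation instead of corrupting an entity.
    """
    head, relation, tail = triplet
    if rng.random() < p_relation:
        # Relation perturbation: keep both pages, swap in a different class.
        new_relation = rng.choice([r for r in relations if r != relation])
        return (head, new_relation, tail)
    # Entity corruption, as in the original knowledge graph embedding
    # models: replace either the head or the tail with another entity.
    if rng.random() < 0.5:
        new_head = rng.choice([e for e in entities if e != head])
        return (new_head, relation, tail)
    new_tail = rng.choice([e for e in entities if e != tail])
    return (head, relation, new_tail)


if __name__ == "__main__":
    pages = ["home", "about", "cart", "search"]
    positive = ("home", "navigation", "about")
    print(corrupt_triplet(positive, pages))
```

This sketch does not check whether a sampled negative happens to be a true triplet in the training graph; a practical sampler would probably filter such cases out.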
