Learning to Infer Competitive Relationships in Heterogeneous Networks

Detecting and monitoring competitors is fundamental for a company that wants to stay ahead in the global market. Existing studies mainly focus on mining competitive relationships within a single data source, while competitive information is usually distributed across multiple networks. Discovering the underlying patterns and using heterogeneous knowledge to avoid a biased view of competition remains a challenging problem. In this article, we study the problem of mining competitive relationships by learning across heterogeneous networks. We use Twitter and patent records as our data sources and statistically study the patterns behind competitive relationships. We find that the two networks exhibit different but complementary patterns of competition. Overall, similar entities tend to be competitors, with a probability 4 times higher than chance. In the social network, we also find a "10-minute phenomenon": when two entities are mentioned by the same user within 10 minutes, the likelihood of them being competitors is 25 times higher than chance. Based on the discovered patterns, we propose a novel Topical Factor Graph Model. The model defines a latent topic layer to bridge the Twitter network and the patent network, and then employs a semi-supervised learning algorithm to classify the relationships between entities (e.g., companies or products). We test the proposed model on two real data sets, and the experimental results validate its effectiveness, with an average improvement of 46% over alternative methods. We further demonstrate that the competitive relationships inferred by our model can be applied to the job-hopping prediction problem, where they yield an average improvement of 10.7%.
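As an illustration of how a statistic like the 10-minute phenomenon could be measured, the following is a minimal sketch, not the procedure used in the article. It assumes mentions arrive as hypothetical `(user, timestamp_in_seconds, entity)` tuples and that "chance" means the fraction of all entity pairs that are labeled competitors. The function names and the toy data are invented for this example.

```python
from collections import defaultdict
from itertools import combinations

WINDOW_SECONDS = 10 * 60  # the 10-minute window from the abstract


def co_mentioned_pairs(mentions, window=WINDOW_SECONDS):
    """Return entity pairs mentioned by the same user within `window` seconds.

    `mentions` is an iterable of (user, timestamp_seconds, entity) tuples
    (a hypothetical input format, assumed for this sketch).
    """
    by_user = defaultdict(list)
    for user, ts, entity in mentions:
        by_user[user].append((ts, entity))

    pairs = set()
    for events in by_user.values():
        events.sort()
        for (t1, e1), (t2, e2) in combinations(events, 2):
            if e1 != e2 and t2 - t1 <= window:
                pairs.add(frozenset((e1, e2)))
    return pairs


def lift_over_chance(pairs, competitors, entities):
    """Competitor rate among `pairs` divided by the rate among all pairs."""
    all_pairs = {frozenset(p) for p in combinations(sorted(entities), 2)}
    base_hits = len(competitors & all_pairs)
    if not pairs or base_hits == 0:
        raise ValueError("need at least one co-mentioned pair and one competitor pair")
    hits = len(competitors & pairs)
    # (hits / len(pairs)) / (base_hits / len(all_pairs)), rearranged
    return (hits * len(all_pairs)) / (len(pairs) * base_hits)


# Toy data, invented for illustration only.
entities = {"A", "B", "C", "D"}
competitors = {frozenset(("A", "B"))}
mentions = [
    ("u1", 0, "A"), ("u1", 300, "B"),      # 5 minutes apart: co-mentioned
    ("u2", 0, "C"), ("u2", 5000, "D"),     # far apart: not co-mentioned
    ("u2", 5100, "A"),                     # 100 s after D: co-mentioned with D
]
pairs = co_mentioned_pairs(mentions)
lift = lift_over_chance(pairs, competitors, entities)
```

In this toy setting one of the two co-mentioned pairs is a competitor pair, against a base rate of one in six, so `lift` expresses how much more often co-mentioned pairs are competitors than randomly chosen pairs.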
