Knowledge transfer: what, how, and why

People learn from prior experience. Having learned to use one spoon, we know how to use a spoon of a different size; having learned to sew, we can more readily learn to embroider. Transferring knowledge from one situation to a related one often speeds up learning. This observation applies to machine learning as well as to human learning. This thesis focuses on knowledge transfer, an area of study in machine learning. The goal of knowledge transfer is to train a system to recognize knowledge acquired from previous tasks and apply it to new tasks or new domains. An effective knowledge transfer system facilitates learning on novel tasks for which little information is available. For example, transferring knowledge from a model that identifies writers born in the U.S. to the task of identifying writers born in Kiribati, a much lesser-known country, would speed up learning compared with learning to identify Kiribati-born writers from scratch.

In this thesis, we investigate three dimensions of knowledge transfer: what, how, and why. We present and elaborate on these questions: What type of knowledge should be transferred? How should knowledge be transferred across entities? Why is a certain pattern of knowledge transfer observed?

We first propose Segmented Transfer, a novel knowledge transfer model that identifies and learns from the most informative partitions of prior tasks. The proposed model is applied to the Wikipedia vandalism detection problem and to the entity search and retrieval problem, and it improves predictions on both. Building on knowledge transfer and network theory, we then propose the Knowledge Transfer Network (KTN), a novel type of network that describes transfer learning relationships among problems. KTN is not only a knowledge representation but also a framework for selecting an effective and efficient ensemble of learners to improve a predictive model. It also offers insight into ontological connections that were initially obscured. For example, we may observe that knowledge transfer occurs between seemingly dissimilar tasks, such as from using a knife and fork to using chopsticks.
