The case for anomalous link discovery

In this paper, we describe the challenges inherent to the Link Prediction (LP) problem in multirelational data mining and explore the reasons why many LP models have performed poorly. We present the alternative (and complementary) task of Anomalous Link Discovery (ALD) and qualitatively demonstrate the effectiveness of simple LP models for the ALD task.
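To make the relationship between the two tasks concrete, the following is a minimal sketch of how a simple LP score might be reused for ALD. It is an illustration under stated assumptions, not the method evaluated in the paper: the common-neighbors score, the function names, and the toy edge list are all hypothetical choices made here. The idea is that instead of scoring unobserved pairs to predict new links, one scores the links that already exist and treats the lowest-scoring ones as candidates for being anomalous.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbor_score(adj, u, v):
    """Simple LP score (assumed here): number of neighbors shared by u and v."""
    return len((adj[u] - {v}) & (adj[v] - {u}))


def rank_anomalous_links(edges):
    """Score every observed link and sort ascending.

    Links with the lowest LP score are the least expected under the
    model, so they appear first as anomaly candidates.
    """
    adj = build_adjacency(edges)
    scored = [(common_neighbor_score(adj, u, v), (u, v)) for u, v in edges]
    return sorted(scored, key=lambda item: item[0])


# Hypothetical toy graph: a tightly knit group plus one pendant link.
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("c", "d"), ("b", "d"), ("d", "x"),
]

ranking = rank_anomalous_links(edges)
for score, link in ranking:
    print(score, link)
```

In this toy graph the link between `d` and `x` shares no neighbors and is ranked first, while links inside the dense group score higher. Any other LP scoring function could be substituted for `common_neighbor_score` without changing the ranking step.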
