Hub Co-occurrence Modeling for Robust High-Dimensional kNN Classification

The emergence of hubs in k-nearest neighbor (kNN) topologies of intrinsically high-dimensional data has recently been shown to be quite detrimental to many standard machine learning tasks, including classification. Robust hubness-aware learning methods are required to overcome the impact of this highly uneven distribution of influence. In this paper, we adapt the Hidden Naive Bayes (HNB) model to the problem of modeling neighbor occurrences and co-occurrences in high-dimensional data, with hidden nodes used to aggregate all pairwise occurrence dependencies. The result is a novel kNN classification method tailored specifically to intrinsically high-dimensional data: the Augmented Naive Hubness Bayesian k-Nearest Neighbor (ANHBNN). Neighbor co-occurrence information forms an important part of the model, and our analysis reveals some surprising results regarding the influence of hubness on the shape of the co-occurrence distribution in high-dimensional data. The proposed approach was tested on object recognition from images under class imbalance, and the results show that it offers clear benefits over other hubness-aware kNN baselines.
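To make the model structure concrete, the following is a minimal sketch of an HNB-style decision rule adapted to neighbor occurrences. It follows the standard Hidden Naive Bayes formulation; the mapping from attributes to neighbor occurrence events, and the symbols $D_k(x)$, $\mathrm{hp}_t$, and $W_{ts}$, are illustrative assumptions rather than the exact ANHBNN derivation. Let $D_k(x) = \{x_{i_1}, \ldots, x_{i_k}\}$ denote the k nearest neighbors of a query point $x$. Treating each neighbor occurrence as an attribute, the predicted class is

\[
\hat{c} = \arg\max_{c} \; P(c) \prod_{t=1}^{k} P\!\left(x_{i_t} \mid \mathrm{hp}_t, c\right),
\]

where each hidden parent $\mathrm{hp}_t$ aggregates all pairwise co-occurrence dependencies of the neighbor $x_{i_t}$:

\[
P\!\left(x_{i_t} \mid \mathrm{hp}_t, c\right) = \sum_{s \neq t} W_{ts}\, P\!\left(x_{i_t} \mid x_{i_s}, c\right),
\qquad W_{ts} \ge 0, \quad \sum_{s \neq t} W_{ts} = 1.
\]

In the standard HNB, the weights $W_{ts}$ are proportional to the conditional mutual information between the two attributes given the class; in this setting they would instead be estimated from past neighbor occurrence and co-occurrence counts observed on the training set.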
