Feature Learning in Feature-Sample Networks Using Multi-Objective Optimization

Data and knowledge representation are fundamental concepts in machine learning, and the quality of the representation directly impacts the performance of a learning model. Feature learning transforms or enhances raw data into structures that learning methods can exploit effectively. In recent years, several works have used complex networks for data representation and analysis; however, no feature learning method has been proposed to enhance this category of representation. Here, we present an unsupervised feature learning mechanism that operates on datasets with binary features. First, the dataset is mapped into a feature-sample network. Then, a multi-objective optimization process selects a set of new vertices to produce an enhanced version of the network. Each new feature is a nonlinear function of a combination of preexisting features, so the process effectively projects the input data into a higher-dimensional space. To solve the optimization problem, we design two metaheuristics, one based on a lexicographic genetic algorithm and the other on the improved strength Pareto evolutionary algorithm (SPEA2). We show that the enhanced network contains more useful information and can be exploited to improve the performance of machine learning methods. The advantages and disadvantages of each optimization strategy are discussed.
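
To make the construction concrete, the sketch below (Python, not part of the original paper) shows one way a binary dataset could be mapped into a feature-sample bipartite network and how a candidate new feature could be formed as a nonlinear (here, logical AND) combination of two preexisting features. The function names, the AND rule, and the toy data are illustrative assumptions only; they do not reproduce the authors' multi-objective optimization pipeline.

```python
# Minimal sketch, assuming a binary data matrix X of shape (n_samples, n_features).
# The feature-sample network links sample i to feature j whenever X[i, j] == 1;
# a candidate composite feature is one possible "nonlinear combination" of
# preexisting features mentioned in the abstract (illustrative assumption).
import numpy as np

def make_feature_sample_edges(X):
    """Return the (sample, feature) edges of the bipartite feature-sample network."""
    samples, features = np.nonzero(X)
    return list(zip(samples, features))

def composite_feature(X, j, k):
    """A new binary feature as a nonlinear function (logical AND) of features j and k."""
    return (X[:, j] & X[:, k]).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(6, 4))        # toy binary dataset
    edges = make_feature_sample_edges(X)        # edges of the original network
    new_col = composite_feature(X, 0, 1)        # one candidate new feature vertex
    X_enhanced = np.column_stack([X, new_col])  # data projected into a higher-dimensional space
    print(len(edges), X_enhanced.shape)
```

In the paper's setting, many such candidate vertices would be generated and a multi-objective metaheuristic (lexicographic GA or SPEA2) would select which of them to add to the network; the snippet only illustrates the representation being enhanced.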
