Enhanced perceptrons using contrastive biclusters

Perceptrons are neuronal devices capable of fully discriminating linearly separable classes. Although they are straightforward to implement and train, their applicability is usually hindered by the non-trivial requirements of real-world classification problems. Several approaches, such as kernel perceptrons, have therefore been conceived to counteract these difficulties. In this paper, we investigate an enhanced perceptron model based on the notion of contrastive biclusters. From this perspective, a good discriminative bicluster comprises a subset of data instances belonging to one class that show high coherence across a subset of features and high differentiation from the nearest instances of the other class under the same features (the latter subset being referred to as its contrastive bicluster). A perceptron is trained on each local subspace associated with a pair of contrastive biclusters, and the model with the highest area under the receiver operating characteristic curve (AUC) is selected as the final classifier. Experiments conducted on a range of data sets, including those related to a difficult biosignal classification problem, show that the proposed variant outperforms standard and kernel perceptrons in most cases in terms of both accuracy and AUC.
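The selection step described above (one perceptron per local subspace, keeping the model with the highest AUC) can be illustrated with a minimal sketch. It is not the procedure of the paper: the biclustering step is omitted and the candidate feature subsets are simply assumed to be given, AUC is computed on the training data for brevity, and all names (`train_perceptron`, `auc`, `select_best_subspace`, the toy data) are hypothetical.

```python
from typing import List, Sequence, Tuple


def train_perceptron(X: List[List[float]], y: Sequence[int],
                     epochs: int = 20, lr: float = 1.0) -> Tuple[List[float], float]:
    """Classic perceptron rule for labels in {-1, +1}; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:  # misclassified (or on the boundary): update
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
                mistakes += 1
        if mistakes == 0:  # converged on the training set
            break
    return w, b


def auc(scores: Sequence[float], y: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_best_subspace(X, y, candidate_subspaces):
    """Train one perceptron per feature subset and keep the one with the highest AUC.

    Assumption: each candidate is a tuple of feature indices, standing in for the
    feature set of a pair of contrastive biclusters (how those are found is not shown).
    """
    best = None
    for features in candidate_subspaces:
        Xs = [[row[j] for j in features] for row in X]  # project onto the subspace
        w, b = train_perceptron(Xs, y)
        scores = [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in Xs]
        value = auc(scores, y)  # training-set AUC, for brevity only
        if best is None or value > best[0]:
            best = (value, features, w, b)
    return best


# Toy data (hypothetical): features 0 and 1 do not separate the classes, feature 2 does.
X_demo = [[0.5, 1.0, 2.0], [0.1, -1.0, 1.5], [0.4, 1.0, -2.0], [0.2, -1.0, -1.5]]
y_demo = [1, 1, -1, -1]
best_auc, best_features, best_w, best_b = select_best_subspace(
    X_demo, y_demo, candidate_subspaces=[(0, 1), (2,)]
)
print(best_auc, best_features)
```

In a realistic setting the AUC would be estimated on held-out data rather than on the training instances, and the candidate subspaces would come from the contrastive biclustering step rather than being listed by hand.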
