EEF: Exponentially Embedded Families With Class-Specific Features for Classification

In this paper, we present a novel classification method based on exponentially embedded families (EEF), in which the probability density function (PDF) of the raw data is estimated from the PDF of the features. Using this PDF construction, we show that the proposed method can use class-specific features, rather than the single feature subset shared by all classes in conventional approaches. As a case study, we apply the proposed EEF classifier to text categorization and derive an optimal Bayesian classification rule with class-specific feature selection based on the Information Gain score. Experiments on real-life data sets show promising performance, which supports the effectiveness of the proposed approach and suggests that it may apply to a wide range of problems.
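To make the idea of class-specific feature selection more concrete, the following minimal sketch ranks terms separately for each class by an Information Gain score. It is an illustration only, not the EEF classifier or the classification rule derived in the paper: it assumes documents are represented as sets of terms, and it assumes the score for a class is the Information Gain between term presence and a one-vs-rest class indicator. The function names (`entropy`, `information_gain`, `class_specific_features`) and the toy data are hypothetical.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term, target):
    """Information Gain between presence of `term` and the indicator
    `label == target` (one-vs-rest; an assumption of this sketch)."""
    n = len(docs)
    positives = sum(1 for y in labels if y == target)
    h_class = entropy([positives, n - positives])

    # Split the documents by whether they contain the term.
    with_term = [y == target for d, y in zip(docs, labels) if term in d]
    without_term = [y == target for d, y in zip(docs, labels) if term not in d]

    # Conditional entropy of the class indicator given term presence.
    h_cond = 0.0
    for subset in (with_term, without_term):
        if subset:
            hits = sum(subset)
            h_cond += len(subset) / n * entropy([hits, len(subset) - hits])
    return h_class - h_cond


def class_specific_features(docs, labels, top_k):
    """Return, for each class, its own top_k terms ranked by Information Gain."""
    vocab = sorted(set().union(*docs))
    selected = {}
    for target in sorted(set(labels)):
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(
            vocab,
            key=lambda t: (-information_gain(docs, labels, t, target), t),
        )
        selected[target] = ranked[:top_k]
    return selected


# Toy corpus (hypothetical): each document is a set of terms.
docs = [
    {"goal", "match"}, {"match", "team"},
    {"stock", "market"}, {"market", "trade"},
    {"chip", "code"}, {"code", "cloud"},
]
labels = ["sport", "sport", "finance", "finance", "tech", "tech"]

features = class_specific_features(docs, labels, top_k=1)
print(features)
```

In this toy corpus each class ends up with a different top-ranked term, which is the contrast with a single feature subset shared by all classes. How the selected features enter the class-conditional PDFs is the subject of the paper and is not modeled here.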
