Sparse learning of maximum likelihood model for optimization of complex loss function

Traditional machine learning methods usually learn a predictive model by minimizing a simple loss function and then evaluate its predictions with a complex performance measure. However, minimizing a simple loss function does not guarantee optimal performance under that measure. In this paper, we study the problem of optimizing the complex performance measure directly to obtain a predictive model. We propose to construct a maximum likelihood model for this problem and, to learn the model parameter, we minimize a complex loss function corresponding to the desired complex performance measure. To optimize the loss function, we approximate the upper bound of the complex loss. We also propose to impose sparsity on the model parameter to obtain a sparse model. We construct an objective by combining the upper bound of the loss function with the sparsity of the model parameter, and we develop an iterative algorithm to minimize it within the fast iterative shrinkage-thresholding algorithm (FISTA) framework. Experiments optimizing three different complex performance measures (F-score, receiver operating characteristic curve, and precision-recall curve break-even point) over three real-world applications (aircraft event recognition for civil aviation safety, intrusion detection in wireless mesh networks, and image classification) show the advantages of the proposed method over state-of-the-art methods.
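The optimization scheme described above (a smooth upper bound of the loss plus a sparsity term, minimized by FISTA) can be illustrated with a minimal sketch. The abstract does not give the form of the upper bound, so the code below substitutes the mean logistic loss as a stand-in smooth term and uses an L1 penalty for sparsity; both choices are assumptions made for illustration. The function names (`soft_threshold`, `smooth_loss_and_grad`, `fista_l1`), the penalty weight `lam`, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def smooth_loss_and_grad(w, X, y):
    """Mean logistic loss and its gradient; labels y are in {-1, +1}.
    ASSUMPTION: stand-in for the paper's upper bound of the complex loss."""
    z = y * (X @ w)
    loss = np.mean(np.logaddexp(0.0, -z))
    sig_neg = np.exp(-np.logaddexp(0.0, z))  # sigmoid(-z), computed stably
    grad = -(X.T @ (y * sig_neg)) / X.shape[0]
    return loss, grad

def fista_l1(X, y, lam=0.1, n_iter=300):
    """FISTA for min_w smooth_loss(w) + lam * ||w||_1."""
    n, d = X.shape
    # Lipschitz constant of the mean logistic loss gradient.
    lipschitz = 0.25 * np.linalg.norm(X, 2) ** 2 / n
    step = 1.0 / lipschitz
    w = np.zeros(d)
    v = w.copy()  # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        _, grad = smooth_loss_and_grad(v, X, y)
        # Gradient step on the smooth part, then shrinkage for the L1 part.
        w_next = soft_threshold(v - step * grad, step * lam)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Momentum (extrapolation) step that distinguishes FISTA from ISTA.
        v = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w

def objective(w, X, y, lam):
    loss, _ = smooth_loss_and_grad(w, X, y)
    return loss + lam * np.sum(np.abs(w))

# Hypothetical synthetic data: 3 informative features out of 20.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -3.0, 1.5]
y = np.where(X @ w_true + 0.1 * rng.standard_normal(200) >= 0, 1.0, -1.0)

lam = 0.1
w_hat = fista_l1(X, y, lam=lam)
print("non-zero coefficients:", np.count_nonzero(w_hat), "of", w_hat.size)
print("objective at zero:", objective(np.zeros(20), X, y, lam))
print("objective at w_hat:", objective(w_hat, X, y, lam))
```

The shrinkage step is what sets small coefficients exactly to zero, which is how a sparsity term of this kind yields a sparse model. Replacing `smooth_loss_and_grad` with a bound tailored to F-score or another measure would leave the rest of the loop unchanged, provided the replacement is smooth and a valid step size is available.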
