An innovative multi-label learning based algorithm for city data computing

Investigating the correlation between example features and example labels is essential to solving classification problems. However, identifying and computing this correlation can be difficult in cases involving high-dimensional multi-label data. Both feature embedding and label embedding have been developed to tackle this challenge: existing embedding methods usually learn a shared subspace for labels and features, reducing the dimension of both simultaneously. By contrast, this paper proposes learning separate subspaces for features and labels by maximizing the independence between the components within each subspace while maximizing the correlation between the two subspaces. The learned independent label components capture the fundamental combinations of labels in multi-label datasets, which in turn helps reveal the correlation between labels. Furthermore, the learned independent feature components yield a compact representation of example features. The connections between the proposed algorithm and existing embedding methods are discussed in detail. Experimental results on real-world multi-label datasets demonstrate the necessity of exploring independent components in multi-label data and confirm the effectiveness of the proposed algorithm.
