Few-shot Learning for Unsupervised Feature Selection

We propose a few-shot learning method for unsupervised feature selection, the task of selecting a subset of relevant features from unlabeled data. Existing methods usually require many instances for feature selection, but sufficient instances are often unavailable in practice. The proposed method selects a subset of relevant features in a target task given only a few unlabeled target instances, by training on unlabeled instances from multiple source tasks. Our model consists of a feature selector and a decoder. Given a few unlabeled instances, the feature selector outputs a subset of relevant features such that the decoder can reconstruct the original features of unseen instances from the selected ones. The feature selector uses Concrete random variables so that features can be selected via gradient descent. To encode task-specific properties from a few unlabeled instances into the model, the Concrete random variables and the decoder are modeled with permutation-invariant neural networks that take the few unlabeled instances as input. The model is trained by minimizing the expected test reconstruction error given a few unlabeled instances, which is computed using datasets from the source tasks. We experimentally demonstrate that the proposed method outperforms existing feature selection methods.
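
The PyTorch sketch below illustrates one way the components described above could fit together: a permutation-invariant (Deep Sets-style) encoder summarizes a few unlabeled support instances into a task embedding, a Concrete (Gumbel-Softmax) selector picks k features conditioned on that embedding, and a decoder reconstructs the original features of held-out query instances; training minimizes the query reconstruction error episodically over source tasks. This is a minimal sketch under assumed design choices; the class names (SupportEncoder, ConcreteSelector, Decoder), network sizes, temperature, and training loop are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the described architecture (assumed design, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SupportEncoder(nn.Module):
    """Permutation-invariant (Deep Sets-style) encoder of a few unlabeled instances."""
    def __init__(self, dim_in, dim_hidden=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.ReLU(),
                                 nn.Linear(dim_hidden, dim_hidden))

    def forward(self, support):            # support: (n_support, dim_in)
        return self.phi(support).mean(0)   # mean pooling -> (dim_hidden,)


class ConcreteSelector(nn.Module):
    """Selects k of dim_in features with Concrete (Gumbel-Softmax) variables
    whose logits are conditioned on the task embedding."""
    def __init__(self, dim_task, dim_in, k, temperature=0.5):
        super().__init__()
        self.k, self.dim_in, self.temperature = k, dim_in, temperature
        self.logit_net = nn.Linear(dim_task, k * dim_in)

    def forward(self, task_emb, x):        # x: (batch, dim_in)
        logits = self.logit_net(task_emb).view(self.k, self.dim_in)
        # One relaxed one-hot vector over the features per selected slot.
        weights = F.gumbel_softmax(logits, tau=self.temperature, dim=-1)
        return x @ weights.t()             # (batch, k) selected features


class Decoder(nn.Module):
    """Reconstructs the original features from the selected ones and the task embedding."""
    def __init__(self, k, dim_task, dim_out, dim_hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(k + dim_task, dim_hidden), nn.ReLU(),
                                 nn.Linear(dim_hidden, dim_out))

    def forward(self, selected, task_emb):
        task = task_emb.expand(selected.size(0), -1)
        return self.net(torch.cat([selected, task], dim=-1))


def meta_train_step(encoder, selector, decoder, optimizer, support, query):
    """One episode on a source task: encode the few unlabeled support instances,
    then minimize the reconstruction error on held-out query instances."""
    task_emb = encoder(support)
    recon = decoder(selector(task_emb, query), task_emb)
    loss = F.mse_loss(recon, query)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    dim_in, k = 20, 5
    encoder = SupportEncoder(dim_in)
    selector = ConcreteSelector(dim_task=64, dim_in=dim_in, k=k)
    decoder = Decoder(k=k, dim_task=64, dim_out=dim_in)
    params = list(encoder.parameters()) + list(selector.parameters()) + list(decoder.parameters())
    optimizer = torch.optim.Adam(params, lr=1e-3)
    # Toy source task: a few unlabeled support instances and unseen query instances.
    support, query = torch.randn(5, dim_in), torch.randn(32, dim_in)
    print(meta_train_step(encoder, selector, decoder, optimizer, support, query))
```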
