Today's Picks

2011 - ICML

Learning with Whom to Share in Multi-task Feature Learning

In multi-task learning (MTL), multiple tasks are learnt jointly. A major assumption of this paradigm is that all the tasks are indeed related, so that joint training is appropriate and beneficial. In this paper, we study the problem of learning shared feature representations among tasks while simultaneously determining "with whom" each task should share. We formulate the problem as a mixed integer program and provide an alternating minimization technique that jointly identifies the grouping structure and the parameters. The algorithm monotonically decreases the objective function and converges to a local optimum. Compared to the standard MTL paradigm in which all tasks form a single group, our algorithm achieves statistically significant improvements on three of the four datasets studied. We also demonstrate its advantage over other task grouping techniques investigated in the literature.
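The alternating scheme the abstract describes can be illustrated with a small sketch. The NumPy code below is a minimal, hypothetical rendering of the idea, not the paper's exact mixed-integer formulation: fix the task-to-group assignment and fit one shared model per group, then fix the group models and reassign each task to the group that fits it best. All function and variable names are illustrative.

```python
# Minimal sketch of alternating minimization for task grouping.
# Assumption: each group shares one ridge-regression model; the paper's
# actual formulation (shared feature subspaces, MIP) is richer.
import numpy as np

def fit_group_model(Xs, ys, lam=1.0):
    """Ridge fit on the pooled data of all tasks currently in one group."""
    X = np.vstack(Xs)
    y = np.concatenate(ys)
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def task_loss(w, X, y):
    r = X @ w - y
    return float(r @ r) / len(y)

def alternating_grouping(Xs, ys, n_groups=2, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    assign = rng.integers(n_groups, size=len(Xs))   # random initial grouping
    models = {}
    for _ in range(n_iters):
        # Step 1: fix assignments, fit one shared model per non-empty group.
        models = {}
        for g in range(n_groups):
            idx = np.flatnonzero(assign == g)
            if len(idx) > 0:
                models[g] = fit_group_model([Xs[i] for i in idx],
                                            [ys[i] for i in idx])
        # Step 2: fix models, reassign each task to its best-fitting group.
        new_assign = np.array([
            min(models, key=lambda g: task_loss(models[g], X, y))
            for X, y in zip(Xs, ys)
        ])
        if np.array_equal(new_assign, assign):      # local optimum reached
            break
        assign = new_assign
    return assign, models
```

Each of the two steps can only decrease the total fitting objective, which is the intuition behind the monotone decrease and local convergence noted in the abstract.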

2012 - KDD: Proceedings of the International Conference on Knowledge Discovery & Data Mining

Robust multi-task feature learning

Multi-task learning (MTL) aims to improve the performance of multiple related tasks by exploiting the intrinsic relationships among them. Recently, multi-task feature learning algorithms have received increasing attention and they have been successfully applied to many applications involving high dimensional data. However, they assume that all tasks share a common set of features, which is too restrictive and may not hold in real-world applications, since outlier tasks often exist. In this paper, we propose a Robust Multi-Task Feature Learning algorithm (rMTFL) which simultaneously captures a common set of features among relevant tasks and identifies outlier tasks. Specifically, we decompose the weight (model) matrix for all tasks into two components. We impose the well-known group Lasso penalty on row groups of the first component for capturing the shared features among relevant tasks. To simultaneously identify the outlier tasks, we impose the same group Lasso penalty but on column groups of the second component. We propose to employ the accelerated gradient descent to efficiently solve the optimization problem in rMTFL, and show that the proposed algorithm is scalable to large-size problems. In addition, we provide a detailed theoretical analysis on the proposed rMTFL formulation. Specifically, we present a theoretical bound to measure how well our proposed rMTFL approximates the true evaluation, and provide bounds to measure the error between the estimated weights of rMTFL and the underlying true weights. Moreover, by assuming that the underlying true weights are above the noise level, we present a sound theoretical result to show how to obtain the underlying true shared features and outlier tasks (sparsity patterns). Empirical studies on both synthetic and real-world data demonstrate that our proposed rMTFL is capable of simultaneously capturing shared features among tasks and identifying outlier tasks.
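As a rough illustration of the decomposition idea, here is a minimal, non-accelerated proximal-gradient sketch, assuming for simplicity that all tasks share one design matrix X; the paper works with per-task data and an accelerated solver, and all names below are illustrative. The weight matrix is split as W = P + Q, with a group Lasso penalty on the rows of P (shared features) and on the columns of Q (outlier tasks).

```python
# Minimal proximal-gradient sketch of the rMTFL-style decomposition W = P + Q.
# Simplifying assumption: one shared design matrix X (n x d), targets Y (n x T).
import numpy as np

def group_soft_threshold(M, tau, axis):
    """Shrink each row (axis=1) or column (axis=0) of M toward zero in l2 norm."""
    norms = np.linalg.norm(M, axis=axis, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return M * scale

def rmtfl_sketch(X, Y, lam1, lam2, n_iters=500):
    n, d = X.shape
    T = Y.shape[1]
    # Gradient of 0.5*||X(P+Q)-Y||_F^2 in the stacked variable (P, Q) is
    # 2-Lipschitz in ||X||_2^2, hence the factor of 2 in the step size.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    P = np.zeros((d, T))
    Q = np.zeros((d, T))
    for _ in range(n_iters):
        G = X.T @ (X @ (P + Q) - Y)                  # shared gradient term
        P = group_soft_threshold(P - step * G, step * lam1, axis=1)  # row groups
        Q = group_soft_threshold(Q - step * G, step * lam2, axis=0)  # column groups
    return P, Q                                      # W = P + Q
```

A zero row of P means the corresponding feature is dropped from the shared representation, while a nonzero column of Q flags the corresponding task as an outlier.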

2012 - Advances in neural information processing systems

Multi-stage multi-task feature learning

Multi-task sparse feature learning aims to improve the generalization performance by exploiting the shared features among tasks. It has been successfully applied to many applications including computer vision and biomedical informatics. Most of the existing multi-task sparse feature learning algorithms are formulated as a convex sparse regularization problem, which is usually suboptimal due to its looseness in approximating an l0-type regularizer. In this paper, we propose a non-convex formulation for multi-task sparse feature learning based on a novel regularizer. To solve the non-convex optimization problem, we propose a Multi-Stage Multi-Task Feature Learning (MSMTFL) algorithm. Moreover, we present a detailed theoretical analysis showing that MSMTFL achieves a better parameter estimation error bound than the convex formulation. Empirical studies on both synthetic and real-world data sets demonstrate the effectiveness of MSMTFL in comparison with state-of-the-art multi-task sparse feature learning algorithms.
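One common way to instantiate a multi-stage scheme of this kind is with a capped-l1 regularizer, where each stage solves a reweighted l1 problem and feature rows whose weight already exceeds a threshold stop being penalized. The sketch below is a hypothetical NumPy rendering under that assumption, not necessarily the paper's exact regularizer or solver; all names are illustrative.

```python
# Multi-stage sketch for a capped-l1 regularizer sum_j min(||w_j||_1, theta),
# where w_j is the j-th feature row across tasks. Assumed, not the paper's
# verbatim formulation.
import numpy as np

def soft_threshold(M, tau):
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def weighted_l1_solver(X, Y, row_weights, n_iters=300):
    """Proximal gradient for 0.5*||XW-Y||_F^2 + sum_j row_weights[j]*||w_j||_1."""
    d = X.shape[1]
    W = np.zeros((d, Y.shape[1]))
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iters):
        G = X.T @ (X @ W - Y)
        W = soft_threshold(W - step * G, step * row_weights[:, None])
    return W

def msmtfl_sketch(X, Y, lam, theta, n_stages=5):
    d = X.shape[1]
    weights = np.full(d, lam)        # stage 1 = the ordinary convex l1 problem
    W = np.zeros((d, Y.shape[1]))
    for _ in range(n_stages):
        W = weighted_l1_solver(X, Y, weights)
        # Rows whose l1 norm already exceeds theta stop being penalized,
        # reducing the bias the convex relaxation puts on large coefficients.
        weights = lam * (np.abs(W).sum(axis=1) < theta).astype(float)
    return W
```

The first stage coincides with the convex formulation, which is why a multi-stage method can only refine, never worsen, that starting point on the training objective.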

2010 - NIPS

Probabilistic Multi-Task Feature Selection

Recently, some variants of the l1 norm, particularly matrix norms such as the l1,2 and l1,∞ norms, have been widely used in multi-task learning, compressed sensing and other related areas to enforce sparsity via joint regularization. In this paper, we unify the l1,2 and l1,∞ norms by considering a family of l1,q norms for 1 < q < ∞ and study the problem of determining the most appropriate sparsity enforcing norm to use in the context of multi-task feature selection. Using the generalized normal distribution, we provide a probabilistic interpretation of the general multi-task feature selection problem using the l1,q norm. Based on this probabilistic interpretation, we develop a probabilistic model using the noninformative Jeffreys prior. We also extend the model to learn and exploit more general types of pairwise relationships between tasks. For both versions of the model, we devise expectation-maximization (EM) algorithms to learn all model parameters, including q, automatically. Experiments have been conducted on two cancer classification applications using microarray gene expression data.
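For reference, the l1,q family interpolating between these norms is standard and can be written as below, together with the generalized normal density that gives it a probabilistic reading; the paper's exact parameterization may differ.

```latex
% The l_{1,q} matrix norm on a weight matrix W with rows w^j (one row per
% feature, one column per task); q = 2 recovers l_{1,2}, and q -> infinity
% recovers l_{1,inf}:
\|W\|_{1,q} \;=\; \sum_{j=1}^{d} \|w^j\|_q
            \;=\; \sum_{j=1}^{d} \Big( \sum_{t=1}^{T} |w_{jt}|^q \Big)^{1/q}.
% The generalized normal density, whose shape parameter plays the role of q
% in the probabilistic interpretation:
p(x \mid \mu, \alpha, q) \;=\; \frac{q}{2\alpha\,\Gamma(1/q)}
  \exp\!\Big( -\frac{|x-\mu|^q}{\alpha^q} \Big).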

2014 - Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining

Efficient multi-task feature learning with calibration

Multi-task feature learning has been proposed to improve generalization performance by learning the features shared among multiple related tasks, and it has been successfully applied to many real-world problems in machine learning, data mining, computer vision and bioinformatics. Most existing multi-task feature learning models simply assume a common noise level for all tasks, which may not be the case in real applications. Recently, a Calibrated Multivariate Regression (CMR) model has been proposed, which calibrates different tasks with respect to their noise levels and achieves superior prediction performance over its non-calibrated counterpart. A major challenge is how to solve the CMR model efficiently, as it is formulated as a composite optimization problem consisting of two non-smooth terms. In this paper, we propose a variant of the calibrated multi-task feature learning formulation by including a squared norm regularizer. We show that the dual problem of the proposed formulation is a smooth optimization problem with a piecewise sphere constraint. The simplicity of the dual problem enables us to develop fast dual optimization algorithms with low per-iteration cost. We also provide a detailed convergence analysis for the proposed dual optimization algorithm. Empirical studies demonstrate that the dual optimization algorithm converges quickly and is much more efficient than the primal optimization algorithm. Moreover, the calibrated multi-task feature learning algorithms with and without the squared norm regularizer achieve similar prediction performance, and both outperform the non-calibrated ones. Thus, the proposed variant not only enables us to develop fast optimization algorithms, but also retains the superior prediction performance of calibrated multi-task feature learning over the non-calibrated approach.
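A plausible reading of the formulation, written as a math sketch (the exact norms and constants may differ from the paper): the unsquared per-task l2 loss is what calibrates each task's noise level, and the added squared-norm term is what smooths the dual.

```latex
% Sketch of a calibrated objective over per-task data (X_t, y_t), with W
% collecting the task weight vectors w_t as columns and w^j its rows.
% The rho/2 term is the variant's addition that makes the dual smooth.
\min_{W} \; \sum_{t=1}^{T} \|X_t w_t - y_t\|_2
  \;+\; \lambda \|W\|_{2,1}
  \;+\; \frac{\rho}{2}\, \|W\|_F^2,
\qquad
\|W\|_{2,1} \;=\; \sum_{j=1}^{d} \|w^j\|_2 .
```

Because the per-task loss is not squared, each residual norm implicitly scales by the task's noise level, which is the calibration effect the abstract refers to.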

Paper Keywords

multi-task learning, multi-task feature learning, multi-task sparse feature learning, multi-task deep learning, deep multi-task learning, multi-task convolutional neural network (multi-task CNN), multi-task multi-view learning, multi-task regression, multi-task allocation, multi-task network, multi-task learning approach, multi-task learning model, multi-task learning method, multi-task learning algorithm, multiple related tasks, feature learning, feature learning algorithm, feature selection, sparse learning, group lasso, loss function, deep learning, neural network, deep neural network, convolutional neural network, named entity recognition, human action, pose-invariant face, mental health, differential equation, stochastic differential equation, backward doubly stochastic differential equation, geographic information system, open source software, open source GIS software, open source tool, free and open source software, software development, software developer, software movement, distributed generation, power allocation, optical network, elastic optical network, network reconfiguration, optimal reconfiguration, capacity allocation, spare capacity allocation, survivable routing, shared backup path protection, low-rate wireless personal area network, Hubble Space Telescope, fine guidance sensor, big bang-big crunch algorithm, high water pressure, contaminated soils