Recent work on deep learning has rekindled interest in auto-encoder algorithms. Current efforts have been directed towards effective training of auto-encoder architectures with a large number of coding units. Here, we propose a learning algorithm for auto-encoders based on a rate-distortion objective that minimizes the mutual information between the inputs and the outputs of the auto-encoder subject to a fidelity constraint. The goal is to learn a representation that is minimally committed to the input data, but that is rich enough to reconstruct the inputs up to a certain level of distortion. Minimizing the mutual information acts as a regularization term, whereas the fidelity constraint can be understood as a risk functional in the conventional statistical learning setting. The proposed algorithm uses a recently introduced measure of entropy based on infinitely divisible matrices that avoids the plug-in estimation of densities. Experiments using over-complete bases show that the rate-distortion auto-encoders can learn a regularized input-output mapping in an implicit manner.
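As a rough illustration of the objective described above, the following sketch estimates mutual information directly from Gram matrices, with no density estimation step, and combines it with a squared-error distortion term. Several choices here are assumptions not stated in the abstract: the Rényi-type functional S_alpha(A) = log2(tr(A^alpha)) / (1 - alpha) on a unit-trace Gram matrix, the Hadamard product for the joint term, the Gaussian kernel and its width `sigma`, the Lagrangian weight `mu`, and all function names, which are hypothetical.

```python
import numpy as np


def gram_matrix(x, sigma=1.0):
    """Gaussian Gram matrix of the rows of x, normalized to unit trace.

    The Gaussian kernel and sigma are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared distances, clipped to guard against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return k / np.trace(k)


def matrix_entropy(a, alpha=2.0):
    """Assumed functional: S_alpha(A) = log2(tr(A^alpha)) / (1 - alpha)."""
    # tr(A^alpha) is the sum of the eigenvalues raised to alpha.
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)
    return np.log2(np.sum(eig ** alpha)) / (1.0 - alpha)


def matrix_mutual_information(a, b, alpha=2.0):
    """S(A) + S(B) - S(A o B / tr(A o B)), with o the Hadamard product."""
    ab = a * b
    ab = ab / np.trace(ab)
    return (matrix_entropy(a, alpha) + matrix_entropy(b, alpha)
            - matrix_entropy(ab, alpha))


def rate_distortion_loss(x, x_hat, mu=1.0, sigma=1.0, alpha=2.0):
    """Lagrangian form (an assumption): rate + mu * mean squared error."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    distortion = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    rate = matrix_mutual_information(
        gram_matrix(x, sigma), gram_matrix(x_hat, sigma), alpha)
    return rate + mu * distortion
```

Under these assumptions, a reconstruction that collapses every input to the same point carries zero estimated information about the inputs but pays the full distortion penalty, while a perfect reconstruction pays no distortion and the full rate. The weight `mu` trades one against the other.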