We introduce a novel framework for simultaneous structure and parameter learning in hidden-variable conditional probability models, based on an entropic prior and a solution for its maximum a posteriori (MAP) estimator. The MAP estimate minimizes uncertainty in all respects: the cross-entropy between model and data, the entropy of the model, and the entropy of the data's descriptive statistics. Iterative estimation extinguishes weakly supported parameters, compressing and sparsifying the model. Trimming operators accelerate this process by removing excess parameters and, unlike most pruning schemes, guarantee an increase in posterior probability. Entropic estimation takes an overcomplete random model and simplifies it, inducing the structure of relations between hidden and observed variables. Applied to hidden Markov models (HMMs), it finds a concise finite-state machine representing the hidden structure of a signal. We entropically model music, handwriting, and video time-series, and show that the resulting models are highly concise, structured, predictive, and interpretable: surviving states tend to be highly correlated with meaningful partitions of the data, while surviving transitions provide a low-perplexity model of the signal dynamics.
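As a rough illustration of how an entropic prior pulls an estimate toward lower entropy, the following sketch compares a maximum-likelihood estimate with a MAP estimate for a two-outcome distribution. It rests on assumptions not stated in the abstract: the prior is taken to be proportional to exp(-H(θ)), and a brute-force grid search stands in for the MAP solution the abstract mentions. The function names and the example counts are hypothetical.

```python
import math


def entropy(theta):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in theta if p > 0.0)


def log_posterior(theta, counts):
    """Unnormalized log posterior: log-likelihood plus log entropic prior.

    Assumption: the entropic prior is proportional to exp(-H(theta)),
    so its log contributes -H(theta) up to an additive constant.
    """
    log_lik = 0.0
    for c, p in zip(counts, theta):
        if c > 0.0:
            if p <= 0.0:
                # An observed outcome was assigned zero probability.
                return float("-inf")
            log_lik += c * math.log(p)
    return log_lik - entropy(theta)


def ml_estimate(counts):
    """Maximum-likelihood estimate: normalized counts."""
    total = sum(counts)
    return [c / total for c in counts]


def map_estimate_binary(counts, steps=10000):
    """Grid search for the two-outcome MAP estimate.

    Illustrative stand-in only; it is not the estimator from the paper.
    """
    best_theta, best_score = None, float("-inf")
    for k in range(1, steps):
        p = k / steps
        theta = [p, 1.0 - p]
        score = log_posterior(theta, counts)
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta


if __name__ == "__main__":
    counts = [3.0, 1.0]  # hypothetical evidence counts
    ml = ml_estimate(counts)
    mp = map_estimate_binary(counts)
    print("ML :", ml, "entropy:", entropy(ml))
    print("MAP:", mp, "entropy:", entropy(mp))
```

Under these assumptions the MAP estimate is more skewed than the normalized counts, so its entropy is lower. Because the likelihood term scales with the counts while the prior term does not, the gap between the two estimates should shrink as evidence accumulates.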