A Fast Normalized Maximum Likelihood Algorithm for Multinomial Data

The stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as model selection and data clustering. In the case of multinomial data, computing the modern version of stochastic complexity, defined by the Normalized Maximum Likelihood (NML) criterion, requires computing a sum with an exponential number of terms. Furthermore, in order to apply NML in practice, one often needs to compute a whole table of these exponential sums. In our previous work, we were able to compute this table with a recursive algorithm. The purpose of this paper is to significantly improve the time complexity of that algorithm. The techniques used here are based on the discrete Fourier transform and the convolution theorem.
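To make the role of the convolution theorem concrete, the following is a minimal sketch, not the algorithm of the paper itself. It assumes the standard form of the multinomial NML normalizing sum, C(K, n) = sum over count vectors (h_1, ..., h_K) with h_1 + ... + h_K = n of n!/(h_1! ... h_K!) * prod_k (h_k/n)^(h_k), and the recursion C(K1+K2, n) = sum_{r1+r2=n} n!/(r1! r2!) (r1/n)^r1 (r2/n)^r2 C(K1, r1) C(K2, r2). After rescaling each table by r^r/r!, that recursion is a plain convolution, so it can be evaluated with a DFT. All function names (`fft`, `convolve`, `scaled`, `combine`, `brute_force`) are hypothetical, and the floating-point rescaling is only suitable for small n.

```python
import cmath
from math import comb, factorial

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def convolve(x, y):
    """Linear convolution via the convolution theorem (zero-padded FFT)."""
    out_len = len(x) + len(y) - 1
    size = 1
    while size < out_len:
        size *= 2
    fx = fft(list(x) + [0.0] * (size - len(x)))
    fy = fft(list(y) + [0.0] * (size - len(y)))
    res = fft([p * q for p, q in zip(fx, fy)], invert=True)
    return [v.real / size for v in res[:out_len]]

def scaled(c):
    """Rescale a table: f(r) = C(r) * r^r / r!  (Python gives 0**0 == 1)."""
    return [c[r] * (r ** r / factorial(r)) for r in range(len(c))]

def combine(c1, c2):
    """Assumed recursion: table of C(K1+K2, n) from tables of C(K1, n), C(K2, n)."""
    conv = convolve(scaled(c1), scaled(c2))
    return [conv[n] * (factorial(n) / n ** n) for n in range(len(c1))]

def brute_force(k, n):
    """Direct evaluation of the assumed sum over all count vectors."""
    def rec(parts_left, remaining):
        if parts_left == 1:
            return (remaining / n) ** remaining
        return sum(
            comb(remaining, h) * (h / n) ** h * rec(parts_left - 1, remaining - h)
            for h in range(remaining + 1)
        )
    return rec(k, n) if n else 1.0

N = 8
c1 = [1.0] * (N + 1)      # C(1, n) = 1 for every n
c2 = combine(c1, c1)      # table for K = 2
c4 = combine(c2, c2)      # table for K = 4
```

In this sketch the tables `c2` and `c4` can be checked against `brute_force` for small n; the direct sum grows quickly with K and n, while each `combine` call costs one FFT-based convolution over the whole table.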
