Occlusion-based estimation of independent multinomial random variables using occurrence and sequential information

Abstract: This paper deals with the relatively new field of sequence-based estimation, in which the goal is to estimate the parameters of a distribution by utilizing both the information in the observations and the information in their sequence of appearance. Traditionally, the Maximum Likelihood (ML) and Bayesian estimation paradigms assume that the data from which the parameters are to be estimated is known, and they treat it as a set rather than as a sequence. Our position is that these methods ignore, and thus discard, valuable sequence-based information, and our intention is to obtain ML estimates by "extracting" the information contained in the observations when they are perceived as a sequence. Oommen et al. (November 2007) introduced the concept of Sequence Based Estimation (SBE) for the Binomial distribution, deriving the corresponding ML estimates when the samples are taken two-at-a-time and then extending them to the cases in which they are processed three-at-a-time, four-at-a-time, etc. The current paper generalizes these results to the multinomial case. The strategy we invoke involves a novel phenomenon called "Occlusion" that has not previously been reported in the field of estimation. The phenomenon can be described as follows: by occluding (hiding or concealing) certain observations, we map the estimation problem onto a lower-dimensional space, i.e., onto a binomial space. Once these occluded SBEs have been computed, we demonstrate how the overall Multinomial SBE (MSBE) can be obtained by mapping several lower-dimensional estimates, all of which are bound by rigid probability constraints, onto the original higher-dimensional space. In each case, we formally prove and experimentally demonstrate the convergence of the corresponding estimates. The estimation methods proposed here have also been tested on real-life datasets from the UCI repository (Frank and Asuncion, 2013), and the accuracies obtained have been remarkable. We also discuss how various MSBEs can be fused to yield a superior MSBE, and present some potential applications of MSBEs. Our new estimates have great potential for practitioners, especially when the cardinality of the observation set is small.
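
The occlusion idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's derivation: it assumes independent observations, uses only the simplest two-at-a-time estimator (the square root of the observed fraction of repeated-symbol pairs, which follows from P(x, x) = q^2 under independence), and recombines the occluded estimates through odds against a single reference symbol. The function names `occlude`, `pair_estimate` and `estimate_multinomial`, the reference-symbol scheme, and the example probabilities are all hypothetical choices made for illustration.

```python
import math
import random


def occlude(seq, keep):
    """Hide every observation whose symbol is not in `keep`, preserving order."""
    return [x for x in seq if x in keep]


def pair_estimate(binary_seq, symbol):
    """Estimate q = P(symbol) from overlapping pairs of a two-symbol sequence.

    Under independence P(symbol, symbol) = q**2, so q is estimated as the
    square root of the observed fraction of <symbol, symbol> pairs.
    """
    n_pairs = len(binary_seq) - 1
    if n_pairs < 1:
        raise ValueError("need at least two observations")
    hits = sum(1 for a, b in zip(binary_seq, binary_seq[1:])
               if a == symbol and b == symbol)
    return math.sqrt(hits / n_pairs)


def estimate_multinomial(seq, symbols, ref):
    """Combine occluded binomial estimates into one multinomial estimate.

    Assumption (not from the paper): every symbol is paired with one
    reference symbol `ref`, and the estimates are normalised to sum to 1.
    """
    odds = {ref: 1.0}
    for s in symbols:
        if s == ref:
            continue
        # After occlusion only s and ref remain, so q estimates p_s / (p_s + p_ref).
        q = pair_estimate(occlude(seq, {s, ref}), s)
        if q >= 1.0:
            raise ValueError("reference symbol too rare to form odds")
        odds[s] = q / (1.0 - q)  # estimates p_s / p_ref
    total = sum(odds.values())
    return {s: o / total for s, o in odds.items()}


# Synthetic data with hypothetical probabilities, seeded for repeatability.
rng = random.Random(0)
true_p = {"a": 0.5, "b": 0.3, "c": 0.2}
data = rng.choices(list(true_p), weights=list(true_p.values()), k=20000)
estimate = estimate_multinomial(data, list(true_p), ref="a")
```

Normalising the odds is one simple way to enforce the constraint that the probabilities sum to one; the paper's own mapping from the binomial estimates back to the multinomial space may differ.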

[1] B. John Oommen, et al. On the Foundations of Multinomial Sequence Based Estimation, 2016, ICCCI.

[2] E. Kreyszig, et al. Advanced Engineering Mathematics, 1974.

[3] B. John Oommen, et al. Multinomial Sequence Based Estimation Using Contiguous Subsequences of Length Three, 2016, ICIAR.

[4] Ralf Herbrich, et al. Learning Kernel Classifiers: Theory and Algorithms, 2001.

[5] Ludmila I. Kuncheva, et al. A Theoretical Study on Six Classifier Fusion Strategies, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[6] Sheldon M. Ross. Introduction to probability models, 1998.

[7] J. Susan Milton, et al. Introduction to Probability and Statistics: Principles and Applications for Engineering and the Computing Sciences, 1990.

[8] David G. Stork, et al. Pattern Classification, 1973.

[9] G. Casella, et al. Statistical Inference, 2003, Encyclopedia of Social Network Analysis and Mining.

[10] Paul H. Garthwaite, et al. Statistical Inference, 2002.

[11] Bangjun Lei, et al. Classification, Parameter Estimation and State Estimation: An Engineering Approach Using MATLAB, 2nd Edition, 2017.

[12] S. Goldberg. Probability; an Introduction, 1961.

[13] Horst Bunke. Structural and Syntactic Pattern Recognition, 1993, Handbook of Pattern Recognition and Computer Vision.

[14] Andrew R. Webb, et al. Statistical Pattern Recognition, 1999.

[15] Abraham Kandel, et al. Introduction to Pattern Recognition: Statistical, Structural, Neural and Fuzzy Logic Approaches, 1999.

[16] Jiri Matas, et al. On Combining Classifiers, 1998, IEEE Trans. Pattern Anal. Mach. Intell.

[17] Horst Bunke, et al. Advances In Structural And Syntactic Pattern Recognition, 1993.

[18] Mario Vento, et al. A Multi-expert Approach for Shot Classification in News Videos, 2004, ICIAR.

[19] P. Bickel, et al. Mathematical Statistics: Basic Ideas and Selected Topics, 1977.

[20] James C. Bezdek, et al. Decision templates for multiple classifier fusion: an experimental comparison, 2001, Pattern Recognit.

[21] Keinosuke Fukunaga, et al. Introduction to Statistical Pattern Recognition, 1972.

[22] R. C. Sprinthall. Basic Statistical Analysis, 1982.

[23] B. John Oommen, et al. On the estimation of independent binomial random variables using occurrence and sequential information, 2007, Pattern Recognit.

[24] José L. Núñez-Yáñez, et al. A configurable statistical lossless compression core based on variable order Markov modeling and arithmetic coding, 2005, IEEE Transactions on Computers.