Bridges between Deterministic and Probabilistic Models for Binary Data

For the analysis of binary data, various deterministic models have been proposed, which are generally simpler to fit and easier to understand than probabilistic models. We claim that corresponding to any deterministic model is an implicit stochastic model in which the deterministic model fits imperfectly, with errors occurring at random. In the context of binary data, we consider two error models: in the first, all predictions are equally likely to be in error; in the second, the probability of error depends on the model prediction. We show how to fit these models using a stochastic modification of deterministic optimization schemes. The advantages of fitting the stochastic models explicitly (rather than implicitly, by simply fitting a deterministic model and accepting the occurrence of errors) include quantification of uncertainty in the deterministic model's parameter estimates, better estimation of the true model error rate, and the ability to check the fit of the model nontrivially. We illustrate with a simple theoretical example of item response data and with empirical examples from archaeology and the psychology of choice.
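
As a rough illustration of the two error models described above, the following Python sketch computes the error rates for a fixed set of deterministic predictions. It assumes a Guttman-style item response rule (a person answers an item correctly when ability is at least the item difficulty). That rule, the function names `guttman_pred` and `error_rates`, and the toy data are all hypothetical choices made for this example, not taken from the paper. It also computes only the observed error proportions given known predictions, not the stochastic fitting scheme the paper proposes.

```python
def guttman_pred(abilities, difficulties):
    """Deterministic prediction (assumed rule): 1 if ability >= difficulty, else 0."""
    return [[1 if a >= d else 0 for d in difficulties] for a in abilities]


def error_rates(data, pred):
    """Observed error proportions under the two error models.

    Returns (pi, pi0, pi1):
      pi  - single error rate shared by all predictions (first model)
      pi0 - error rate among cells predicted 0 (second model)
      pi1 - error rate among cells predicted 1 (second model)
    """
    n = {0: 0, 1: 0}    # number of cells with each predicted value
    err = {0: 0, 1: 0}  # number of discrepancies for each predicted value
    for row_y, row_p in zip(data, pred):
        for y, p in zip(row_y, row_p):
            n[p] += 1
            err[p] += int(y != p)
    pi = (err[0] + err[1]) / (n[0] + n[1])
    pi0 = err[0] / n[0] if n[0] else float("nan")
    pi1 = err[1] / n[1] if n[1] else float("nan")
    return pi, pi0, pi1


# Toy data (hypothetical): 4 persons, 3 items.
pred = guttman_pred(abilities=[0, 1, 2, 3], difficulties=[1, 2, 3])
data = [
    [0, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
]
pi, pi0, pi1 = error_rates(data, pred)
print(pi, pi0, pi1)
```

When `pi0` and `pi1` differ noticeably, a single shared error rate hides that asymmetry, which is the distinction the second error model is meant to capture.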
