Maximum-Likelihood Estimation of the Discrete Coefficient of Determination in Stochastic Boolean Systems

The discrete Coefficient of Determination (CoD) has become a key component of inference methods for stochastic Boolean models. We develop a parametric maximum-likelihood (ML) method for inferring the discrete CoD in static Boolean systems and in dynamical Boolean systems in the steady state. Using analytical and numerical approaches, we compare the performance of the parametric ML approach with that of common nonparametric alternatives for CoD estimation. The comparison shows that the parametric approach has the smallest bias, variance, and root-mean-square (RMS) error, provided that the system noise level is not too high. Next, we consider the application of the proposed estimation approach to the problem of system identification, where only partial knowledge about the system is available. Inference procedures are proposed for both the static and dynamical cases, and their performance in logic gate and wiring identification is assessed through numerical experiments. The results indicate that identification rates converge to 100% as sample size increases, and that convergence is much faster when more prior knowledge is available. For wiring identification, the parametric ML approach is compared with the nonparametric approaches and yields superior identification rates, although as the amount of prior knowledge is reduced, its performance approaches that of the nonparametric ML estimator, which was generally the best nonparametric approach in our experiments.
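
For orientation, the discrete CoD is commonly defined as the relative reduction in prediction error, CoD = (e0 - e) / e0, where e0 is the error of the best constant predictor of a binary target Y and e is the error of the best predictor of Y given the Boolean predictors X. The following is a minimal sketch of a plug-in (resubstitution-style) nonparametric estimate of that quantity from samples. It is not the parametric ML method proposed here. The function name `resubstitution_cod` and the convention of returning 0 when the estimated e0 is zero are assumptions made for illustration.

```python
from collections import Counter


def resubstitution_cod(xs, ys):
    """Plug-in estimate of the discrete CoD from binary samples.

    xs: sequence of predictor tuples with 0/1 entries.
    ys: sequence of 0/1 targets, same length as xs.
    Illustrative sketch only; returns 0.0 when the estimated
    no-predictor error is zero (an assumed convention).
    """
    n = len(ys)
    if n == 0 or n != len(xs):
        raise ValueError("xs and ys must be non-empty and of equal length")

    # Empirical error of the best constant predictor of Y.
    ones = sum(ys)
    eps0 = min(ones, n - ones) / n

    # Empirical error of the best predictor of Y given X:
    # for each observed pattern, the minority label count is misclassified.
    counts = Counter((tuple(x), y) for x, y in zip(xs, ys))
    patterns = {tuple(x) for x in xs}
    eps = sum(min(counts[(p, 0)], counts[(p, 1)]) for p in patterns) / n

    if eps0 == 0:
        return 0.0  # assumed convention for the degenerate case
    return 1.0 - eps / eps0


if __name__ == "__main__":
    # Noise-free XOR: the two predictors determine the target exactly.
    xs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    ys = [0, 1, 1, 0]
    print(resubstitution_cod(xs, ys))
```

On noise-free data such as the XOR example, the estimated predictor error is zero, so the estimate equals 1. With noisy samples the estimate falls between 0 and 1 and, being a resubstitution quantity, tends to be optimistically biased at small sample sizes.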
