Bi-linear matrix-variate analyses, integrative hypothesis tests, and case-control studies

This paper has three aims. First, we suggest a Kullback-Leibler formulation for developing statistics and making discriminative projections in case-control studies; on this basis, existing typical methods are revisited and then extended to matrix-variate counterparts. Second, we propose a bi-linear matrix form, based on which multivariate discriminative analysis and logistic, Cox, and linear mixed regression are extended to their matrix-variate counterparts. Third, we systematically address the necessity, feasibility, and methodology of integrative hypothesis tests (IHT), drawing on the complementarity of the model-based test and the boundary-based test (BBT) in the data (D)-space, statistics (S)-space, and probability (P)-space. We elaborate four IHT components (modelling, comparison, classification, and assurance) and summarise four IHT types in the D-space. We then extend existing efforts on multivariate tests to BBTs in the S-space. In particular, we extend the classic univariate one-tail z-test to multivariate versions, which are then applied to a multivariate sample-pairing delta (SPD) test for detecting a collective inclining dominance. We also propose an SPD discriminative analysis that extends this SPD test. Moreover, we propose a multivariate bi-test that tests the classic null and also a null about the inference reliability due to test-space complexity, including a further development of Fisher combination. Finally, we suggest possible applications to gene expression biomarkers and exome-sequencing-based joint single-nucleotide variant (SNV) detection.
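As a rough illustration of two building blocks named above, the sketch below shows a generic bi-linear projection of a matrix-variate sample, u^T X v, and the classic Fisher combination of independent p-values. It is not the paper's formulation: the function names `bilinear_score` and `fisher_combine`, the weight vectors `u` and `v`, and the choice of the textbook Fisher statistic (rather than the further development the paper proposes) are all assumptions made for illustration.

```python
import math


def bilinear_score(X, u, v):
    """Project a matrix sample X (rows x cols) to the scalar u^T X v.

    u weights the rows and v weights the columns (hypothetical names).
    """
    return sum(
        u[i] * X[i][j] * v[j]
        for i in range(len(u))
        for j in range(len(v))
    )


def fisher_combine(p_values):
    """Classic Fisher combination of k independent p-values.

    Under the joint null, -2 * sum(log p_i) follows a chi-square
    distribution with 2k degrees of freedom. For even degrees of
    freedom the upper tail has the closed form
        exp(-x/2) * sum_{i<k} (x/2)^i / i!
    so no external library is needed.
    """
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    tail = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return stat, tail


if __name__ == "__main__":
    X = [[1.0, 2.0], [3.0, 4.0]]
    print(bilinear_score(X, [1.0, 0.0], [0.0, 1.0]))  # picks out X[0][1]
    print(fisher_combine([0.04, 0.20, 0.01]))
```

With a single p-value the combined p-value equals the input, which is a quick sanity check on the closed-form tail. The independence assumption matters: correlated tests would need a different combination rule.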
