Deep Neural Networks Guided Ensemble Learning for Point Estimation in Finite Samples

As one of the most important estimators in classical statistics, the uniformly minimum variance unbiased estimator (UMVUE) has been adopted for point estimation in many statistical studies, especially for small-sample problems. Outside the typical settings of the exponential family, however, it is usually challenging to prove the existence of, let alone construct, such a UMVUE in finite samples. For example, in the ongoing Adaptive COVID-19 Treatment Trial (ACTT), it is hard to characterize a complete sufficient statistic for the underlying treatment effect because of pre-planned modifications to design aspects based on accumulated unblinded data. As an alternative, we propose a Deep Neural Network (DNN) guided ensemble learning framework that constructs an improved estimator from existing ones. We show that our estimator is consistent and asymptotically attains the minimal variance within the class of linearly combined estimators. Simulation studies further demonstrate that the proposed estimator achieves a considerable finite-sample efficiency gain. As an important application to the ACTT on COVID-19, our method contributes to a more ethical and efficient adaptive clinical trial with fewer patients enrolled.
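To make the "minimal variance within the class of linearly combined estimators" target concrete, the sketch below shows one simplified way such a combination can be formed. This is not the authors' implementation: the paper learns the combination with a DNN, whereas this illustration approximates the base estimators' joint covariance by bootstrap and solves for the minimum-variance weights in closed form. All names (`ensemble_estimate`, `base_estimators`, and so on) are hypothetical.

```python
import numpy as np

def ensemble_estimate(data, base_estimators, n_boot=2000, seed=0):
    """Minimum-variance linear combination of existing point estimators.

    A simplified stand-in for the paper's DNN-guided ensemble: the
    combination weights come from a bootstrap estimate of the base
    estimators' covariance rather than from a trained neural network.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    k = len(base_estimators)

    # Bootstrap replicates of each base estimator (one row per replicate).
    boot = np.empty((n_boot, k))
    for b in range(n_boot):
        resample = data[rng.integers(0, n, size=n)]
        boot[b] = [est(resample) for est in base_estimators]

    # Estimated covariance matrix of the base estimators.
    sigma = np.cov(boot, rowvar=False)

    # Minimum-variance weights subject to sum(w) = 1:
    #   w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
    ones = np.ones(k)
    w = np.linalg.solve(sigma, ones)
    w /= ones @ w

    point_estimates = np.array([est(data) for est in base_estimators])
    return w @ point_estimates, w

# Example: combine the sample mean and median in a symmetric location model.
data = np.random.default_rng(1).normal(loc=2.0, scale=1.0, size=50)
theta_hat, weights = ensemble_estimate(data, base_estimators=[np.mean, np.median])
print(theta_hat, weights)
```

In the paper's framework the weights are produced by a DNN rather than this closed-form rule, which is what allows the combination to adapt beyond what a fixed covariance estimate supports; the code above only illustrates the minimum-variance objective that the ensemble asymptotically achieves.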
