Folded normal regression models with applications in biomedicine

Abstract In biomedical studies, a difference or deviation is usually measured with only its magnitude recorded, so the algebraic sign of the data is irretrievably lost. The resulting observed variable no longer follows a normal distribution; it follows a folded normal (FN) distribution instead. More importantly, the FN distribution can be used to fit data sets with the following two characteristics: (i) the density curve is similar to the normal density but truncated at some point; (ii) the density curve on the truncated side is significantly higher than on the other side. Several issues in statistical inference for the FN distribution are not (well) addressed in the existing literature. In this paper, starting from the stochastic representation, we develop a new expectation–maximization (EM) algorithm to calculate the maximum likelihood estimates of the parameters in both the FN distribution and the FN regression models. The EM structure also facilitates Bayesian inference for the FN distribution and the FN regression models. Extensions to the generalized FN distribution are provided. Simulation studies are conducted to assess estimation performance for the FN distribution and the FN regression model. Two real data sets are analyzed to illustrate the proposed methods.
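To make the missing-sign idea concrete, the following is a minimal sketch of a textbook EM iteration for the FN distribution without covariates. It assumes Y = |X| with X ~ N(mu, sigma^2) and treats the lost sign as missing data; it is not necessarily the algorithm derived in the paper, which starts from its own stochastic representation. The function name `fit_folded_normal_em`, its parameters, and the simulated sample are illustrative assumptions.

```python
import math
import random


def fit_folded_normal_em(y, max_iter=500, tol=1e-10):
    """Illustrative EM for Y = |X|, X ~ N(mu, sigma^2), with the sign of X missing.

    Returns (mu_hat, sigma_hat) with mu_hat >= 0, since mu is only
    identifiable up to its sign from folded data.
    """
    n = len(y)
    # E[X^2 | y] = y^2 regardless of the sign, so this average never changes.
    mean_y2 = sum(v * v for v in y) / n
    # Crude starting values: sample mean and variance of the magnitudes.
    mu = sum(y) / n
    sigma2 = max(mean_y2 - mu * mu, 1e-8)
    for _ in range(max_iter):
        # E-step: P(sign = +1 | y) = 1 / (1 + exp(-2*mu*y/sigma^2)),
        # hence E[X | y] = y * tanh(mu * y / sigma^2).
        ex = [v * math.tanh(mu * v / sigma2) for v in y]
        # M-step: normal MLE formulas applied to the expected sufficient statistics.
        mu_new = sum(ex) / n
        sigma2_new = max(mean_y2 - mu_new * mu_new, 1e-12)
        converged = abs(mu_new - mu) + abs(sigma2_new - sigma2) < tol
        mu, sigma2 = mu_new, sigma2_new
        if converged:
            break
    return mu, math.sqrt(sigma2)


# Usage on simulated data (assumed true values: mu = 2, sigma = 1).
rng = random.Random(0)
sample = [abs(rng.gauss(2.0, 1.0)) for _ in range(5000)]
mu_hat, sigma_hat = fit_folded_normal_em(sample)
print(mu_hat, sigma_hat)
```

Because the expected complete-data log-likelihood is that of an ordinary normal sample, the M-step has a closed form. A regression version would presumably replace the update for mu with a least-squares fit of the imputed responses on the covariates, though that step is an assumption here rather than something the abstract spells out.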
