Semiparametric inference on general functionals of two semicontinuous populations

In this paper, we propose new semiparametric procedures for making inference on linear functionals, and functions of such functionals, of two semicontinuous populations. The distribution of each population is typically a mixture of a discrete point mass at zero and a continuous, skewed, positive component, and is therefore semicontinuous in nature. To utilize the information from both populations, we model the positive components of the two mixture distributions via a semiparametric density ratio model. Under this model setup, we construct the maximum empirical likelihood estimators of the linear functionals and their functions, and establish the asymptotic normality of the proposed estimators. We show that the proposed estimators of the linear functionals are more efficient than the fully nonparametric ones. The asymptotic results enable us to construct confidence regions and perform hypothesis tests for the linear functionals and their functions. We further apply these results to several important summary quantities, such as the moments, the mean ratio, the coefficient of variation, and the generalized entropy class of inequality measures. Simulation studies demonstrate the advantages of the proposed semiparametric method over some existing methods. Two real data examples are provided for illustration.
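To make the target quantities concrete, the following minimal sketch computes the fully nonparametric plug-in versions of some of the summary quantities named above (the mean, the second moment, the coefficient of variation, and the mean ratio) for two semicontinuous samples. It is not the proposed semiparametric estimator: it uses no density ratio model and treats each sample separately, which corresponds to the nonparametric baseline the abstract compares against. The function names `plug_in_summaries` and `mean_ratio` and the toy data are assumptions introduced for illustration only.

```python
import numpy as np


def plug_in_summaries(x):
    """Nonparametric plug-in summaries for one semicontinuous sample.

    Illustrative helper (not from the paper). Assumes x is nonnegative
    and contains at least one strictly positive observation.
    """
    x = np.asarray(x, dtype=float)
    zero_prop = np.mean(x == 0)        # estimated point mass at zero
    positives = x[x > 0]               # continuous positive component
    # For a mixture with mass nu at zero: E[X^k] = (1 - nu) * E[X^k | X > 0].
    m1 = (1 - zero_prop) * positives.mean()
    m2 = (1 - zero_prop) * np.mean(positives ** 2)
    cv = np.sqrt(m2 - m1 ** 2) / m1    # coefficient of variation
    return {"zero_prop": zero_prop, "mean": m1, "second_moment": m2, "cv": cv}


def mean_ratio(x0, x1):
    """Ratio of the plug-in means of sample 1 to sample 0."""
    return plug_in_summaries(x1)["mean"] / plug_in_summaries(x0)["mean"]


# Toy data (hypothetical): each sample mixes exact zeros with positive values.
sample0 = [0, 0, 1, 3]
sample1 = [0, 2, 2, 4]
summary0 = plug_in_summaries(sample0)
ratio = mean_ratio(sample0, sample1)
```

Each moment here is a linear functional of the underlying distribution, while the coefficient of variation and the mean ratio are functions of such functionals, which is the class of quantities the abstract targets.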
