Semiparametric empirical likelihood inference with estimating equations under density ratio models

The density ratio model (DRM) provides a flexible and useful platform for combining information from multiple sources. In this paper, we consider statistical inference under two-sample DRMs in which additional parameters are defined through estimating equations, auxiliary information is expressed as estimating equations, or both. We examine the asymptotic properties of the maximum empirical likelihood estimators (MELEs) of the unknown parameters, both those in the DRMs and those defined through estimating equations, and establish chi-square limiting distributions for the empirical likelihood ratio (ELR) statistics. We show that the asymptotic variance of the MELEs of the unknown parameters does not decrease if one estimating equation is dropped. Similar properties are obtained for inference on the cumulative distribution function and the quantiles of each population involved. We also propose an ELR test for the validity and usefulness of the auxiliary information. Simulation studies show that correctly specified estimating equations for the auxiliary information yield more efficient estimators and shorter confidence intervals. Two real-data examples illustrate the methods.
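
To make the two-sample DRM concrete, the sketch below assumes the common form dF1(x) = exp(alpha + beta * q(x)) dF0(x) with the basis function q(x) = x, and uses the well-known link between the DRM and logistic regression on the pooled sample to estimate (alpha, beta). It does not implement the estimating-equation extension studied in the paper. The function name `fit_drm_logistic`, the choice of q, and the simulated normal populations are all illustrative assumptions.

```python
import numpy as np


def fit_drm_logistic(x0, x1, n_iter=50, tol=1e-10):
    """Estimate (alpha, beta) in dF1(x) = exp(alpha + beta * x) dF0(x).

    Assumption: q(x) = x. The pooled data are fitted by logistic regression
    (Newton-Raphson), with label 0 for the F0 sample and 1 for the F1 sample.
    """
    x = np.concatenate([x0, x1])
    y = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))])
    X = np.column_stack([np.ones_like(x), x])  # design matrix: intercept, q(x)

    theta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ theta))       # fitted P(label = 1 | x)
        grad = X.T @ (y - p)                       # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break

    intercept, beta = theta
    # The logistic intercept absorbs the log sample-size ratio log(n1 / n0).
    alpha = intercept - np.log(len(x1) / len(x0))
    return alpha, beta


# Illustrative data: N(1, 1) against N(0, 1) has density ratio exp(x - 0.5),
# so the DRM holds with alpha = -0.5 and beta = 1.
rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, size=2000)
x1 = rng.normal(1.0, 1.0, size=2000)
alpha_hat, beta_hat = fit_drm_logistic(x0, x1)
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

With these simulated samples the estimates should land near alpha = -0.5 and beta = 1. Auxiliary information of the kind discussed in the abstract would enter as extra constraints on the empirical likelihood, which this sketch omits.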
