Fast construction of efficient composite likelihood equations

The growth in both the size and complexity of modern data challenges the applicability of traditional likelihood-based inference. Composite likelihood (CL) methods address the difficulties related to model selection and the computational intractability of the full likelihood by combining a number of low-dimensional likelihood objects into a single objective function used for inference. This paper introduces a procedure that selects and combines partial likelihood objects from a large set of feasible candidates while simultaneously carrying out parameter estimation. The new method constructs estimating equations that balance statistical efficiency against computing cost by minimizing an approximate distance from the full likelihood score subject to an L1-norm penalty representing the available computing resources. This results in truncated CL equations containing only the most informative partial likelihood score terms. An asymptotic theory is developed within a framework where both sample size and data dimension grow, and finite-sample properties are illustrated through numerical examples.
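
To make the criterion concrete, the penalized problem described above can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the abstract: u_j denotes the score of the j-th candidate partial likelihood, u^{ML} the full likelihood score, w the vector of composition weights, and lambda the penalty level. The expectation is written in idealized form; in practice it would be replaced by an approximation computable without the full likelihood.

```latex
% Illustrative sketch only; all symbols below are assumed notation.
% u_j(\theta)    : score of the j-th partial likelihood, j = 1, \dots, m
% u^{ML}(\theta) : full likelihood score
% w              : composition weights, \lambda \ge 0 : penalty level
\[
  u(\theta, w) = \sum_{j=1}^{m} w_j \, u_j(\theta),
  \qquad
  \hat{w}_\lambda = \arg\min_{w} \;
  \frac{1}{2} \, E \bigl\| u^{ML}(\theta) - u(\theta, w) \bigr\|_2^2
  + \lambda \sum_{j=1}^{m} |w_j| .
\]
% Truncated CL estimating equation: only terms with nonzero weight remain.
\[
  \sum_{j \,:\, \hat{w}_{\lambda, j} \neq 0} \hat{w}_{\lambda, j} \, u_j(\theta) = 0 .
\]
```

Under this reading, a larger lambda corresponds to a tighter computing budget: more weights are set exactly to zero, so fewer partial likelihood score terms need to be evaluated.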