Analyzing time‐ordered event data with missed observations

Abstract A common problem with observational datasets is that not all events of interest may be detected. For example, observing animals in the wild can be difficult when animals move, hide, or cannot be closely approached. We consider time series of events recorded in conditions where events are occasionally missed by observers or observational devices. These time series are not restricted to behavioral protocols, but can be any cyclic or recurring process where discrete outcomes are observed. Undetected events cause biased inferences on the process of interest, and statistical analyses are needed that can identify and correct for the compromised detection process. Missed observations in time series lead to observed time intervals between events at multiples of the true inter‐event time, which conveys information on their detection probability. We derive the theoretical probability density function for observed intervals between events that includes a probability of missed detection. Methodology and software tools are provided for analyzing event data with potential observation bias and for removing that bias. The methodology was applied to simulation data and to a case study of defecation rate estimation in geese, which is commonly used to estimate their digestive throughput and energetic uptake, or to calculate goose usage of a feeding site from dropping density. Simulations indicate that at a moderate chance to miss arrival events (p = 0.3), uncorrected arrival intervals were biased upward by up to a factor of 3, while parameter values corrected for missed observations were within 1% of their true simulated values. A field case study shows that not accounting for missed observations leads to substantial underestimates of the true defecation rate in geese, and to spurious rate differences between sites, which are introduced by differences in observational conditions. These results show that the derived methodology can be used to effectively remove observational biases in time‐ordered event data.
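The way missed detections lengthen observed intervals can be illustrated with a small simulation. The sketch below is not the authors' estimator or software. It assumes gamma-distributed true inter-event times (an illustrative choice), an independent miss probability `p_miss` for every event, and a hypothetical helper named `simulate_observed_intervals`. Each observed interval is then the sum of one or more true intervals, so the mean observed interval is inflated by a factor of 1/(1 - p_miss). The correction shown on the last line works only because `p_miss` is known here. Field data require estimating it, which is what the derived density function is for.

```python
import random


def simulate_observed_intervals(n_events, shape, scale, p_miss, seed=0):
    """Simulate true inter-event times and the intervals an observer records.

    Assumptions (illustrative, not taken from the paper's code):
    - true intervals are gamma distributed with the given shape and scale;
    - each event is missed independently with probability p_miss;
    - recording starts at a detected event.
    """
    rng = random.Random(seed)
    true_intervals = [rng.gammavariate(shape, scale) for _ in range(n_events)]

    observed_intervals = []
    elapsed = 0.0
    for dt in true_intervals:
        elapsed += dt
        if rng.random() >= p_miss:
            # Event detected: the observer records the time since the
            # previous detected event, which may span several true intervals.
            observed_intervals.append(elapsed)
            elapsed = 0.0
    # Time after the last detected event is discarded as an incomplete interval.
    return true_intervals, observed_intervals


def mean(values):
    return sum(values) / len(values)


P_MISS = 0.3  # same miss probability as in the simulations of the abstract
true_iv, obs_iv = simulate_observed_intervals(
    n_events=200_000, shape=20.0, scale=0.25, p_miss=P_MISS
)

true_mean = mean(true_iv)    # close to shape * scale = 5.0
naive_mean = mean(obs_iv)    # inflated by roughly 1 / (1 - P_MISS)
corrected_mean = naive_mean * (1.0 - P_MISS)  # valid only because P_MISS is known

print(f"true mean interval:      {true_mean:.3f}")
print(f"naive observed mean:     {naive_mean:.3f}")
print(f"corrected observed mean: {corrected_mean:.3f}")
```

With a narrow true interval distribution such as this one, a histogram of `obs_iv` shows separate peaks near one, two, and three times the true mean interval, and the relative heights of those peaks carry the information on the detection probability.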
