Batch Model for Batched Timestamps Data Analysis with Application to the SSA Disability Program

The Office of Disability Adjudication and Review (ODAR) is responsible for holding hearings, issuing decisions, and reviewing appeals as part of the Social Security Administration's (SSA) disability determination process. To control and process cases, ODAR established a Case Processing and Management System (CPMS), which has recorded management information since December 2003. The CPMS provides a detailed status history for each case. Because of the large number of appeal requests and limited resources, more than one million claims were pending at ODAR by March 31, 2015. Our National Institutes of Health (NIH) team collaborated with SSA and developed a Case Status Change Model (CSCM) project to meet ODAR's urgent need to reduce backlogs and improve the hearings and appeals process. One of the key issues in the CSCM project is estimating the expected service time, and its variation, for each case status code. The challenge is that the departure time the system records for a job may not be the time at which the job was actually finished. Because the CPMS timestamp data for case status codes showed apparent batch patterns, we proposed a batch model and applied the constrained least squares method to estimate the mean service times and their variances. Since the real data do not specify a batch partition, we also proposed a batch search algorithm to determine the optimal partition. Simulation studies were conducted to evaluate the performance of the proposed methods. Finally, we applied the methods to real CPMS data from ODAR/SSA.
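
The abstract does not state the model equations, so the following is only a minimal sketch of constrained least squares under an assumed setup, not the authors' formulation. It assumes that the batch partition is already known, that each batch's observed duration is roughly the sum of the service times of the jobs in it, and that the only constraint is non-negativity of the mean service times. The names `counts`, `durations`, and `nnls_coordinate_descent`, and the toy numbers, are hypothetical.

```python
def nnls_coordinate_descent(A, b, n_iter=5000, tol=1e-12):
    """Minimise ||A x - b||^2 subject to x >= 0 by cyclic coordinate descent."""
    m, k = len(A), len(A[0])
    # Normal-equation quantities G = A^T A and h = A^T b.
    G = [[sum(A[i][p] * A[i][q] for i in range(m)) for q in range(k)]
         for p in range(k)]
    h = [sum(A[i][p] * b[i] for i in range(m)) for p in range(k)]
    x = [0.0] * k
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(k):
            if G[j][j] == 0.0:
                continue  # status code never observed; estimate stays at 0
            rest = sum(G[j][q] * x[q] for q in range(k) if q != j)
            # Unconstrained minimiser for coordinate j, clipped at zero.
            new_xj = max(0.0, (h[j] - rest) / G[j][j])
            max_change = max(max_change, abs(new_xj - x[j]))
            x[j] = new_xj
        if max_change < tol:
            break
    return x


# Hypothetical toy data: counts[i][k] is the number of jobs with status
# code k in batch i; durations[i] is the observed length of batch i.
# The durations were generated noise-free from assumed means (2.0, 5.0).
counts = [[3, 1], [1, 2], [2, 2], [4, 0], [0, 3]]
durations = [11.0, 12.0, 14.0, 8.0, 15.0]

estimates = nnls_coordinate_descent(counts, durations)
print(estimates)
```

With noise-free toy data the estimates recover the assumed means. With real timestamps the fit would be approximate, and the batch partition itself would have to be chosen first, which is the role the abstract assigns to the batch search algorithm.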
