Semi-supervised Clustering for Sparsely Sampled Longitudinal Data

Abstract: Longitudinal studies track measurements of individual subjects over time. Clustering can effectively extract the features of hidden classes in such data. In practice, however, longitudinal data analysis is hampered by sparse sampling and by sampling points that differ among subjects. Functional data clustering approaches for sparsely sampled data overcome these problems, but they are unsuitable when the difference between classes is small. We therefore propose a semi-supervised approach for clustering sparsely sampled longitudinal data, in which some labeled subjects guide and bias the clustering result. The effectiveness of the proposed method was evaluated in simulations. The method proved especially effective when the difference between classes was blurred by interference such as noise. In summary, adding some subjects with class information supplements the existing information and enables successful clustering.
