Analyzing quality of life survey using constrained co-clustering model for ordinal data and some dynamic implication

The dataset that motivated this work comes from a psychological survey of women affected by a breast tumor. At different moments of their treatment, patients answered questionnaires whose responses are on an ordinal scale. The questions relate to aspects of their life called dimensions. To assist the psychologists in analyzing the results, it is useful to bring out a structure in the dataset. Clustering achieves this by forming groups of individuals, each described by a representative of the group. From a psychological standpoint, it is also useful to observe how questions may be clustered. The simultaneous clustering of both patients and questions is called co-clustering. However, grouping questions that do not relate to the same dimension makes no sense from a psychologist's standpoint. Therefore, a constrained co-clustering was performed to prevent questions from different dimensions from being assigned to the same column-cluster. The evolution of the co-clusters over time was then investigated. The method relies on a constrained Latent Block Model embedding a probability distribution for ordinal data. Parameter estimation relies on a Stochastic EM algorithm coupled with a Gibbs sampler, and the ICL-BIC criterion is used to select the numbers of co-clusters.
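The column constraint described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the dimension names ("anxiety", "support"), and the score values are all hypothetical, and the scores stand in for whatever unnormalised log-probabilities a Gibbs step of the SEM algorithm would compute from the current model parameters. The only point shown is that a question can be drawn into a column-cluster only if that cluster belongs to the question's own dimension.

```python
import math
import random


def sample_constrained_column_labels(log_scores, question_dim, cluster_dim, rng):
    """Draw one column-cluster label per question, restricted to its dimension.

    log_scores[j][h] is a hypothetical unnormalised log-probability that
    question j belongs to column-cluster h. Clusters attached to another
    dimension are excluded before sampling.
    """
    labels = []
    for j, scores in enumerate(log_scores):
        # Keep only the column-clusters attached to this question's dimension.
        allowed = [h for h, d in enumerate(cluster_dim) if d == question_dim[j]]
        if not allowed:
            raise ValueError(f"no column-cluster for dimension {question_dim[j]!r}")
        # Softmax over the allowed clusters, shifted by the max for stability.
        top = max(scores[h] for h in allowed)
        weights = [math.exp(scores[h] - top) for h in allowed]
        labels.append(rng.choices(allowed, weights=weights, k=1)[0])
    return labels


# Hypothetical toy setting: 5 questions, 3 column-clusters, 2 dimensions.
question_dim = ["anxiety", "anxiety", "support", "support", "support"]
cluster_dim = ["anxiety", "support", "support"]
log_scores = [
    [-1.0, -0.1, -2.0],  # unconstrained, cluster 1 would be favoured
    [-0.5, -3.0, -0.2],
    [-0.1, -1.0, -1.5],  # unconstrained, cluster 0 would be favoured
    [-2.0, -0.3, -0.9],
    [-0.4, -2.5, -0.6],
]

labels = sample_constrained_column_labels(
    log_scores, question_dim, cluster_dim, random.Random(0)
)
print(labels)
```

Excluding the forbidden clusters before normalising is equivalent to giving them zero probability, so the constraint holds at every draw, whatever the scores are.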
