Cooperative Feature Selection in Personalized Medicine

This chapter presents a research support system for identifying patterns of diagnostic results that characterise relevant patient groups for personalized medicine. Breast cancer serves as the example disease. The approach integrates established clinical findings with systems biology analyses, which relates it to both personalized medicine and translational research. Technically, the system is a computer-based support environment that links machine learning algorithms for classification with an interface for the medical domain expert. The clinician is involved for two reasons. First, the system is intended to convey an in-depth understanding of potentially relevant ‘omics’ findings from systems biology (e.g. genomics, transcriptomics, proteomics, and metabolomics) for actual patients in the context of clinical diagnoses. Second, the medical expert is indispensable for rationally narrowing the pertinent features down to a manageable selection of diagnostic findings. Without suitable incorporation of domain expert knowledge, machine-based selections are often polluted by noise or by large but irrelevant variations.
Selecting a subset of features is necessary because, for statistical reasons, the number of features has to be in an appropriate proportion to the number of cases available in a study (curse of dimensionality). The cooperative selection process is iterative: interim results of analyses based on automatic, temporary feature selections have to be comprehensible and open to criticism by the medical expert. To support the understanding of machine learning results, a prototype-based approach is followed. The case-type-related documentation accords with the way the human expert cognitively structures experienced cases.
Because the features describing a patient are heterogeneous in type and nature, the machine-learning-based feature selection has to handle a pertinent dissimilarity measure for each kind of feature and integrate these measures into a holistic representation.
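
To illustrate how per-feature dissimilarities might be integrated into a single representation, the following is a minimal sketch, not the chapter's actual algorithm. It assumes a relevance-weighted sum of feature-specific dissimilarities and a nearest-prototype decision. All feature names, scales, prototype values, and relevance weights are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Patient = Dict[str, object]
Dissimilarity = Callable[[object, object], float]


def numeric_dissimilarity(scale: float) -> Dissimilarity:
    """Squared difference after dividing by a hypothetical feature scale."""
    def d(a: object, b: object) -> float:
        return ((float(a) - float(b)) / scale) ** 2
    return d


def categorical_dissimilarity(a: object, b: object) -> float:
    """0 if the categories match, 1 otherwise."""
    return 0.0 if a == b else 1.0


# Hypothetical features: one dissimilarity measure per feature type.
DISSIMILARITIES: Dict[str, Dissimilarity] = {
    "tumor_size_mm": numeric_dissimilarity(scale=50.0),
    "receptor_status": categorical_dissimilarity,
    "expression_score": numeric_dissimilarity(scale=10.0),
}


def combined_dissimilarity(x: Patient, w: Patient,
                           relevances: Dict[str, float]) -> float:
    """Relevance-weighted sum of the per-feature dissimilarities.

    Relevances are normalised to sum to 1; a relevance of 0 removes
    the feature from the comparison (a deselected feature).
    """
    total = sum(relevances.values())
    return sum(
        relevances[f] / total * DISSIMILARITIES[f](x[f], w[f])
        for f in relevances
    )


def nearest_prototype(x: Patient,
                      prototypes: List[Tuple[str, Patient]],
                      relevances: Dict[str, float]) -> str:
    """Return the label of the prototype closest to patient x."""
    label, _ = min(
        prototypes,
        key=lambda p: combined_dissimilarity(x, p[1], relevances),
    )
    return label


# Hypothetical prototypes, one typical case per patient group.
prototypes = [
    ("group_A", {"tumor_size_mm": 10.0, "receptor_status": "positive",
                 "expression_score": 2.0}),
    ("group_B", {"tumor_size_mm": 40.0, "receptor_status": "negative",
                 "expression_score": 8.0}),
]

# Hypothetical relevances: the expert has deselected expression_score.
relevances = {"tumor_size_mm": 1.0, "receptor_status": 1.0,
              "expression_score": 0.0}

patient = {"tumor_size_mm": 15.0, "receptor_status": "positive",
           "expression_score": 9.0}

assigned_group = nearest_prototype(patient, prototypes, relevances)
```

In this sketch each prototype is a typical case that a clinician can inspect directly, and setting a relevance to zero mimics the expert removing a feature during one round of the iterative selection.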
