Implementation of a web based universal exchange and inference language for medicine: Sparse data, probabilities and inference in data mining of clinical data repositories

We extend Q-UEL, our universal exchange language for interoperability and inference in healthcare and biomedicine, to the more traditional fields of public health surveys. These are the kind associated with screening, epidemiological and cross-sectional studies, and cohort studies that in some cases resemble clinical trials. One challenge is a degree of split between (a) frequentist notions of probability as classical measures based only on counting and proportion, as in the classical biostatistics used in these conservative disciplines, and (b) more subjectivist notions of uncertainty, belief, reliability, or confidence often used in automated inference and decision support systems. Samples in such public health surveys are typically small compared with our earlier "Big Data" mining efforts. An issue addressed here is how much impact sparse data should have on decisions. We describe a new Q-UEL compatible toolkit including DiracMiner, a data analytics application that also delivers more standard biostatistical results; DiracBuilder, which uses DiracMiner's output to build Hyperbolic Dirac Nets (HDN) for decision support; and HDNcoherer, which ensures that probabilities are mutually consistent. Use is exemplified by participation in a real-world health-screening project, and by deployment in an industrial platform called the BioIngine, a cognitive computing platform for health management.
