A privacy-preserving framework for distributed clinical decision support

We propose a framework for distributed knowledge mining that produces a clinical decision support tool in the form of a decision tree. The framework builds knowledge from statistics computed over patient data that satisfy a given filtering condition at multiple sites, without the actual data ever leaving the participating sites. The information retrieval and diagnostic support tool accommodates the heterogeneous data schemas of the participating sites. It also supports preventing leakage of personally identifiable information and preserving privacy, both of which are important security concerns in managing clinical data transactions. We present results of experiments on 8 and 16 sites, each with a small number of patients (if any) satisfying specific partial diagnostic criteria. The experiments, combined with restricting a fraction of attributes from sharing statistics and applying different privacy constraints at different sites, demonstrate the usefulness of the tool.
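The idea of growing a decision tree from per-site statistics rather than raw records can be illustrated with a minimal sketch. It is not the framework's actual implementation: the function names (`site_counts`, `best_attribute`), the record layout (one dict per patient), the `restricted` attribute list, and the `min_count` small-cell suppression rule are all assumptions made for illustration, and information gain is used as the split criterion only because it is the standard choice for decision tree induction.

```python
from collections import Counter
from math import log2


def site_counts(records, attributes, label, keep=lambda r: True,
                restricted=(), min_count=1):
    """Runs locally at one site; only aggregate counts are returned.

    restricted and min_count are hypothetical per-site privacy settings.
    """
    counts = Counter()
    for r in records:
        if not keep(r):  # site-local filtering condition
            continue
        for a in attributes:
            # Skip attributes this site withholds or does not have in its schema.
            if a in restricted or a not in r:
                continue
            counts[(a, r[a], r[label])] += 1
    # Suppress cells smaller than the site's threshold (assumed rule).
    return Counter({k: v for k, v in counts.items() if v >= min_count})


def entropy(class_counts):
    """Shannon entropy (bits) of a collection of class counts."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return -sum(c / total * log2(c / total) for c in class_counts if c)


def best_attribute(all_counts, attributes):
    """Runs at the coordinator; sees only the counts sent by the sites."""
    merged = Counter()
    for c in all_counts:
        merged.update(c)  # adds counts from each site
    gains = {}
    for a in attributes:
        by_value, by_class = {}, Counter()
        for (attr, value, cls), n in merged.items():
            if attr != a:
                continue
            by_value.setdefault(value, Counter())[cls] += n
            by_class[cls] += n
        total = sum(by_class.values())
        if total == 0:  # no site shared statistics for this attribute
            continue
        remainder = sum(sum(vc.values()) / total * entropy(vc.values())
                        for vc in by_value.values())
        gains[a] = entropy(by_class.values()) - remainder
    return max(gains, key=gains.get), gains


# Made-up example data: site B's schema has no "cough" attribute.
site_a = [{"fever": "yes", "cough": "yes", "dx": "flu"},
          {"fever": "no", "cough": "yes", "dx": "cold"}]
site_b = [{"fever": "yes", "dx": "flu"},
          {"fever": "no", "dx": "cold"}]
attrs = ["fever", "cough"]
shared = [site_counts(s, attrs, "dx") for s in (site_a, site_b)]
split, gains = best_attribute(shared, attrs)
print(split, gains)
```

In this sketch the coordinator never sees a patient record, only counts keyed by (attribute, value, class). Repeating the same exchange with a narrower `keep` filter for each branch would grow the tree one level at a time.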
