A technique for identifying three diagnostic findings using association analysis

In clinical practice, a disease is often characterized by a combination of three clinical findings, a diagnostic triad. Triads are used largely because it is difficult or impractical to assess every possible combination of symptoms and abnormal examination findings that can occur in a given disease. For most diseases, diagnostic triads are based on empirical observation. In this study, we determined diagnostic triads for chronic diseases using data mining procedures, and we verified both the validity of the resulting combinations and our procedure for determining them. We used symptoms and examination findings from 477 patients with chronic diseases, collected as part of a 35-year longitudinal study begun in 1968. For each patient there were 295 items drawn from internal medicine, dermatology, ophthalmology, and dental examinations and from blood tests. Diseases were defined based on blood tests. We judged each item to be either normal or abnormal and restricted the analysis to the abnormal findings. To analyze such an exhaustive assortment, we used the data mining technique of association analysis, which generated three clinical findings for each disease. Searching all combinations of the 295 items for the three most useful clinical findings would be impractical on a commodity PC; by excluding normal items, however, we reduced the total number of combinations enough to make combinatorial analysis on a PC feasible. In addition to supporting more accurate diagnoses, we believe our technique can identify the diagnostic data that are more cost-effective in terms of the time and other resources required to collect them.
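The idea of mining triads from abnormal findings only can be illustrated with a minimal sketch. It is not the study's actual procedure: the abstract does not say which interestingness measures or thresholds were used, so the sketch assumes the standard support and confidence measures of association rule mining. The function name `top_triads`, the `min_support` parameter, and the finding and disease labels are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_triads(patients, disease, min_support=2, top_n=3):
    """Rank triads of abnormal findings as rules {a, b, c} -> disease.

    patients: list of (abnormal_findings, diseases) pairs, both sets.
    Returns up to top_n tuples (triad, support, confidence).
    Support and confidence are assumed measures, not taken from the study.
    """
    triad_counts = Counter()  # patients showing all three findings
    joint_counts = Counter()  # ... who also have the disease
    for findings, diseases in patients:
        # Only abnormal findings are enumerated, so a patient with k
        # abnormal items contributes C(k, 3) triads, not C(295, 3).
        for triad in combinations(sorted(findings), 3):
            triad_counts[triad] += 1
            if disease in diseases:
                joint_counts[triad] += 1

    n = len(patients)
    rules = []
    for triad, joint in joint_counts.items():
        if joint < min_support:
            continue  # too few supporting patients to be meaningful
        support = joint / n
        confidence = joint / triad_counts[triad]
        rules.append((triad, support, confidence))

    # Highest confidence first, then highest support, then name order.
    rules.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return rules[:top_n]


# Toy data with hypothetical labels; real input would be the abnormal
# items among the 295 recorded per patient.
patients = [
    ({"A", "B", "C"}, {"D1"}),
    ({"A", "B", "C", "E"}, {"D1"}),
    ({"A", "B", "E"}, set()),
    ({"A", "C", "E"}, {"D1"}),
]
for triad, support, confidence in top_triads(patients, "D1"):
    print(triad, support, confidence)
```

Enumerating triads per patient keeps the work proportional to the number of abnormal findings each patient actually has, whereas every possible triad over 295 items would give C(295, 3) = 4,235,315 candidates to score against all patients.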
