Medical search and classification tools for recommendation

their patients' records from paper to computer, enormous amounts of electronic medical records (EMR) have become available for medical research. Some of the EMR data are well-structured, for which traditional database management systems can provide effective retrieval and management functions. However, most of the EMR data (such as progress notes and consultation letters) are in free text formats. How to effectively and efficiently retrieve and discover useful information from the vast amount of such semi-structured data is a challenge faced by medical professionals. Without proper tools, the rich information and knowledge buried in the medical health records are unavailable for clinical research and decision-making. The objective of our research is to develop text analytics tools that are capable of parsing clinical medical data so that predefined search subjects that correspond to a list of medical diagnoses can be extracted. In addition to this particular core functionality, it is also desired that several important assets should be present within the text-analytics tools in order to improve its overall ability to be used as recommendation tools. In this research, we work with research scientists at the Institute for Clinical Evaluative Sciences (ICES) in Toronto and examine a number of techniques for structuring and processing free text documents in order to effectively and efficiently search and analyze vast amount of medical records. We implement several powerful medical text analytics tools for clinical data searching and classification. For data classification, our tools sort through a great amount of patientrecords to identify the likelihood of a patient having myocardial infarction (MI) or hypertension (HTN), and classify the patients accordingly. Our tools can also identify the likelihood of a patient being a smoker, previous smoker or non-smoker based on the text data of medical records.