Knowledge Discovery in the Data Sets of Hepatitis Disease for Diagnosis and Prediction to Support and Serve Community

The availability of huge amounts of medical data in hospitals and other health organizations creates a need for powerful data analysis tools that can extract useful knowledge. For many years, researchers have concentrated on applying statistical methods and data mining tools to improve the analysis of large historical data sets. However, the accumulated data on hepatitis in the Kingdom of Saudi Arabia (KSA) have not been adequately analyzed, even though the disease kills thousands of people in KSA every year and millions worldwide. This research aims to construct three independent prediction models, one for each type of hepatitis (A, B, and C). The models are constructed using data mining techniques and tools. The Java programming language was used for a substantial part of the preprocessing steps, and two main WEKA algorithms were applied. The research data are recent, real data sets covering five years (2010-2014), collected from different hospitals in three large cities in KSA. The Knowledge Discovery in Databases (KDD) process was applied, and three models were established. The models achieved high accuracy in classification and prediction. The results can assist physicians, as decision makers, in diagnosing or predicting the disease earlier.