Knowledge process of health big data using MapReduce-based associative mining

Big-data knowledge processing technology facilitates efficient health management services by systematically collecting and processing information using distributed/parallel processing with the health platform’s common data model. Thus, it enables knowledge expansion for healthcare data. In this study, we propose a big-data knowledge process for the health industry using Hadoop’s MapReduce software for association mining. The proposed method provides efficient health management knowledge services by collecting and processing heterogeneous health information using WebBot and the common data model. Hadoop is an open-source framework for effectively processing distributed big data. The proposed method is a knowledge processing model that combines MapReduce-based distributed processing with mining-based association discovery. The input data for MapReduce is a corpus of chronic disease nomenclature extracted from health big data. The corpus is divided into several blocks of a fixed size, and a map task is created for each block. Through the map function of the mapper of each map task, <key, value> pairs are created. In the map process, a key is created using the same method used for a frequent item set of the Apriori algorithm. The key is a set of 2p keys and its value is set to the occurrence frequency of the key. Summing the values of identical keys in a combine step reduces both the size of the data and the processing load. In addition, each key is assigned to a reducer through hash partitioning and stored in the corresponding reduce task. In the reduce process, the map results are allocated to each reducer, and sort and merge steps are performed based on the keys. For each key, the values are summed by the reduce function. At this point, keys whose values fail to meet the minimum support criterion are eliminated.
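The map, combine, partition, and reduce steps described above can be illustrated with a minimal single-process sketch. It is not the authors' Hadoop implementation: the function names, the sample records, the fixed key size `k`, and the CRC32-based partitioner are all assumptions made for illustration, and candidate keys are simply enumerated as k-item combinations rather than generated level by level as in Apriori.

```python
import zlib
from collections import Counter, defaultdict
from itertools import combinations


def map_block(block, k=2):
    """Map + combine: emit <key, value> pairs for one block, summed locally."""
    counts = Counter()
    for record in block:
        # Each key is a sorted k-item combination from one record.
        for key in combinations(sorted(set(record)), k):
            counts[key] += 1
    return counts


def partition(key, num_reducers):
    """Hash partitioning: choose the reducer responsible for a key."""
    return zlib.crc32("|".join(key).encode("utf-8")) % num_reducers


def reduce_partition(pairs, min_support):
    """Sort by key, sum values per key, and drop keys below min_support."""
    totals = defaultdict(int)
    for key, value in sorted(pairs):
        totals[key] += value
    return {key: total for key, total in totals.items() if total >= min_support}


def frequent_itemsets(records, block_size=2, num_reducers=2, min_support=2, k=2):
    """Simulate the whole job on a list of records (hypothetical driver)."""
    blocks = [records[i:i + block_size] for i in range(0, len(records), block_size)]
    shuffled = defaultdict(list)
    for block in blocks:
        for key, value in map_block(block, k).items():
            shuffled[partition(key, num_reducers)].append((key, value))
    result = {}
    for pairs in shuffled.values():
        result.update(reduce_partition(pairs, min_support))
    return result


# Hypothetical sample records, each a list of chronic disease terms.
records = [
    ["diabetes", "hypertension", "obesity"],
    ["diabetes", "hypertension"],
    ["hypertension", "obesity"],
    ["diabetes", "obesity", "hypertension"],
    ["asthma", "obesity"],
]
frequent = frequent_itemsets(records, min_support=3)
```

With these sample records, only the pairs that occur in at least three records survive the reduce step. The result does not depend on how keys are distributed across reducers, because every occurrence of a given key is routed to the same reducer.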
Therefore, from the set of <key, value> pairs, a frequent item set that meets the minimum support criterion is extracted. The association rules between the items constituting the frequent item set are determined, and their support and reliability (confidence) are calculated to examine whether the items are actually associated. The higher the value of a frequent item set, the higher its support and reliability, which means that the association is strong. A knowledge base is then constructed from the association rules extracted by repeatedly performing the MapReduce process. Closely associated knowledge bases are created and semantically related in real time with high probability. Furthermore, mining-based knowledge processing of health big data infers more meaningful associations between chronic diseases. The proposed method adds technological value and intelligent efficiency to help the health and medical fields promote healthy lives.
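As a rough illustration of how support and reliability (confidence) could be computed for rules derived from frequent pairs, the sketch below uses the standard definitions: support is the fraction of records containing the item set, and confidence is the item set count divided by the antecedent count. The function name, the counts, and the threshold are hypothetical and are not taken from the study.

```python
def association_rules(pair_counts, item_counts, n_records, min_confidence):
    """Derive rules A -> B from frequent pairs with their support and confidence."""
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n_records  # fraction of records containing both items
        for antecedent, consequent in ((a, b), (b, a)):
            # confidence(A -> B) = count(A and B) / count(A)
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


# Hypothetical counts over five records.
pair_counts = {
    ("diabetes", "hypertension"): 3,
    ("hypertension", "obesity"): 3,
}
item_counts = {"diabetes": 3, "hypertension": 4, "obesity": 4}

rules = association_rules(pair_counts, item_counts, n_records=5, min_confidence=0.8)
```

In this example, only the rule from "diabetes" to "hypertension" passes the threshold, since every record containing "diabetes" also contains "hypertension"; the reverse rule has a confidence of 0.75 and is dropped.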
