Paper information - IBM Watson Analytics: Automating Visualization, Descriptive, and Predictive Statistics

Background: We live in an era of explosive data generation that will continue to grow and involve all industries. One of the results of this explosion is the need for newer and more efficient data analytics procedures. Traditionally, data analytics required a substantial background in statistics and computer science. In 2015, International Business Machines Corporation (IBM) released the IBM Watson Analytics (IBMWA) software that delivered advanced statistical procedures based on the Statistical Package for the Social Sciences (SPSS). The latest entry of Watson Analytics into the field of analytical software products provides users with enhanced functions that are not available in many existing programs. For example, Watson Analytics automatically analyzes datasets, examines data quality, and determines the optimal statistical approach. Users can request exploratory, predictive, and visual analytics. Using natural language processing (NLP), users are able to submit additional questions for analyses in a quick response format. This analytical package is available free to academic institutions (faculty and students) that plan to use the tools for noncommercial purposes.

Objective: To report the features of IBMWA and discuss how this software subjectively and objectively compares to other data mining programs.

Methods: The salient features of the IBMWA program were examined and compared with other common analytical platforms, using validated health datasets.

Results: Using a validated dataset, IBMWA delivered similar predictions compared with several commercial and open source data mining software applications. The visual analytics generated by IBMWA were similar to results from programs such as Microsoft Excel and Tableau Software. In addition, assistance with data preprocessing and data exploration was an inherent component of the IBMWA application. Sensitivity and specificity were not included in the IBMWA predictive analytics results, nor were odds ratios, confidence intervals, or a confusion matrix.

Conclusions: IBMWA is a new alternative for data analytics software that automates descriptive, predictive, and visual analytics. This program is very user-friendly but requires data preprocessing, statistical conceptual understanding, and domain expertise.
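
The abstract notes that IBMWA did not report sensitivity, specificity, odds ratios, confidence intervals, or a confusion matrix. As a rough illustration of what those quantities are, the sketch below derives them from the four cells of a 2x2 confusion matrix. It is not taken from the paper or from IBMWA: the function name `binary_metrics`, the example counts, and the choice of the log-odds (Woolf) method for the 95% confidence interval are all assumptions made for illustration.

```python
import math


def binary_metrics(tp, fp, fn, tn, z=1.96):
    """Summarize a 2x2 confusion matrix (hypothetical helper, not an IBMWA API).

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives, and true negatives. All four must be non-zero,
    otherwise the odds ratio and its interval are undefined.
    """
    sensitivity = tp / (tp + fn)          # share of actual positives detected
    specificity = tn / (tn + fp)          # share of actual negatives detected
    odds_ratio = (tp * tn) / (fp * fn)    # diagnostic odds ratio

    # Woolf (log) method: standard error of ln(OR), then exponentiate bounds.
    se_log_or = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)

    return {
        "confusion_matrix": [[tp, fn], [fp, tn]],  # rows: actual pos, actual neg
        "sensitivity": sensitivity,
        "specificity": specificity,
        "odds_ratio": odds_ratio,
        "ci_95": (ci_low, ci_high),
    }


# Made-up counts, used only to show the call; they are not results from the paper.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(name, value)
```

With predicted and actual labels exported from any of the compared tools, the four counts can be tallied first and then passed to a helper like this one.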
