Network-Based Modeling and Intelligent Data Mining of Social Media for Improving Care

Intelligently extracting knowledge from social media has recently attracted great interest from the Biomedical and Health Informatics community as a way to use consumer-generated opinion to improve healthcare outcomes and reduce costs at the same time. We propose a two-step analysis framework for ascertaining user opinion of cancer treatment. The framework focuses on positive and negative sentiment, as well as the side effects of treatment, in users' forum posts, and identifies user communities (modules) and influential users. First, we used a self-organizing map to analyze word-frequency data derived from users' forum posts. Second, we introduced a novel network-based approach for modeling users' forum interactions and employed a network partitioning method based on optimizing a stability quality measure. Combining word-frequency data with network-based properties allowed us to determine consumer opinion and identify influential users within the retrieved modules. Our approach can expand research into intelligently mining social media data for consumer opinion of various treatments, providing rapid, up-to-date information to the pharmaceutical industry, hospitals, and medical staff on the effectiveness (or ineffectiveness) of future treatments.
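
As a rough illustration of the inputs the two steps work from, the following Python sketch derives per-post word-frequency vectors and a directed reply network from a handful of made-up forum posts. It is not the authors' implementation: the term lists, the post fields (`author`, `reply_to`, `text`), and the function names are all hypothetical, the self-organizing map and the stability-based partitioning are omitted, and weighted in-degree is used here only as a simple stand-in for an influence measure.

```python
from collections import Counter, defaultdict
import re

# Hypothetical term lists; the actual word lists are not given in the text.
POSITIVE = {"better", "helped", "improved"}
NEGATIVE = {"worse", "failed", "stopped"}
SIDE_EFFECTS = {"rash", "nausea", "fatigue"}


def term_frequencies(text):
    """Count lexicon hits in one post: (positive, negative, side_effect)."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return tuple(
        sum(words[w] for w in group)
        for group in (POSITIVE, NEGATIVE, SIDE_EFFECTS)
    )


def build_reply_network(posts):
    """Map each directed (author, replied_to) pair to its number of replies."""
    edges = defaultdict(int)
    for post in posts:
        target = post["reply_to"]
        if target is not None and target != post["author"]:
            edges[(post["author"], target)] += 1
    return dict(edges)


def weighted_in_degree(edges):
    """Total replies received per user (an assumed, simple influence proxy)."""
    scores = Counter()
    for (_, target), weight in edges.items():
        scores[target] += weight
    return scores


# Made-up posts for illustration only.
posts = [
    {"author": "ann", "reply_to": None,
     "text": "The rash got worse after week two."},
    {"author": "bob", "reply_to": "ann",
     "text": "Same rash here, but the scans improved."},
    {"author": "cat", "reply_to": "ann",
     "text": "Nausea and fatigue stopped me too."},
    {"author": "ann", "reply_to": "bob",
     "text": "Glad it helped you."},
]

vectors = [term_frequencies(p["text"]) for p in posts]  # word-frequency data
edges = build_reply_network(posts)                      # forum interactions
influence = weighted_in_degree(edges)                   # replies received
```

In a fuller pipeline, the rows of `vectors` would be the kind of data passed to a self-organizing map, and `edges` would be the weighted graph handed to a community-detection method before ranking users within each module.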
