Data mining for enhanced operations management decision making: applications in health care

Data mining involves the extraction of new knowledge from large data sets. Despite growing research interest in data mining, however, little attention has been paid to integrating the extracted knowledge into subsequent decision-making processes. Within the context of operations management, this integration can occur in two ways: by providing inputs for an optimization procedure and by analyzing the output of an optimization procedure. In this dissertation, I begin by introducing a database exploration technique, which is used to improve the drug discovery process of a pharmaceutical company (Samorani et al., 2011). The same procedure is also applied to a mental health clinic's database to predict whether patients will show up for their scheduled appointments. The knowledge obtained with this procedure is then used to improve patient scheduling procedures (Samorani and LaGanga, 2011). Finally, I discuss how data mining can be used to learn useful information about the structure of a problem (Samorani and Laguna, 2012).


[33]  Kumar Muthuraman,et al.  A stochastic overbooking model for outpatient clinical scheduling with no-shows , 2008 .

[34]  Martin W. P. Savelsbergh,et al.  Branch-and-Price: Column Generation for Solving Huge Integer Programs , 1998, Oper. Res..

[35]  Andreas T. Ernst,et al.  Exact Solutions to Task Allocation Problems , 2006, Manag. Sci..

[36]  Daniel C. Weaver Applying data mining techniques to library design, lead generation and lead optimization. , 2004, Current opinion in chemical biology.

[37]  Ronald L. Rardin,et al.  Matching daily healthcare provider capacity to demand in advanced access scheduling systems , 2007, Eur. J. Oper. Res..

[38]  Kenneth J. Klassen,et al.  The Effect of Integrated Scheduling and Capacity Policies on Clinical Efficiency , 2011 .

[39]  D. Rogers,et al.  Using Extended-Connectivity Fingerprints with Laplacian-Modified Bayesian Analysis in High-Throughput Screening Follow-Up , 2005, Journal of biomolecular screening.

[40]  Stephen R. Lawrence,et al.  APPOINTMENT SCHEDULING WITH OVERBOOKING TO MITIGATE PRODUCTIVITY LOSS FROM NO-SHOWS , 2007 .

[41]  S. L. Dixon,et al.  One-dimensional molecular representations and similarity calculations: methodology and validation. , 2001, Journal of medicinal chemistry.

[42]  El-Ghazali Talbi,et al.  Using Datamining Techniques to Help Metaheuristics: A Short Survey , 2006, Hybrid Metaheuristics.

[43]  R. W. Hansen,et al.  The price of innovation: new estimates of drug development costs. , 2003, Journal of health economics.

[44]  Diwakar Gupta,et al.  Revenue Management for a Primary-Care Clinic in the Presence of Patient Choice , 2008, Oper. Res..

[45]  Alan Beaulieu,et al.  Learning SQL , 2005 .

[46]  Pedro M. Domingos The Role of Occam's Razor in Knowledge Discovery , 1999, Data Mining and Knowledge Discovery.

[47]  Petra Perner,et al.  Data Mining - Concepts and Techniques , 2002, Künstliche Intell..

[48]  W. Denny,et al.  Genotoxicity of non-covalent interactions: DNA intercalators. , 2007, Mutation research.

[49]  Wun-Hwa Chen,et al.  A hybrid heuristic to solve a task allocation problem , 2000, Comput. Oper. Res..

[50]  Ian H. Witten,et al.  Data mining - practical machine learning tools and techniques, Second Edition , 2005, The Morgan Kaufmann series in data management systems.

[51]  Alexandre Plastino,et al.  Applications of the DM-GRASP heuristic: a survey , 2008, Int. Trans. Oper. Res..

[52]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SKDD.

[53]  Kee-Eung Kim,et al.  Statistical Machine Learning for Large-Scale Optimization , 2000 .

[54]  อนิรุธ สืบสิงห์,et al.  Data Mining Practical Machine Learning Tools and Techniques , 2014 .

[55]  Rafael Martí,et al.  A branch and bound algorithm for the matrix bandwidth minimization , 2008, Eur. J. Oper. Res..

[56]  Hendrik Blockeel,et al.  Multi-Relational Data Mining , 2005, Frontiers in Artificial Intelligence and Applications.

[57]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[58]  Balaji Padmanabhan,et al.  On the Use of Optimization for Data Mining: Theoretical Interactions and eCRM Opportunities , 2003, Manag. Sci..

[59]  Roberto Todeschini,et al.  Handbook of Molecular Descriptors , 2002 .

[60]  James G. Nourse,et al.  Reoptimization of MDL Keys for Use in Drug Discovery , 2002, J. Chem. Inf. Comput. Sci..

[61]  Emre A. Veral,et al.  OUTPATIENT SCHEDULING IN HEALTH CARE: A REVIEW OF LITERATURE , 2003 .

[62]  Fred Glover,et al.  IMPROVED LINEAR PROGRAMMING MODELS FOR DISCRIMINANT ANALYSIS , 1990 .

[63]  C. Reeves The Crossover Landscape for the Onemax Problem , 1996 .

[64]  Jeanne G. Harris,et al.  Competing on Analytics: The New Science of Winning , 2007 .

[65]  Jerrold H. May,et al.  Targeted Advertising Strategies on Television , 2006, Manag. Sci..

[66]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[67]  Ji Lin,et al.  Clinic scheduling models with overbooking for patients with heterogeneous no-show probabilities , 2010, Ann. Oper. Res..