IBM Research Report Analyzing Analytics Part 1: A Survey of Business Analytics Models and Algorithms

[1]  Ying Zhao,et al.  Effective document clustering for large heterogeneous law firm collections , 2005, International Conference on Artificial Intelligence and Law.

[2]  Alexandr Andoni,et al.  Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[3]  Vijay V. Vazirani,et al.  Approximation Algorithms , 2001, Springer Berlin Heidelberg.

[4]  H. Hindi,et al.  A tutorial on convex optimization , 2004, Proceedings of the 2004 American Control Conference.

[5]  Paul Glasserman,et al.  Monte Carlo Methods in Financial Engineering , 2003 .

[6]  Bo Pang,et al.  Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[7]  George Karypis,et al.  Hierarchical Clustering Algorithms for Document Datasets , 2005, Data Mining and Knowledge Discovery.

[8]  Ronald L. Wasserstein,et al.  Monte Carlo: Concepts, Algorithms, and Applications , 1997 .

[9]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[10]  J. Halton A Retrospective and Prospective Survey of the Monte Carlo Method , 1970 .

[11]  Takuji Nishimura,et al.  Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator , 1998, TOMC.

[12]  Jeffrey D. Ullman,et al.  Implementing data cubes efficiently , 1996, SIGMOD '96.

[13]  A. Ravishankar Rao,et al.  A spatio-temporal support vector machine searchlight for fMRI analysis , 2011, 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro.

[14]  Sougata Mukherjea,et al.  Social ties and their relevance to churn in mobile telecom networks , 2008, EDBT '08.

[15]  Ivor W. Tsang,et al.  Core Vector Machines: Fast SVM Training on Very Large Data Sets , 2005, J. Mach. Learn. Res..

[16]  Jonathan Goldstein,et al.  When Is ''Nearest Neighbor'' Meaningful? , 1999, ICDT.

[17]  Jon Louis Bentley,et al.  An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1977, TOMS.

[18]  Dimitrios Gunopulos,et al.  Automatic subspace clustering of high dimensional data for data mining applications , 1998, SIGMOD '98.

[19]  Kurt Hornik,et al.  Text Mining Infrastructure in R , 2008 .

[20]  Cynthia Barnhart,et al.  UPS Optimizes Its Air Network , 2004, Interfaces.

[21]  F. Attneave,et al.  The Organization of Behavior: A Neuropsychological Theory , 1949 .

[22]  Frank Harary,et al.  Graph Theory , 2016 .

[23]  Alexander Schrijver,et al.  Combinatorial optimization. Polyhedra and efficiency. , 2003 .

[24]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[25]  Goutam Dutta,et al.  A Survey of Mathematical Programming Applications in Integrated Steel Plants , 2001, Manuf. Serv. Oper. Manag..

[26]  D. Dominic,et al.  A Comparative Study of FP-growth Variations , 2009 .

[27]  Stéphane Canu,et al.  Comments on the "Core Vector Machines: Fast SVM Training on Very Large Data Sets" , 2007, J. Mach. Learn. Res..

[28]  Makoto Matsumoto,et al.  SIMD-Oriented Fast Mersenne Twister: a 128-bit Pseudorandom Number Generator , 2008 .

[29]  I. Lustig,et al.  Interior Point Methods for Linear Programming: Just Call Newton, Lagrange, and Fiacco and McCormick! , 1990 .

[30]  J. Manyika Big data: The next frontier for innovation, competition, and productivity , 2011 .

[31]  R. Turner,et al.  Eigenvector Centrality Mapping for Analyzing Connectivity Patterns in fMRI Data of the Human Brain , 2010, PloS one.

[32]  Yu-Shan Shih,et al.  QUEST User Manual , 2004 .

[33]  Jiawei Han,et al.  Data Mining: Concepts and Techniques , 2000 .

[34]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[35]  Donald D. Chamberlin,et al.  A Complete Guide to DB2 Universal Database , 1998 .

[36]  Nils J. Nilsson,et al.  Principles of Artificial Intelligence , 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Jennifer Chu-Carroll,et al.  Building Watson: An Overview of the DeepQA Project , 2010, AI Mag..

[38]  Mason A. Porter,et al.  Communities in Networks , 2009, ArXiv.

[39]  Piotr Indyk,et al.  Nearest Neighbors in High-Dimensional Spaces , 2004, Handbook of Discrete and Computational Geometry, 2nd Ed..

[40]  J. A. Hartigan,et al.  A k-means clustering algorithm , 1979 .

[41]  Christos Faloutsos,et al.  Graph mining: Laws, generators, and algorithms , 2006, CSUR.

[42]  Krzysztof R. Apt,et al.  Principles of constraint programming , 2003 .

[43]  Tom M. Mitchell,et al.  Machine learning classifiers and fMRI: A tutorial overview , 2009, NeuroImage.

[44]  Jian Pei,et al.  Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach , 2006, Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06).

[45]  Philip S. Yu,et al.  Top 10 algorithms in data mining , 2007, Knowledge and Information Systems.

[46]  Thomas Gärtner,et al.  A survey of kernels for structured data , 2003, SKDD.

[47]  Jie Lin,et al.  Coordination of groups of mobile autonomous agents using nearest neighbor rules , 2003, IEEE Trans. Autom. Control..

[48]  Yao Wang,et al.  A robust and scalable clustering algorithm for mixed type attributes in large database environment , 2001, KDD '01.

[49]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[50]  George Karypis,et al.  Common Pharmacophore Identification Using Frequent Clique Detection Algorithm , 2009, J. Chem. Inf. Model..

[51]  Padhraic Smyth,et al.  Business applications of data mining , 2002, CACM.

[52]  Dominique Haughton,et al.  A Review of Two Text-Mining Packages , 2005 .

[53]  Alexander Zeier,et al.  SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units , 2009, Proc. VLDB Endow..

[54]  Albert-László Barabási,et al.  Linked - how everything is connected to everything else and what it means for business, science, and everyday life , 2003 .

[55]  Hamid Pirahesh,et al.  Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals , 1996, Data Mining and Knowledge Discovery.

[56]  Ravi Iyengar,et al.  Ordered cyclic motifs contribute to dynamic stability in biological and engineered networks , 2008, Proceedings of the National Academy of Sciences.

[57]  Yossi Richter,et al.  Predicting Customer Churn in Mobile Networks through Analysis of Social Groups , 2010, SDM.

[58]  Sougata Mukherjea,et al.  On the structural properties of massive telecom call graphs: findings and implications , 2006, CIKM '06.

[59]  Wolfgang Lehner,et al.  Data mining with the SAP NetWeaver BI accelerator , 2006, VLDB.

[60]  G. Marsaglia,et al.  A New Class of Random Number Generators , 1991 .

[61]  G. V. Kass An Exploratory Technique for Investigating Large Quantities of Categorical Data , 1980 .

[62]  Heikki Mannila,et al.  Finding interesting rules from large sets of discovered association rules , 1994, CIKM '94.

[63]  N. Metropolis,et al.  The Monte Carlo method. , 1949 .

[64]  P. Boyle Options: A Monte Carlo approach , 1977 .

[65]  Jon Louis Bentley,et al.  Multidimensional divide-and-conquer , 1980, CACM.

[66]  H. Sebastian Seung,et al.  Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[67]  Kenneth Steiglitz,et al.  Combinatorial Optimization: Algorithms and Complexity , 1981 .

[68]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[69]  S. Shen-Orr,et al.  Network motifs: simple building blocks of complex networks. , 2002, Science.

[70]  Ulrich Güntzer,et al.  Algorithms for association rule mining — a general survey and comparison , 2000, SKDD.

[71]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[72]  Noga Alon,et al.  Spectral Techniques in Graph Algorithms , 1998, LATIN.

[73]  P. L’Ecuyer,et al.  On the lattice structure of certain linear congruential sequences related to AWC/SWB generators , 1994 .

[74]  Dennis J. Sweeney,et al.  Quantitative Methods for Business , 1983 .

[75]  Andrew W. Moore,et al.  An Investigation of Practical Approximate Nearest Neighbor Algorithms , 2004, NIPS.

[76]  Marti A. Hearst Untangling Text Data Mining , 1999, ACL.

[77]  Shamkant B. Navathe,et al.  An Efficient Algorithm for Mining Association Rules in Large Databases , 1995, VLDB.

[78]  C. Stam,et al.  Small-world networks and functional connectivity in Alzheimer's disease. , 2006, Cerebral cortex.

[79]  George Karypis,et al.  Topic-driven Clustering for Document Datasets , 2005, SDM.

[80]  A. Neumaier Acta Numerica 2004: Complete search in continuous global optimization and constraint satisfaction , 2004 .

[81]  Jeanne G. Harris,et al.  Competing on Analytics: The New Science of Winning , 2007 .

[82]  Tristan Fletcher,et al.  Support Vector Machines Explained , 2008 .

[83]  Joshua A. Grochow,et al.  Network Motif Discovery Using Subgraph Enumeration and Symmetry-Breaking , 2007, RECOMB.

[84]  Hinrich Schütze,et al.  Introduction to information retrieval , 2008 .

[85]  Wei-Yin Loh,et al.  A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms , 2000, Machine Learning.

[86]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[87]  Sreerama K. Murthy,et al.  Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey , 1998, Data Mining and Knowledge Discovery.

[88]  Philip S. Yu,et al.  Fast algorithms for projected clustering , 1999, SIGMOD '99.

[89]  Leo Breiman,et al.  Classification and Regression Trees , 1984 .

[90]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[91]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[92]  Michael I. Jordan Attractor dynamics and parallelism in a connectionist sequential machine , 1990 .

[93]  Gwilym M. Jenkins,et al.  Time series analysis, forecasting and control , 1972 .

[94]  Robert H. Shumway,et al.  Time Series Analysis and Its Applications (Springer Texts in Statistics) , 2005 .

[95]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.

[96]  Stephen M. Omohundro,et al.  Efficient Algorithms with Neural Network Behavior , 1987, Complex Syst..

[97]  George B. Dantzig,et al.  Linear programming and extensions , 1965 .

[98]  James C. Spall,et al.  Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.

[99]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[100]  Jephthah A. Abara,et al.  Applying Integer Linear Programming to the Fleet Assignment Problem , 1989 .

[101]  Mohammed J. Zaki Scalable Algorithms for Association Mining , 2000, IEEE Trans. Knowl. Data Eng..

[102]  William Stafford Noble,et al.  Support vector machine , 2013 .

[103]  Himabindu Lakkaraju,et al.  Exploiting Coherence for the Simultaneous Discovery of Latent Facets and associated Sentiments , 2011, SDM.

[104]  Tian Zhang,et al.  BIRCH: an efficient data clustering method for very large databases , 1996, SIGMOD '96.

[105]  Ashish Verma,et al.  Enabling analysts in managed services for CRM analytics , 2009, KDD.

[106]  Nello Cristianini,et al.  Classification using String Kernels , 2000 .

[107]  Sanjay Mehrotra,et al.  On the Implementation of a Primal-Dual Interior Point Method , 1992, SIAM J. Optim..

[108]  Susan T. Dumais,et al.  A Bayesian Approach to Filtering Junk E-Mail , 1998, AAAI 1998.

[109]  Sunil Arya,et al.  An optimal algorithm for approximate nearest neighbor searching in fixed dimensions , 1998, JACM.

[110]  James C. Spall,et al.  Introduction to Stochastic Search and Optimization. Estimation, Simulation, and Control (Spall, J.C. , 2007 .

[111]  Yehuda Koren,et al.  All Together Now: A Perspective on the Netflix Prize , 2010 .

[112]  Bernard Widrow,et al.  Neural networks: applications in industry, business and science , 1994, CACM.

[113]  T. Landauer,et al.  Indexing by Latent Semantic Analysis , 1990 .

[114]  Daniel Zwillinger Predictor—Corrector Methods , 1992 .

[115]  Mark Newman,et al.  Networks: An Introduction , 2010 .

[116]  Piotr Indyk,et al.  Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality , 2012, Theory Comput..

[117]  C. Stam,et al.  Small-world networks and disturbed functional connectivity in schizophrenia , 2006, Schizophrenia Research.

[118]  Paul E. Green,et al.  K-modes Clustering , 2001, J. Classif..

[119]  Rajeev Motwani,et al.  The PageRank Citation Ranking : Bringing Order to the Web , 1999, WWW 1999.

[120]  Jeffrey K. Uhlmann,et al.  Satisfying General Proximity/Similarity Queries with Metric Trees , 1991, Inf. Process. Lett..

[121]  Martin W. P. Savelsbergh,et al.  Branch-and-Price: Column Generation for Solving Huge Integer Programs , 1998, Oper. Res..

[122]  C. D. Gelatt,et al.  Optimization by Simulated Annealing , 1983, Science.

[123]  Thomas H. Davenport,et al.  Analytics at Work: Smarter Decisions, Better Results , 2010 .

[124]  Stephen M. Omohundro,et al.  Five Balltree Construction Algorithms , 2009 .

[125]  Xin Liu,et al.  Document clustering based on non-negative matrix factorization , 2003, SIGIR.

[126]  Jianying Hu,et al.  Leveraging social networks for corporate staffing and expert recommendation , 2009, IBM J. Res. Dev..

[127]  César A. Hidalgo,et al.  Scale-free networks , 2008, Scholarpedia.

[128]  Alexander Schrijver,et al.  Theory of linear and integer programming , 1986, Wiley-Interscience series in discrete mathematics and optimization.

[129]  N. Metropolis,et al.  Equation of State Calculations by Fast Computing Machines , 1953, Resonance.

[130]  Narendra Karmarkar,et al.  A new polynomial-time algorithm for linear programming , 1984, Comb..

[131]  Srinivasan Parthasarathy,et al.  New Algorithms for Fast Discovery of Association Rules , 1997, KDD.

[132]  Chris H. Q. Ding,et al.  Orthogonal nonnegative matrix t-factorizations for clustering , 2006, KDD '06.

[133]  R. Mooney,et al.  Impact of Similarity Measures on Web-page Clustering , 2000 .

[134]  Kevin N. Gurney,et al.  An introduction to neural networks , 2018 .

[135]  Susan T. Dumais,et al.  Latent semantic analysis , 2005, Scholarpedia.

[136]  A. Neumaier Complete search in continuous global optimization and constraint satisfaction , 2004, Acta Numerica.

[137]  Cornelis J. Stam,et al.  Small-world and scale-free organization of voxel-based resting-state functional connectivity in the human brain , 2008, NeuroImage.

[138]  Andrew McCallum,et al.  A comparison of event models for naive bayes text classification , 1998, AAAI 1998.

[139]  Andrew Kusiak,et al.  Data Mining in Manufacturing: A Review , 2006 .

[140]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[141]  Peter W. Foltz,et al.  An introduction to latent semantic analysis , 1998 .

[142]  Ramakrishnan Srikant,et al.  Fast algorithms for mining association rules , 1998, VLDB 1998.

[143]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[144]  Andrew W. Moore,et al.  New Algorithms for Efficient High-Dimensional Nonparametric Classification , 2006, J. Mach. Learn. Res..

[145]  W. Loh,et al.  SPLIT SELECTION METHODS FOR CLASSIFICATION TREES , 1997 .

[146]  Ting Liu,et al.  Clustering Billions of Images with Large Scale Nearest Neighbor Search , 2007, 2007 IEEE Workshop on Applications of Computer Vision (WACV '07).

[147]  Kristin P. Bennett,et al.  Support vector machines: hype or hallelujah? , 2000, SKDD.

[148]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.

[149]  Thorsten Joachims,et al.  Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.

[150]  J J Hopfield,et al.  Neural networks and physical systems with emergent collective computational abilities. , 1982, Proceedings of the National Academy of Sciences of the United States of America.

[151]  Stephen M. Omohundro,et al.  Bumptrees for Efficient Function, Constraint and Classification Learning , 1990, NIPS.

[152]  Duncan J. Watts,et al.  Collective dynamics of ‘small-world’ networks , 1998, Nature.

[153]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SKDD.

[154]  Pavel Zezula,et al.  M-tree: An Efficient Access Method for Similarity Search in Metric Spaces , 1997, VLDB.

[155]  Shlomo S. Sawilowsky,et al.  You Think You’ve Got Trivials? , 2003 .

[156]  John Langford,et al.  Cover trees for nearest neighbor , 2006, ICML.

[157]  Vikas Sindhwani,et al.  Extracting insights from social media with large-scale matrix approximations , 2011, IBM J. Res. Dev..

[158]  Surajit Chaudhuri,et al.  An overview of data warehousing and OLAP technology , 1997, SIGMOD Rec..

[159]  Michael A. Saunders,et al.  On projected newton barrier methods for linear programming and an equivalence to Karmarkar’s projective method , 1986, Math. Program..

[160]  John Elder,et al.  Handbook of Statistical Analysis and Data Mining Applications , 2009 .

[161]  Michael J. Todd,et al.  The many facets of linear programming , 2002, Math. Program..

[162]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD 2000.

[163]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[164]  C. Bron,et al.  Algorithm 457: finding all cliques of an undirected graph , 1973 .

[165]  Sotiris B. Kotsiantis,et al.  Supervised Machine Learning: A Review of Classification Techniques , 2007, Informatica.

[166]  Joseph F. Traub,et al.  Faster Valuation of Financial Derivatives , 1995 .

[167]  Dimitri P. Bertsekas,et al.  Nonlinear Programming , 1997 .

[168]  Hans-Peter Kriegel,et al.  Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering , 2009, TKDD.

[169]  Jeff A. Bilmes,et al.  A gentle tutorial of the em algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models , 1998 .

[170]  David T. Stanton,et al.  Application of Nearest-Neighbor and Cluster Analyses in Pharmaceutical Lead Discovery , 1999, J. Chem. Inf. Comput. Sci..

[171]  Chris H. Q. Ding,et al.  Nonnegative Matrix Factorization for Combinatorial Optimization: Spectral Clustering, Graph Matching, and Clique Finding , 2008, 2008 Eighth IEEE International Conference on Data Mining.

[172]  Rina Dechter,et al.  Constraint Processing , 1995, Lecture Notes in Computer Science.