The Art of Feature Engineering

[1]  Rachel Schutt,et al.  Doing Data Science , 2013 .

[2]  Yoshua Bengio,et al.  Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[3]  Michael J. Cafarella,et al.  Input selection for fast feature engineering , 2016, 2016 IEEE 32nd International Conference on Data Engineering (ICDE).

[4]  D. Gabor,et al.  Theory of communication. Part 1: The analysis of information , 1946 .

[5]  Fillia Makedon,et al.  Learning from Incomplete Ratings Using Non-negative Matrix Factorization , 2006, SDM.

[6]  Carsten Steger,et al.  Machine Vision Algorithms , 2007 .

[7]  Dragomir R. Radev,et al.  Book Review: Graph-Based Natural Language Processing and Information Retrieval by Rada Mihalcea and Dragomir Radev , 2011, CL.

[8]  Benjamin Recht,et al.  Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[9]  Alfred V. Aho,et al.  Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.

[10]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[11]  Ivan Markovsky,et al.  Low Rank Approximation - Algorithms, Implementation, Applications , 2018, Communications and Control Engineering.

[12]  D. Haussler,et al.  Boolean Feature Discovery in Empirical Learning , 1990, Machine Learning.

[13]  John McGee,et al.  Discretization of Time Series Data , 2005, J. Comput. Biol..

[14]  Stephen F. King On Writing: A Memoir of the Craft , 2000 .

[15]  Victoria J. Hodge,et al.  A Survey of Outlier Detection Methodologies , 2004, Artificial Intelligence Review.

[16]  Marti A. Hearst Text Tiling: Segmenting Text into Multi-paragraph Subtopic Passages , 1997, CL.

[17]  H. Zou,et al.  Regularization and variable selection via the elastic net , 2005 .

[18]  Aurélien Géron,et al.  Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems , 2017 .

[19]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[20]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[21]  Yao-Yi Chiang,et al.  Emerging trends in geospatial artificial intelligence (geoAI): potential applications for environmental epidemiology , 2018, Environmental Health.

[22]  J. Cavanaugh Unifying the derivations for the Akaike and corrected Akaike information criteria , 1997 .

[23]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[24]  Anca D. Dragan,et al.  Translating Neuralese , 2017, ACL.

[25]  Huan Liu,et al.  Discretization: An Enabling Technique , 2002, Data Mining and Knowledge Discovery.

[26]  M. Kenward,et al.  Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls , 2009, BMJ : British Medical Journal.

[27]  Deepak S. Turaga,et al.  Feature Engineering for Predictive Modeling using Reinforcement Learning , 2017, AAAI.

[28]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[29]  Yvan Saeys,et al.  Robust Feature Selection Using Ensemble Feature Selection Techniques , 2008, ECML/PKDD.

[30]  Todd C. Kelley,et al.  The Art of Innovation: Lessons in Creativity from IDEO, America's Leading Design Firm , 2001 .

[31]  Christian Bizer,et al.  DBpedia spotlight: shedding light on the web of documents , 2011, I-Semantics '11.

[32]  Thorsten Joachims,et al.  Learning structural SVMs with latent variables , 2009, ICML '09.

[33]  Pedro M. Domingos A few useful things to know about machine learning , 2012, Commun. ACM.

[34]  Raymond J. Mooney,et al.  Learning for Semantic Parsing with Statistical Machine Translation , 2006, NAACL.

[35]  Leo Breiman,et al.  Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) , 2001 .

[36]  Amir Sadeghian,et al.  Feature Engineering for Knowledge Base Construction , 2014, IEEE Data Eng. Bull..

[37]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[38]  O. J. Dunn Multiple Comparisons among Means , 1961 .

[39]  William A. Gale,et al.  Good-Turing Frequency Estimation Without Tears , 1995, J. Quant. Linguistics.

[40]  Wendy G. Lehnert,et al.  Information extraction , 1996, CACM.

[41]  Benjamin Recht,et al.  Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning , 2008, NIPS.

[42]  D. Hubel,et al.  Receptive fields of single neurones in the cat's striate cortex , 1959, The Journal of physiology.

[43]  Jonathan Balzer,et al.  Multi-view feature engineering and learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[45]  E. Oja,et al.  Independent Component Analysis , 2013 .

[46]  Li Wang,et al.  GDAL-based extend ArcGIS Engine's support for HDF file format , 2010, 2010 18th International Conference on Geoinformatics.

[47]  James H. Martin,et al.  Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition , 2000, Prentice Hall series in artificial intelligence.

[48]  Hinrich Schütze,et al.  Introduction to information retrieval , 2008 .

[49]  Terrence J. Sejnowski,et al.  Edges are the Independent Components of Natural Scenes , 1996, NIPS.

[50]  Sergei Vassilvitskii,et al.  k-means++: the advantages of careful seeding , 2007, SODA '07.

[51]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[52]  David Vandyke,et al.  Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems , 2015, EMNLP.

[53]  E. F. Codd,et al.  A relational model of data for large shared data banks , 1970, CACM.

[54]  Ronald L. Rivest,et al.  Introduction to Algorithms , 1990 .

[55]  Michael L. Overton,et al.  Numerical Computing with IEEE Floating Point Arithmetic , 2001 .

[56]  John F. Canny,et al.  A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[57]  Baoxin Li,et al.  On the generality of neural image features , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[58]  Sangkyum Kim,et al.  Authorship classification: a discriminative syntactic tree mining approach , 2011, SIGIR.

[59]  Gérard Dreyfus,et al.  Ranking a Random Feature for Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[60]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[61]  G Stix,et al.  The mice that warred. , 2001, Scientific American.

[62]  Pablo Ariel Duboue,et al.  Deobfuscating Name Scrambling as a Natural Language Generation Task , 2018 .

[63]  R. Wurtz Recounting the impact of Hubel and Wiesel , 2009, The Journal of physiology.

[64]  Charu C. Aggarwal,et al.  Outlier Detection for Temporal Data: A Survey , 2014, IEEE Transactions on Knowledge and Data Engineering.

[65]  Drew Conway,et al.  Machine Learning for Hackers , 2012 .

[66]  Ethem Alpaydin,et al.  Introduction to machine learning , 2004, Adaptive computation and machine learning.

[67]  Tian Zhang,et al.  BIRCH: an efficient data clustering method for very large databases , 1996, SIGMOD '96.

[68]  R.M. Haralick,et al.  Statistical and structural approaches to texture , 1979, Proceedings of the IEEE.

[69]  J. Cavanaugh,et al.  Generalizing the derivation of the schwarz information criterion , 1999 .

[70]  Satoshi Sekine,et al.  A survey of named entity recognition and classification , 2007 .

[71]  Alois Knoll,et al.  Gradient boosting machines, a tutorial , 2013, Front. Neurorobot..

[72]  Evgeniy Gabrilovich,et al.  A Review of Relational Machine Learning for Knowledge Graphs , 2015, Proceedings of the IEEE.

[73]  Jeff T. Heaton,et al.  Automated Feature Engineering for Deep Neural Networks with Genetic Programming , 2017 .

[74]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[75]  Andrew McCallum,et al.  FACTORIE: Probabilistic Programming via Imperatively Defined Factor Graphs , 2009, NIPS.

[76]  S. Roberts Novelty detection using extreme value statistics , 1999 .

[77]  Gerhard Weikum,et al.  YAGO: A Core of Semantic Knowledge , 2007, WWW '07.

[78]  Jean Carletta,et al.  Assessing Agreement on Classification Tasks: The Kappa Statistic , 1996, CL.

[79]  Mathew H. Evans,et al.  Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results , 2018, Advances in Methods and Practices in Psychological Science.

[80]  M. M. Prieto,et al.  Hot Metal Temperature Forecasting at Steel Plant Using Multivariate Adaptive Regression Splines , 2019 .

[81]  Gavin Brown,et al.  Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection , 2012, J. Mach. Learn. Res..

[82]  Dorian Pyle,et al.  Data Preparation for Data Mining , 1999 .

[83]  John D. Kelleher,et al.  Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies , 2015 .

[84]  J. Wolfowitz,et al.  Introduction to the Theory of Statistics. , 1951 .

[85]  David H. Wolpert,et al.  No free lunch theorems for optimization , 1997, IEEE Trans. Evol. Comput..

[86]  J. Friedman Multivariate adaptive regression splines , 1990 .

[87]  N. Littlestone Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).

[88]  Scott Shenker,et al.  Spark: Cluster Computing with Working Sets , 2010, HotCloud.

[89]  J. A. Hartigan,et al.  A k-means clustering algorithm , 1979 .

[90]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[91]  Christiane Fellbaum,et al.  Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[92]  Alan Pankratz,et al.  Forecasting with Dynamic Regression Models , 1991 .

[93]  Gerold Hintz,et al.  Leveraging Crowdsourcing for Paraphrase Recognition , 2013, LAW@ACL.

[94]  Chun-Liang Li,et al.  Rivalry of Two Families of Algorithms for Memory-Restricted Streaming PCA , 2015, AISTATS.

[95]  J. Fleiss Measuring nominal scale agreement among many raters. , 1971 .

[96]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[97]  Peter Bühlmann Regression shrinkage and selection via the Lasso: a retrospective (Robert Tibshirani): Comments on the presentation , 2011 .

[98]  Marcelo A. Montemurro,et al.  Beyond the Zipf-Mandelbrot law in quantitative linguistics , 2001, ArXiv.

[99]  Hao Wang,et al.  Online Streaming Feature Selection , 2010, ICML.

[100]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques with Java implementations , 2002, SGMD.

[101]  Randy Kerber,et al.  ChiMerge: Discretization of Numeric Attributes , 1992, AAAI.

[102]  Caroline Chan,et al.  Determination of quantization intervals in rule based model for dynamic systems , 1991, Conference Proceedings 1991 IEEE International Conference on Systems, Man, and Cybernetics.

[103]  M. F. Porter,et al.  An algorithm for suffix stripping , 1997 .

[104]  André Elisseeff,et al.  Stability and Generalization , 2002, J. Mach. Learn. Res..

[105]  S. Kotsiantis,et al.  Discretization Techniques: A recent survey , 2006 .

[106]  Raymond J. Mooney,et al.  Using Multiple Clause Constructors in Inductive Logic Programming for Semantic Parsing , 2001, ECML.

[107]  Jeff Heaton,et al.  An empirical analysis of feature engineering for predictive modeling , 2016, SoutheastCon 2016.

[108]  Tony Jebara,et al.  Structure preserving embedding , 2009, ICML '09.

[109]  Mahantapas Kundu,et al.  The journey of graph kernels through two decades , 2018, Comput. Sci. Rev..

[110]  D. Sornette,et al.  Stretched exponential distributions in nature and economy: “fat tails” with characteristic scales , 1998, cond-mat/9801293.

[111]  P. Mahalanobis On the generalized distance in statistics , 1936 .

[112]  R. Tibshirani,et al.  Significance analysis of microarrays applied to the ionizing radiation response , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[113]  Max A. Little,et al.  Highly comparative time-series analysis: the empirical structure of time series and their methods , 2013, Journal of The Royal Society Interface.

[114]  Vic Barnett,et al.  Outliers in Statistical Data , 1980 .

[115]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[116]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[117]  Jason Weston,et al.  Gene Selection for Cancer Classification using Support Vector Machines , 2002, Machine Learning.

[118]  Celine Vens,et al.  Random Forest Based Feature Induction , 2011, 2011 IEEE 11th International Conference on Data Mining.

[119]  Zhou Xu,et al.  Vehicle Point of Interest Detection Using In-Car Data , 2018, GeoAI@SIGSPATIAL.

[120]  M. A. Wincek Forecasting With Dynamic Regression Models , 1993 .

[121]  C. Furlanello,et al.  Recursive feature elimination with random forest for PTR-MS analysis of agroindustrial products , 2006 .

[122]  J. Mercer Functions of Positive and Negative Type, and their Connection with the Theory of Integral Equations , 1909 .

[123]  Amit Singhal,et al.  Pivoted document length normalization , 1996, SIGIR 1996.

[124]  Geoffrey E. Hinton,et al.  Autoencoders, Minimum Description Length and Helmholtz Free Energy , 1993, NIPS.

[125]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[126]  Jeffrey E. F. Friedl Mastering Regular Expressions , 1997 .

[127]  Pablo Duboue Automatic Reports from Spreadsheets: Data Analysis for the Rest of Us , 2016, INLG.

[128]  Rada Mihalcea,et al.  Factors Influencing the Surprising Instability of Word Embeddings , 2018, NAACL.

[129]  Michael R. Lyu,et al.  Point-of-Interest Recommendation in Location-Based Social Networks , 2018, SpringerBriefs in Computer Science.

[130]  Hiroshi Motoda,et al.  Computational Methods of Feature Selection , 2007 .

[131]  Matti Pietikäinen,et al.  A comparative study of texture measures with classification based on featured distributions , 1996, Pattern Recognit..

[132]  D. Huff,et al.  How to Lie with Statistics , 1956 .

[133]  Volker Tresp,et al.  Predicting the co-evolution of event and Knowledge Graphs , 2015, 2016 19th International Conference on Information Fusion (FUSION).

[134]  Jingen Xiang,et al.  Scalable Scientific Computing Algorithms Using MapReduce , 2013 .

[135]  J. R. Firth,et al.  A Synopsis of Linguistic Theory, 1930-1955 , 1957 .

[136]  Michael J. Swain,et al.  Indexing via color histograms , 1990, [1990] Proceedings Third International Conference on Computer Vision.

[137]  Noel E. Sharkey,et al.  Connectionist Natural Language Processing , 1992 .

[138]  Jens Lehmann,et al.  DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia , 2015, Semantic Web.

[139]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[140]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[141]  Jim X. Chen,et al.  The Evolution of Computing: AlphaGo , 2016, Comput. Sci. Eng..

[142]  Ruben Verborgh,et al.  Using OpenRefine , 2013 .

[143]  W. John Wilbur,et al.  The automatic identification of stop words , 1992, J. Inf. Sci..

[144]  Ehud Reiter,et al.  Knowledge Acquisition for Natural Language Generation , 2000, INLG.

[145]  Donald Eastlake,et al.  The FNV Non-Cryptographic Hash Algorithm , 2019 .

[146]  Alexander Gribov,et al.  New Flexible Non-parametric Data Transformation for Trans-Gaussian Kriging , 2012 .

[147]  Pablo Ariel Duboué,et al.  On the Robustness of Standalone Referring Expression Generation Algorithms Using RDF Data , 2016, WebNLG.

[148]  Kathleen R. McKeown,et al.  Indirect supervised learning of strategic generation logic , 2005 .

[149]  Adrian Akmajian,et al.  Linguistics: An Introduction to Language and Communication , 1979 .

[150]  Sean Owen,et al.  Mahout in Action , 2011 .

[151]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[152]  Christopher K. I. Williams,et al.  Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) , 2005 .

[153]  Gavin Brown,et al.  On the Stability of Feature Selection Algorithms , 2017, J. Mach. Learn. Res..

[154]  Jasper Snoek,et al.  Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.

[155]  Vasileios Hatzivassiloglou,et al.  Disambiguating proteins, genes, and RNA in text: a machine learning approach , 2001, ISMB.

[156]  Viktor Mayer-Schönberger,et al.  Big Data: A Revolution That Will Transform How We Live, Work, and Think , 2013 .

[157]  Thomas Hofmann,et al.  Unsupervised Learning by Probabilistic Latent Semantic Analysis , 2004, Machine Learning.

[158]  Alexei Pozdnoukhov,et al.  Monitoring network optimisation for spatial data classification using support vector machines , 2006 .

[159]  Nathan Halko,et al.  Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions , 2009, SIAM Rev..

[160]  S. V. N. Vishwanathan,et al.  Graph kernels , 2007 .

[161]  David B. Searls,et al.  Grammatical Representations of Macromolecular Structure , 2006, J. Comput. Biol..

[162]  Ron Kohavi,et al.  Wrappers for Feature Subset Selection , 1997, Artif. Intell..

[163]  Nathan Srebro,et al.  Explicit Approximations of the Gaussian Kernel , 2011, ArXiv.

[164]  James J. Little,et al.  Play and Learn: Using Video Games to Train Computer Vision Models , 2016, BMVC.

[165]  Andrew McCallum,et al.  Efficient clustering of high-dimensional data sets with application to reference matching , 2000, KDD '00.

[166]  Daniel M. Bikel,et al.  Intricacies of Collins’ Parsing Model , 2004, CL.

[167]  Petros Drineas,et al.  On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning , 2005, J. Mach. Learn. Res..

[168]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[169]  William W. Cohen Learning Trees and Rules with Set-Valued Features , 1996, AAAI/IAAI, Vol. 1.

[170]  Mikko Kurimo,et al.  Morfessor 2.0: Toolkit for statistical morphological segmentation , 2014, EACL.

[171]  R. K. Rao Yarlagadda,et al.  Analog and Digital Signals and Systems , 2009 .

[172]  Hans-Peter Kriegel,et al.  Factorizing YAGO: scalable machine learning for linked data , 2012, WWW.

[173]  Hongtao Lu,et al.  Locality Preserving Hashing , 2014, AAAI.

[174]  Isabelle Guyon,et al.  An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[175]  Geoffrey E. Hinton,et al.  Deep Learning , 2015, Nature.

[176]  Heiko Paulheim,et al.  RDF2Vec: RDF Graph Embeddings for Data Mining , 2016, SEMWEB.

[177]  Sebastian Ruder,et al.  Universal Language Model Fine-tuning for Text Classification , 2018, ACL.

[178]  Jure Leskovec,et al.  node2vec: Scalable Feature Learning for Networks , 2016, KDD.

[179]  K. Thorup,et al.  Intra‐African movements of the African cuckoo Cuculus gularis as revealed by satellite telemetry , 2018 .

[180]  Larry A. Rendell,et al.  The Feature Selection Problem: Traditional Methods and a New Algorithm , 1992, AAAI.

[181]  Michalis Vazirgiannis,et al.  On Clustering Validation Techniques , 2001, Journal of Intelligent Information Systems.

[182]  Kalyan Veeramachaneni,et al.  Deep feature synthesis: Towards automating data science endeavors , 2015, 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA).

[183]  Lukasz A. Kurgan,et al.  CAIM discretization algorithm , 2004, IEEE Transactions on Knowledge and Data Engineering.

[184]  Glenn J. Myatt Making Sense of Data I: A Practical Guide to Exploratory Data Analysis and Data Mining , 2006 .

[185]  Chris Chatfield,et al.  The Analysis of Time Series: An Introduction , 1981 .

[186]  Aditya Kalyanpur,et al.  A framework for merging and ranking of answers in DeepQA , 2012, IBM J. Res. Dev..

[187]  Emden R. Gansner,et al.  Graphviz - Open Source Graph Drawing Tools , 2001, GD.

[188]  Ron Kohavi,et al.  Supervised and Unsupervised Discretization of Continuous Features , 1995, ICML.

[189]  Mohak Shah,et al.  Evaluating Learning Algorithms: A Classification Perspective , 2011 .

[190]  Masoud Nikravesh,et al.  Feature Extraction - Foundations and Applications , 2006, Feature Extraction.

[191]  Edoardo Amaldi,et al.  On the Approximability of Minimizing Nonzero Variables or Unsatisfied Relations in Linear Systems , 1998, Theor. Comput. Sci..

[192]  Houkuan Huang,et al.  Feature selection for text classification with Naïve Bayes , 2009, Expert Syst. Appl..

[193]  Leo Breiman,et al.  Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) , 2001, Statistical Science.

[194]  Philip S. Yu,et al.  A General Survey of Privacy-Preserving Data Mining Models and Algorithms , 2008, Privacy-Preserving Data Mining.

[195]  F. Maxwell Harper,et al.  The MovieLens Datasets: History and Context , 2016, TIIS.

[196]  Aris Floratos,et al.  Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm [published erratum appears in Bioinformatics 1998;14(2): 229] , 1998, Bioinform..

[197]  Huan Liu,et al.  Feature Engineering for Machine Learning and Data Analytics , 2018 .

[198]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[199]  David Goldberg,et al.  What every computer scientist should know about floating-point arithmetic , 1991, CSUR.

[200]  Francisco Herrera,et al.  A Survey of Discretization Techniques: Taxonomy and Empirical Analysis in Supervised Learning , 2013, IEEE Transactions on Knowledge and Data Engineering.

[201]  Raymond J. Mooney,et al.  Creating diverse ensemble classifiers to reduce supervision , 2005 .

[202]  M. de Rijke,et al.  An Introduction to Click Models for Web Search: SIGIR 2015 Tutorial , 2015, SIGIR.

[203]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[204]  G. Amdahl,et al.  Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).

[205]  Fuhui Long,et al.  Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy , 2003, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[206]  Thorsten Brants,et al.  Randomized Language Models via Perfect Hash Functions , 2008, ACL.

[207]  A. Gelman Analysis of variance: Why it is more important than ever? , 2005, math/0504499.

[208]  Jeff Heaton,et al.  Encog: library of interchangeable machine learning models for Java and C# , 2015, J. Mach. Learn. Res..

[209]  John R. Koza,et al.  Genetic Programming II , 1992 .

[210]  Wei Xu,et al.  Bidirectional LSTM-CRF Models for Sequence Tagging , 2015, ArXiv.

[211]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[212]  David A. Ferrucci,et al.  UIMA: an architectural approach to unstructured information processing in the corporate research environment , 2004, Natural Language Engineering.

[213]  Damián Barsotti,et al.  Predicting Invariant Nodes in Large Scale Semantic Knowledge Graphs , 2017, SIMBig.

[214]  Shou-De Lin,et al.  Feature Engineering and Classifier Ensemble for KDD Cup 2010 , 2010, KDD 2010.

[215]  Hans-Peter Kriegel,et al.  A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.

[216]  Nathalie Japkowicz,et al.  A Novelty Detection Approach to Classification , 1995, IJCAI.

[217]  Risi Kondor,et al.  Diffusion kernels on graphs and other discrete structures , 2002, ICML 2002.

[218]  Steven Skiena,et al.  DeepWalk: online learning of social representations , 2014, KDD.

[219]  Usama M. Fayyad,et al.  Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning , 1993, IJCAI.

[220]  George H. John Robust Decision Trees: Removing Outliers from Databases , 1995, KDD.

[221]  W. Featherstone,et al.  Comparison and validation of the recent freely available ASTER-GDEM ver1, SRTM ver4.1 and GEODATA DEM-9S ver3 digital elevation models over Australia , 2010 .

[222]  Frank Hutter,et al.  Neural Architecture Search , 2019, Automated Machine Learning.

[223]  Praveen Paritosh,et al.  Freebase: a collaboratively created graph database for structuring human knowledge , 2008, SIGMOD Conference.

[224]  Alexander J. Smola,et al.  Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.

[225]  Keinosuke Fukunaga,et al.  Introduction to Statistical Pattern Recognition , 1972 .

[226]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[227]  Kenneth Ward Church,et al.  Very sparse random projections , 2006, KDD '06.

[228]  Charu C. Aggarwal,et al.  Outlier Analysis , 2013, Springer New York.

[229]  Frank Puppe,et al.  UIMA Ruta: Rapid development of rule-based information extraction applications , 2014, Natural Language Engineering.

[230]  Djamila Aouada,et al.  Feature engineering strategies for credit card fraud detection , 2016, Expert Syst. Appl..

[231]  Virginia E. Eubanks Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor , 2018 .

[232]  R. Tibshirani,et al.  Regression shrinkage and selection via the lasso: a retrospective , 2011 .

[233]  Christopher Ré,et al.  Materialization Optimizations for Feature Selection Workloads , 2016, TODS.

[234]  Gail C. Murphy,et al.  Predicting source code changes by mining change history , 2004, IEEE Transactions on Software Engineering.

[235]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[236]  A. E. Hoerl,et al.  Ridge regression: biased estimation for nonorthogonal problems , 2000 .