Paper Information - The Art of Feature Engineering - 字舞流文

The Art of Feature Engineering

Pablo Duboue

[1] Rachel Schutt,et al. Doing Data Science , 2013 .

[2] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[3] Michael J. Cafarella,et al. Input selection for fast feature engineering , 2016, 2016 IEEE 32nd International Conference on Data Engineering (ICDE).

[4] D. Gabor,et al. Theory of communication. Part 1: The analysis of information , 1946 .

[5] Fillia Makedon,et al. Learning from Incomplete Ratings Using Non-negative Matrix Factorization , 2006, SDM.

[6] Carsten Steger,et al. Machine Vision Algorithms , 2007 .

[7] Dragomir R. Radev,et al. Book Review: Graph-Based Natural Language Processing and Information Retrieval by Rada Mihalcea and Dragomir Radev , 2011, CL.

[8] Benjamin Recht,et al. Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[9] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.

[10] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.

[11] Ivan Markovsky,et al. Low Rank Approximation - Algorithms, Implementation, Applications , 2018, Communications and Control Engineering.

[12] D. Haussler,et al. Boolean Feature Discovery in Empirical Learning , 1990, Machine Learning.

[13] John McGee,et al. Discretization of Time Series Data , 2005, J. Comput. Biol..

[14] Stephen F. King. On Writing: A Memoir of the Craft , 2000 .

[15] Victoria J. Hodge,et al. A Survey of Outlier Detection Methodologies , 2004, Artificial Intelligence Review.

[16] Marti A. Hearst. Text Tiling: Segmenting Text into Multi-paragraph Subtopic Passages , 1997, CL.

[17] H. Zou,et al. Regularization and variable selection via the elastic net , 2005 .

[18] Aurélien Géron,et al. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems , 2017 .

[19] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[20] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[21] Yao-Yi Chiang,et al. Emerging trends in geospatial artificial intelligence (geoAI): potential applications for environmental epidemiology , 2018, Environmental Health.

[22] J. Cavanaugh. Unifying the derivations for the Akaike and corrected Akaike information criteria , 1997 .

[23] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.

[24] Anca D. Dragan,et al. Translating Neuralese , 2017, ACL.

[25] Huan Liu,et al. Discretization: An Enabling Technique , 2002, Data Mining and Knowledge Discovery.

[26] M. Kenward,et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls , 2009, BMJ : British Medical Journal.

[27] Deepak S. Turaga,et al. Feature Engineering for Predictive Modeling using Reinforcement Learning , 2017, AAAI.

[28] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[29] Yvan Saeys,et al. Robust Feature Selection Using Ensemble Feature Selection Techniques , 2008, ECML/PKDD.

[30] Todd C. Kelley,et al. The Art of Innovation: Lessons in Creativity from IDEO, America's Leading Design Firm , 2001 .

[31] Christian Bizer,et al. DBpedia spotlight: shedding light on the web of documents , 2011, I-Semantics '11.

[32] Thorsten Joachims,et al. Learning structural SVMs with latent variables , 2009, ICML '09.

[33] Pedro M. Domingos. A few useful things to know about machine learning , 2012, Commun. ACM.

[34] Raymond J. Mooney,et al. Learning for Semantic Parsing with Statistical Machine Translation , 2006, NAACL.

[35] Leo Breiman,et al. Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) , 2001 .

[36] Amir Sadeghian,et al. Feature Engineering for Knowledge Base Construction , 2014, IEEE Data Eng. Bull..

[37] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[38] O. J. Dunn. Multiple Comparisons among Means , 1961 .

[39] William A. Gale,et al. Good-Turing Frequency Estimation Without Tears , 1995, J. Quant. Linguistics.

[40] Wendy G. Lehnert,et al. Information extraction , 1996, CACM.

[41] Benjamin Recht,et al. Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning , 2008, NIPS.

[42] D. Hubel,et al. Receptive fields of single neurones in the cat's striate cortex , 1959, The Journal of physiology.

[43] Jonathan Balzer,et al. Multi-view feature engineering and learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44] David G. Lowe,et al. Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[45] E. Oja,et al. Independent Component Analysis , 2013 .

[46] Li Wang,et al. GDAL-based extend ArcGIS Engine's support for HDF file format , 2010, 2010 18th International Conference on Geoinformatics.

[47] James H. Martin,et al. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition , 2000, Prentice Hall series in artificial intelligence.

[48] Hinrich Schütze,et al. Introduction to information retrieval , 2008 .

[49] Terrence J. Sejnowski,et al. Edges are the Independent Components of Natural Scenes , 1996, NIPS.

[50] Sergei Vassilvitskii,et al. k-means++: the advantages of careful seeding , 2007, SODA '07.

[51] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[52] David Vandyke,et al. Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems , 2015, EMNLP.

[53] E. F. Codd,et al. A relational model of data for large shared data banks , 1970, CACM.

[54] Ronald L. Rivest,et al. Introduction to Algorithms , 1990 .

[55] Michael L. Overton,et al. Numerical Computing with IEEE Floating Point Arithmetic , 2001 .

[56] John F. Canny,et al. A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[57] Baoxin Li,et al. On the generality of neural image features , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[58] Sangkyum Kim,et al. Authorship classification: a discriminative syntactic tree mining approach , 2011, SIGIR.

[59] Gérard Dreyfus,et al. Ranking a Random Feature for Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[60] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.

[61] G Stix,et al. The mice that warred. , 2001, Scientific American.

[62] Pablo Ariel Duboue,et al. Deobfuscating Name Scrambling as a Natural Language Generation Task , 2018 .

[63] R. Wurtz. Recounting the impact of Hubel and Wiesel , 2009, The Journal of physiology.

[64] Charu C. Aggarwal,et al. Outlier Detection for Temporal Data: A Survey , 2014, IEEE Transactions on Knowledge and Data Engineering.

[65] Drew Conway,et al. Machine Learning for Hackers , 2012 .

[66] Ethem Alpaydin,et al. Introduction to machine learning , 2004, Adaptive computation and machine learning.

[67] Tian Zhang,et al. BIRCH: an efficient data clustering method for very large databases , 1996, SIGMOD '96.

[68] R.M. Haralick,et al. Statistical and structural approaches to texture , 1979, Proceedings of the IEEE.

[69] J. Cavanaugh,et al. Generalizing the derivation of the schwarz information criterion , 1999 .

[70] Satoshi Sekine,et al. A survey of named entity recognition and classification , 2007 .

[71] Alois Knoll,et al. Gradient boosting machines, a tutorial , 2013, Front. Neurorobot..

[72] Evgeniy Gabrilovich,et al. A Review of Relational Machine Learning for Knowledge Graphs , 2015, Proceedings of the IEEE.

[73] Jeff T. Heaton,et al. Automated Feature Engineering for Deep Neural Networks with Genetic Programming , 2017 .

[74] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[75] Andrew McCallum,et al. FACTORIE: Probabilistic Programming via Imperatively Defined Factor Graphs , 2009, NIPS.

[76] S. Roberts. Novelty detection using extreme value statistics , 1999 .

[77] Gerhard Weikum,et al. YAGO: A Core of Semantic Knowledge , 2007, WWW '07.

[78] Jean Carletta,et al. Assessing Agreement on Classification Tasks: The Kappa Statistic , 1996, CL.

[79] Mathew H. Evans,et al. Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results , 2018, Advances in Methods and Practices in Psychological Science.

[80] M. M. Prieto,et al. Hot Metal Temperature Forecasting at Steel Plant Using Multivariate Adaptive Regression Splines , 2019 .

[81] Gavin Brown,et al. Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection , 2012, J. Mach. Learn. Res..

[82] Dorian Pyle,et al. Data Preparation for Data Mining , 1999 .

[83] John D. Kelleher,et al. Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies , 2015 .

[84] J. Wolfowitz,et al. Introduction to the Theory of Statistics. , 1951 .

[85] David H. Wolpert,et al. No free lunch theorems for optimization , 1997, IEEE Trans. Evol. Comput..

[86] J. Friedman. Multivariate adaptive regression splines , 1990 .

[87] N. Littlestone. Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).

[88] Scott Shenker,et al. Spark: Cluster Computing with Working Sets , 2010, HotCloud.

[89] J. A. Hartigan,et al. A k-means clustering algorithm , 1979 .

[90] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[91] Christiane Fellbaum,et al. Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[92] Alan Pankratz,et al. Forecasting with Dynamic Regression Models , 1991 .

[93] Gerold Hintz,et al. Leveraging Crowdsourcing for Paraphrase Recognition , 2013, LAW@ACL.

[94] Chun-Liang Li,et al. Rivalry of Two Families of Algorithms for Memory-Restricted Streaming PCA , 2015, AISTATS.

[95] J. Fleiss. Measuring nominal scale agreement among many raters. , 1971 .

[96] H. Sebastian Seung,et al. Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[97] Peter Bühlmann. Regression shrinkage and selection via the Lasso: a retrospective (Robert Tibshirani): Comments on the presentation , 2011 .

[98] Marcelo A. Montemurro,et al. Beyond the Zipf-Mandelbrot law in quantitative linguistics , 2001, ArXiv.

[99] Hao Wang,et al. Online Streaming Feature Selection , 2010, ICML.

[100] Ian H. Witten,et al. Data mining: practical machine learning tools and techniques with Java implementations , 2002, SGMD.

[101] Randy Kerber,et al. ChiMerge: Discretization of Numeric Attributes , 1992, AAAI.

[102] Caroline Chan,et al. Determination of quantization intervals in rule based model for dynamic systems , 1991, Conference Proceedings 1991 IEEE International Conference on Systems, Man, and Cybernetics.

[103] M. F. Porter,et al. An algorithm for suffix stripping , 1997 .

[104] André Elisseeff,et al. Stability and Generalization , 2002, J. Mach. Learn. Res..

[105] S. Kotsiantis,et al. Discretization Techniques: A recent survey , 2006 .

[106] Raymond J. Mooney,et al. Using Multiple Clause Constructors in Inductive Logic Programming for Semantic Parsing , 2001, ECML.

[107] Jeff Heaton,et al. An empirical analysis of feature engineering for predictive modeling , 2016, SoutheastCon 2016.

[108] Tony Jebara,et al. Structure preserving embedding , 2009, ICML '09.

[109] Mahantapas Kundu,et al. The journey of graph kernels through two decades , 2018, Comput. Sci. Rev..

[110] D. Sornette,et al. Stretched exponential distributions in nature and economy: “fat tails” with characteristic scales , 1998, cond-mat/9801293.

[111] P. Mahalanobis. On the generalized distance in statistics , 1936 .

[112] R. Tibshirani,et al. Significance analysis of microarrays applied to the ionizing radiation response , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[113] Max A. Little,et al. Highly comparative time-series analysis: the empirical structure of time series and their methods , 2013, Journal of The Royal Society Interface.

[114] Vic Barnett,et al. Outliers in Statistical Data , 1980 .

[115] Leo Breiman,et al. Random Forests , 2001, Machine Learning.

[116] Ramakrishnan Srikant,et al. Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[117] Jason Weston,et al. Gene Selection for Cancer Classification using Support Vector Machines , 2002, Machine Learning.

[118] Celine Vens,et al. Random Forest Based Feature Induction , 2011, 2011 IEEE 11th International Conference on Data Mining.

[119] Zhou Xu,et al. Vehicle Point of Interest Detection Using In-Car Data , 2018, GeoAI@SIGSPATIAL.

[120] M. A. Wincek. Forecasting With Dynamic Regression Models , 1993 .

[121] C. Furlanello,et al. Recursive feature elimination with random forest for PTR-MS analysis of agroindustrial products , 2006 .

[122] J. Mercer. Functions of Positive and Negative Type, and their Connection with the Theory of Integral Equations , 1909 .

[123] Amit Singhal,et al. Pivoted document length normalization , 1996, SIGIR 1996.

[124] Geoffrey E. Hinton,et al. Autoencoders, Minimum Description Length and Helmholtz Free Energy , 1993, NIPS.

[125] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[126] Jeffrey E. F. Friedl. Mastering Regular Expressions , 1997 .

[127] Pablo Duboue. Automatic Reports from Spreadsheets: Data Analysis for the Rest of Us , 2016, INLG.

[128] Rada Mihalcea,et al. Factors Influencing the Surprising Instability of Word Embeddings , 2018, NAACL.

[129] Michael R. Lyu,et al. Point-of-Interest Recommendation in Location-Based Social Networks , 2018, SpringerBriefs in Computer Science.

[130] Hiroshi Motoda,et al. Computational Methods of Feature Selection , 2022 .

[131] Matti Pietikäinen,et al. A comparative study of texture measures with classification based on featured distributions , 1996, Pattern Recognit..

[132] D. Huff,et al. How to Lie with Statistics , 1956 .

[133] Volker Tresp,et al. Predicting the co-evolution of event and Knowledge Graphs , 2015, 2016 19th International Conference on Information Fusion (FUSION).

[134] Jingen Xiang,et al. Scalable Scientific Computing Algorithms Using MapReduce , 2013 .

[135] J. R. Firth,et al. A Synopsis of Linguistic Theory, 1930-1955 , 1957 .

[136] Michael J. Swain,et al. Indexing via color histograms , 1990, [1990] Proceedings Third International Conference on Computer Vision.

[137] Noel E. Sharkey,et al. Connectionist Natural Language Processing , 1992 .

[138] Jens Lehmann,et al. DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia , 2015, Semantic Web.

[139] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.

[140] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .

[141] Jim X. Chen,et al. The Evolution of Computing: AlphaGo , 2016, Comput. Sci. Eng..

[142] Ruben Verborgh,et al. Using OpenRefine , 2013 .

[143] W. John Wilbur,et al. The automatic identification of stop words , 1992, J. Inf. Sci..

[144] Ehud Reiter,et al. Knowledge Acquisition for Natural Language Generation , 2000, INLG.

[145] Donald Eastlake,et al. The FNV Non-Cryptographic Hash Algorithm , 2019 .

[146] Alexander Gribov,et al. New Flexible Non-parametric Data Transformation for Trans-Gaussian Kriging , 2012 .

[147] Pablo Ariel Duboué,et al. On the Robustness of Standalone Referring Expression Generation Algorithms Using RDF Data , 2016, WebNLG.

[148] Kathleen R. McKeown,et al. Indirect supervised learning of strategic generation logic , 2005 .

[149] Adrian Akmajian,et al. Linguistics: An Introduction to Language and Communication , 1979 .

[150] Sean Owen,et al. Mahout in Action , 2011 .

[151] Michael I. Jordan,et al. Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[152] Christopher K. I. Williams,et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) , 2005 .

[153] Gavin Brown,et al. On the Stability of Feature Selection Algorithms , 2017, J. Mach. Learn. Res..

[154] Jasper Snoek,et al. Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.

[155] Vasileios Hatzivassiloglou,et al. Disambiguating proteins, genes, and RNA in text: a machine learning approach , 2001, ISMB.

[156] Viktor Mayer-Schönberger,et al. Big Data: A Revolution That Will Transform How We Live, Work, and Think , 2013 .

[157] Thomas Hofmann,et al. Unsupervised Learning by Probabilistic Latent Semantic Analysis , 2004, Machine Learning.

[158] Alexei Pozdnoukhov,et al. Monitoring network optimisation for spatial data classification using support vector machines , 2006 .

[159] Nathan Halko,et al. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions , 2009, SIAM Rev..

[160] S. V. N. Vishwanathan,et al. Graph kernels , 2007 .

[161] David B. Searls,et al. Grammatical Representations of Macromolecular Structure , 2006, J. Comput. Biol..

[162] Ron Kohavi,et al. Wrappers for Feature Subset Selection , 1997, Artif. Intell..

[163] Nathan Srebro,et al. Explicit Approximations of the Gaussian Kernel , 2011, ArXiv.

[164] James J. Little,et al. Play and Learn: Using Video Games to Train Computer Vision Models , 2016, BMVC.

[165] Andrew McCallum,et al. Efficient clustering of high-dimensional data sets with application to reference matching , 2000, KDD '00.

[166] Daniel M. Bikel,et al. Intricacies of Collins’ Parsing Model , 2004, CL.

[167] Petros Drineas,et al. On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning , 2005, J. Mach. Learn. Res..

[168] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.

[169] William W. Cohen. Learning Trees and Rules with Set-Valued Features , 1996, AAAI/IAAI, Vol. 1.

[170] Mikko Kurimo,et al. Morfessor 2.0: Toolkit for statistical morphological segmentation , 2014, EACL.

[171] R. K. Rao Yarlagadda,et al. Analog and Digital Signals and Systems , 2009 .

[172] Hans-Peter Kriegel,et al. Factorizing YAGO: scalable machine learning for linked data , 2012, WWW.

[173] Hongtao Lu,et al. Locality Preserving Hashing , 2014, AAAI.

[174] Isabelle Guyon,et al. An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[175] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.

[176] Heiko Paulheim,et al. RDF2Vec: RDF Graph Embeddings for Data Mining , 2016, SEMWEB.

[177] Sebastian Ruder,et al. Universal Language Model Fine-tuning for Text Classification , 2018, ACL.

[178] Jure Leskovec,et al. node2vec: Scalable Feature Learning for Networks , 2016, KDD.

[179] K. Thorup,et al. Intra‐African movements of the African cuckoo Cuculus gularis as revealed by satellite telemetry , 2018 .

[180] Larry A. Rendell,et al. The Feature Selection Problem: Traditional Methods and a New Algorithm , 1992, AAAI.

[181] Michalis Vazirgiannis,et al. On Clustering Validation Techniques , 2001, Journal of Intelligent Information Systems.

[182] Kalyan Veeramachaneni,et al. Deep feature synthesis: Towards automating data science endeavors , 2015, 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA).

[183] Lukasz A. Kurgan,et al. CAIM discretization algorithm , 2004, IEEE Transactions on Knowledge and Data Engineering.

[184] Glenn J. Myatt. Making Sense of Data I: A Practical Guide to Exploratory Data Analysis and Data Mining , 2006 .

[185] Chris Chatfield,et al. The Analysis of Time Series: An Introduction , 1981 .

[186] Aditya Kalyanpur,et al. A framework for merging and ranking of answers in DeepQA , 2012, IBM J. Res. Dev..

[187] Emden R. Gansner,et al. Graphviz - Open Source Graph Drawing Tools , 2001, GD.

[188] Ron Kohavi,et al. Supervised and Unsupervised Discretization of Continuous Features , 1995, ICML.

[189] Mohak Shah,et al. Evaluating Learning Algorithms: A Classification Perspective , 2011 .

[190] Masoud Nikravesh,et al. Feature Extraction - Foundations and Applications , 2006, Feature Extraction.

[191] Edoardo Amaldi,et al. On the Approximability of Minimizing Nonzero Variables or Unsatisfied Relations in Linear Systems , 1998, Theor. Comput. Sci..

[192] Houkuan Huang,et al. Feature selection for text classification with Naïve Bayes , 2009, Expert Syst. Appl..

[193] Leo Breiman,et al. Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) , 2001, Statistical Science.

[194] Philip S. Yu,et al. A General Survey of Privacy-Preserving Data Mining Models and Algorithms , 2008, Privacy-Preserving Data Mining.

[195] F. Maxwell Harper,et al. The MovieLens Datasets: History and Context , 2016, TIIS.

[196] Aris Floratos,et al. Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm [published erratum appears in Bioinformatics 1998;14(2): 229] , 1998, Bioinform..

[197] Huan Liu,et al. Feature Engineering for Machine Learning and Data Analytics , 2018 .

[198] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[199] David Goldberg,et al. What every computer scientist should know about floating-point arithmetic , 1991, CSUR.

[200] Francisco Herrera,et al. A Survey of Discretization Techniques: Taxonomy and Empirical Analysis in Supervised Learning , 2013, IEEE Transactions on Knowledge and Data Engineering.

[201] Raymond J. Mooney,et al. Creating diverse ensemble classifiers to reduce supervision , 2005 .

[202] M. de Rijke,et al. An Introduction to Click Models for Web Search: SIGIR 2015 Tutorial , 2015, SIGIR.

[203] D. Rubin,et al. Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[204] G. Amdahl,et al. Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).

[205] Fuhui Long,et al. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy , 2003, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[206] Thorsten Brants,et al. Randomized Language Models via Perfect Hash Functions , 2008, ACL.

[207] A. Gelman. Analysis of variance: Why it is more important than ever? , 2005, math/0504499.

[208] Jeff Heaton,et al. Encog: library of interchangeable machine learning models for Java and C# , 2015, J. Mach. Learn. Res..

[209] John R. Koza,et al. Genetic Programming II , 1992 .

[210] Wei Xu,et al. Bidirectional LSTM-CRF Models for Sequence Tagging , 2015, ArXiv.

[211] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[212] David A. Ferrucci,et al. UIMA: an architectural approach to unstructured information processing in the corporate research environment , 2004, Natural Language Engineering.

[213] Damián Barsotti,et al. Predicting Invariant Nodes in Large Scale Semantic Knowledge Graphs , 2017, SIMBig.

[214] Shou-De Lin,et al. Feature Engineering and Classifier Ensemble for KDD Cup 2010 , 2010, KDD 2010.

[215] Hans-Peter Kriegel,et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.

[216] Nathalie Japkowicz,et al. A Novelty Detection Approach to Classification , 1995, IJCAI.

[217] Risi Kondor,et al. Diffusion kernels on graphs and other discrete structures , 2002, ICML 2002.

[218] Steven Skiena,et al. DeepWalk: online learning of social representations , 2014, KDD.

[219] Usama M. Fayyad,et al. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning , 1993, IJCAI.

[220] George H. John. Robust Decision Trees: Removing Outliers from Databases , 1995, KDD.

[221] W. Featherstone,et al. Comparison and validation of the recent freely available ASTER-GDEM ver1, SRTM ver4.1 and GEODATA DEM-9S ver3 digital elevation models over Australia , 2010 .

[222] Frank Hutter,et al. Neural Architecture Search , 2019, Automated Machine Learning.

[223] Praveen Paritosh,et al. Freebase: a collaboratively created graph database for structuring human knowledge , 2008, SIGMOD Conference.

[224] Alexander J. Smola,et al. Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.

[225] Keinosuke Fukunaga,et al. Introduction to Statistical Pattern Recognition , 1972 .

[226] Christopher G. Harris,et al. A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[227] Kenneth Ward Church,et al. Very sparse random projections , 2006, KDD '06.

[228] Charu C. Aggarwal,et al. Outlier Analysis , 2013, Springer New York.

[229] Frank Puppe,et al. UIMA Ruta: Rapid development of rule-based information extraction applications , 2014, Natural Language Engineering.

[230] Djamila Aouada,et al. Feature engineering strategies for credit card fraud detection , 2016, Expert Syst. Appl..

[231] Virginia E. Eubanks. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor , 2018 .

[232] R. Tibshirani,et al. Regression shrinkage and selection via the lasso: a retrospective , 2011 .

[233] Christopher Ré,et al. Materialization Optimizations for Feature Selection Workloads , 2016, TODS.

[234] Gail C. Murphy,et al. Predicting source code changes by mining change history , 2004, IEEE Transactions on Software Engineering.

[235] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[236] A. E. Hoerl,et al. Ridge regression: biased estimation for nonorthogonal problems , 2000 .