Efficient Construction of Approximate Ad-Hoc ML models Through Materialization and Reuse
Nick Koudas | Abolfazl Asudeh | Gautam Das | Saravanan Thirumuruganathan | Sona Hasani
[1] Ion Stoica, et al. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics, 2016, NSDI.
[2] Dan Feldman, et al. Turning big data into tiny data: Constant-size coresets for k-means, PCA and projective clustering, 2013, SODA.
[3] Nir Ailon, et al. Streaming k-means approximation, 2009, NIPS.
[4] Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.
[5] Christos Faloutsos, et al. NetCube: A Scalable Tool for Fast Data Mining and Compression, 2001, VLDB.
[6] Larry S. Davis, et al. ModelHub: Deep Learning Lifecycle Management, 2017, 2017 IEEE 33rd International Conference on Data Engineering (ICDE).
[7] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.
[8] Gideon S. Mann, et al. Distributed Training Strategies for the Structured Perceptron, 2010, NAACL.
[9] Kasturi R. Varadarajan, et al. Geometric Approximation via Coresets, 2007.
[10] Christian Hennig, et al. Methods for merging Gaussian mixture components, 2010, Adv. Data Anal. Classif.
[11] Martin J. Wainwright, et al. Communication-efficient algorithms for statistical optimization, 2012, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[12] Bin Cui, et al. MLog: Towards Declarative In-Database Machine Learning, 2017, Proc. VLDB Endow.
[13] Ivor W. Tsang, et al. Core Vector Machines: Fast SVM Training on Very Large Data Sets, 2005, J. Mach. Learn. Res.
[14] Alon Y. Halevy, et al. Goods: Organizing Google's Datasets, 2016, SIGMOD Conference.
[15] Adrian E. Raftery, et al. Computing Normalizing Constants for Finite Mixture Models via Incremental Mixture Importance Sampling (IMIS), 2006.
[16] Andreas Krause, et al. Scalable Training of Mixture Models via Coresets, 2011, NIPS.
[17] Kun Li, et al. The MADlib Analytics Library or MAD Skills, the SQL, 2012, Proc. VLDB Endow.
[18] Wilko Schwarting, et al. Training Support Vector Machines using Coresets, 2017, ArXiv.
[19] Jun Yang, et al. Data Management in Machine Learning: Challenges, Techniques, and Systems, 2017, SIGMOD Conference.
[20] Sergei Vassilvitskii, et al. k-means++: the advantages of careful seeding, 2007, SODA '07.
[21] Gideon S. Mann, et al. Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models, 2009, NIPS.
[22] Andreas Krause, et al. Scalable and Distributed Clustering via Lightweight Coresets, 2017, ArXiv.
[23] Adam Meyerson, et al. Fast and Accurate k-means For Large Datasets, 2011, NIPS.
[24] Sudipto Guha, et al. Clustering Data Streams: Theory and Practice, 2003, IEEE Trans. Knowl. Data Eng.
[25] Surajit Chaudhuri, et al. Dynamic sample selection for approximate query processing, 2003, SIGMOD '03.
[26] Xintao Wu, et al. Loglinear-Based Quasi Cubes, 2004, Journal of Intelligent Information Systems.
[27] Sariel Har-Peled, et al. On coresets for k-means and k-median clustering, 2004, STOC '04.
[28] Joseph Gonzalez, et al. Hemingway: Modeling Distributed Optimization Algorithms, 2017, ArXiv.
[29] A.R. Runnalls, et al. A Kullback-Leibler Approach to Gaussian Mixture Reduction, 2007.
[30] Tong Zhang, et al. Local Uncertainty Sampling for Large-Scale Multi-Class Logistic Regression, 2016, The Annals of Statistics.
[31] Hamid Pirahesh, et al. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals, 1996, Data Mining and Knowledge Discovery.
[32] Yi Lin, et al. Prediction Cubes, 2005, VLDB.
[33] Kristian Kersting, et al. Coreset based Dependency Networks, 2017.
[34] Minos N. Garofalakis, et al. Approximate Query Processing: Taming the TeraBytes, 2001, VLDB.
[35] Ryan Johnson, et al. Processing Analytical Workloads Incrementally, 2015, ArXiv.
[36] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[37] Trevor Campbell, et al. Coresets for Scalable Bayesian Logistic Regression, 2016, NIPS.
[38] Jian Pei, et al. Mining Multi-Dimensional Constrained Gradients in Data Cubes, 2001, VLDB.
[39] Gavriel Salomon, et al. Transfer of Learning, 1992.
[40] Larry S. Davis, et al. Towards Unified Data and Lifecycle Management for Deep Learning, 2016, 2017 IEEE 33rd International Conference on Data Engineering (ICDE).
[41] Christopher Ré, et al. Materialization optimizations for feature selection workloads, 2014, SIGMOD Conference.
[42] Ameet Talwalkar, et al. MLlib: Machine Learning in Apache Spark, 2015, J. Mach. Learn. Res.
[43] Peter Triantafillou, et al. Efficient Scalable Accurate Regression Queries in In-DBMS Analytics, 2017, 2017 IEEE 33rd International Conference on Data Engineering (ICDE).
[44] Manasi Vartak, et al. ModelDB: a system for machine learning model management, 2016, HILDA '16.
[45] Samuel Madden, et al. MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis, 2018, SIGMOD Conference.
[46] Jeffrey F. Naughton, et al. Learning Generalized Linear Models Over Normalized Data, 2015, SIGMOD Conference.
[47] Jeff M. Phillips, et al. Improved Coresets for Kernel Density Estimates, 2017, SODA.
[48] Kevin P. Murphy, et al. Machine learning - a probabilistic perspective, 2012, Adaptive computation and machine learning series.
[49] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.
[50] Sridhar Ramaswamy, et al. Join synopses for approximate query answering, 1999, SIGMOD '99.
[51] Jeffrey F. Naughton, et al. Model Selection Management Systems: The Next Frontier of Advanced Analytics, 2016, SIGMOD Rec.
[52] Sanjay Chawla, et al. A Cost-based Optimizer for Gradient Descent Optimization, 2017, SIGMOD Conference.
[53] Jeffrey F. Naughton, et al. To Join or Not to Join?: Thinking Twice about Joins before Feature Selection, 2016, SIGMOD Conference.
[54] Justus H. Piater, et al. Online Learning of Gaussian Mixture Models - a Two-Level Approach, 2008, VISAPP.