Data-centric Artificial Intelligence: A Survey
[1] Lei Zou,et al. Knowledge Graph Quality Management: A Comprehensive Survey , 2023, IEEE Transactions on Knowledge and Data Engineering.
[2] Guanchu Wang,et al. Weight Perturbation Can Help Fairness under Distribution Shift , 2023, ArXiv.
[3] Fan Yang,et al. CoRTX: Contrastive Framework for Real-time Explanation , 2023, ICLR.
[4] D. Zha,et al. Towards Personalized Preprocessing Pipeline Search , 2023, ArXiv.
[5] D. Zha,et al. Active Ensemble Learning for Knowledge Graph Error Detection , 2023, WSDM.
[6] D. Zha,et al. Fairly Predicting Graft Failure in Liver Transplant for Organ Assigning , 2023, AMIA.
[7] Philip S. Yu,et al. Weakly Supervised Anomaly Detection: A Survey , 2023, ArXiv.
[8] Christian Hammacher,et al. REIN: A Comprehensive Benchmark Framework for Data Cleaning Methods in ML Pipelines , 2023, EDBT.
[9] Fan Yang,et al. Efficient XAI Techniques: A Taxonomic Survey , 2023, ArXiv.
[10] Zaid Pervaiz Bhat,et al. Data-centric AI: Perspectives and Challenges , 2023, SDM.
[11] Rui Chen,et al. Bring Your Own View: Graph Neural Networks for Link Prediction with Personalized Subgraph Selection , 2022, WSDM.
[12] G. Satzger,et al. Data-centric Artificial Intelligence , 2022, ArXiv.
[13] Mengnan Du,et al. Mitigating Relational Bias on Knowledge Graphs , 2022, ArXiv.
[14] Shion Guha,et al. The Principles of Data-Centric AI (DCAI) , 2022, ArXiv.
[15] M. Schaar,et al. DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems , 2022, ArXiv.
[16] D. Zha,et al. RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations , 2022, ArXiv.
[17] Meghana Deodhar,et al. A human-ML collaboration framework for improving video content reviews , 2022, CIKM Workshops.
[18] A. Kejariwal,et al. DreamShard: Generalizable Embedding Table Placement for Recommender Systems , 2022, Neural Information Processing Systems.
[19] B. Ghosh,et al. A Feature Extraction & Selection Benchmark for Structural Health Monitoring , 2022, Structural Health Monitoring.
[20] D. Zha,et al. Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning , 2022, CIKM.
[21] Mengnan Du,et al. Towards Learning Disentangled Representations for Time Series , 2022, KDD.
[22] B. Schiele,et al. USB: A Unified Semi-supervised Learning Benchmark , 2022, NeurIPS.
[23] Yi-An Ma,et al. AutoShard: Automated Embedding Table Sharding for Recommender Systems , 2022, KDD.
[24] Margaret J. Warren,et al. DataPerf: Benchmarks for Data-Centric AI Development , 2022, ArXiv.
[25] Ethan Fetaya,et al. A Study on the Evaluation of Generative Models , 2022, ArXiv.
[26] Fan Yang,et al. Accelerating Shapley Explanation via Contributive Cooperator Selection , 2022, ICML.
[27] Gerard de Melo,et al. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models , 2022, ArXiv.
[28] Ksheera R Shetty,et al. Deep Learning for Computer Vision: A Brief Review , 2022, International Journal of Advanced Research in Science, Communication and Technology.
[29] A. Sowmya,et al. Blood-based transcriptomic signature panel identification for cancer diagnosis: Benchmarking of feature extraction methods , 2022, bioRxiv.
[30] Shafiq R. Joty,et al. Chart-to-Text: A Large-Scale Benchmark for Chart Summarization , 2022, ACL.
[31] Ryan J. Lowe,et al. Training language models to follow instructions with human feedback , 2022, NeurIPS.
[32] Hanghang Tong,et al. Data Augmentation for Deep Graph Learning , 2022, SIGKDD Explor..
[33] Ninghao Liu,et al. G-Mixup: Graph Data Augmentation for Graph Classification , 2022, ICML.
[34] Mehmet Gorkem Ulkar,et al. BED: A Real-Time Object Detection System for Edge Devices , 2022, CIKM.
[35] Alexander J. Ratner,et al. A Survey on Programmatic Weak Supervision , 2022, ArXiv.
[36] A. Mostafavi,et al. FMP: Toward Fair Graph Message Passing against Topology Bias , 2022, ArXiv.
[37] V. Metsis,et al. TTS-GAN: A Transformer-based Time-Series Generative Adversarial Network , 2022, AIME.
[38] Daochen Zha,et al. Towards Similarity-Aware Time-Series Classification , 2022, SDM.
[39] B. Ommer,et al. High-Resolution Image Synthesis with Latent Diffusion Models , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Michael Hay,et al. Benchmarking Differentially Private Synthetic Data Generation Algorithms , 2021, ArXiv.
[41] Steven Euijong Whang,et al. Data collection and quality challenges in deep learning: a data-centric AI perspective , 2021, The VLDB Journal.
[42] Neoklis Polyzotis,et al. What can Data-Centric AI Learn from Data and ML Engineering? , 2021, ArXiv.
[43] Cathy H. Wu,et al. A crowdsourcing open platform for literature curation in UniProt , 2021, PLoS biology.
[44] Lace M. K. Padilla,et al. The Science of Visual Data Communication: What Works , 2021, Psychological science in the public interest: a journal of the American Psychological Society.
[45] Lora Aroyo,et al. Data Excellence for AI: Why Should You Care , 2021, ArXiv.
[46] Ninghao Liu,et al. Modeling Techniques for Machine Learning Fairness: A Survey , 2021, ArXiv.
[47] Juliana Freire,et al. AlphaD3M: Machine Learning Pipeline Synthesis , 2021, ArXiv.
[48] L. Nanni,et al. Comparison of Different Image Data Augmentation Approaches , 2021, J. Imaging.
[49] Bin Cui,et al. Facilitating Database Tuning with Hyper-Parameter Optimization: A Comprehensive Experimental Evaluation , 2021, Proc. VLDB Endow..
[50] Vidya Setlur,et al. Snowy: Recommending Utterances for Conversational Visual Analysis , 2021, UIST.
[51] Leixian Shen,et al. Towards Natural Language Interfaces for Data Visualization: A Survey , 2021, IEEE Transactions on Visualization and Computer Graphics.
[52] J. Rahnenführer,et al. Benchmark of filter methods for feature selection in high-dimensional gene expression survival data , 2021, Briefings Bioinform..
[53] Leilani Battle,et al. An Evaluation-Focused Framework for Visualization Recommendation Algorithms , 2021, IEEE Transactions on Visualization and Computer Graphics.
[54] D. Zha,et al. Automated Anomaly Detection via Curiosity-Guided Search and Self-Imitation Learning , 2021, IEEE Transactions on Neural Networks and Learning Systems.
[55] Peng Cui,et al. Towards Out-Of-Distribution Generalization: A Survey , 2021, ArXiv.
[56] Moritz Hardt,et al. Retiring Adult: New Datasets for Fair Machine Learning , 2021, NeurIPS.
[57] Zaid Pervaiz Bhat,et al. AutoVideo: An Automated Video Action Recognition System , 2021, IJCAI.
[58] J. V. D. Heuvel,et al. CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms , 2021, NeurIPS Datasets and Benchmarks.
[59] Hiroaki Hayashi,et al. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing , 2021, ACM Comput. Surv..
[60] Zahed Siddique,et al. Effect of Data Scaling Methods on Machine Learning Algorithms and Model Performance , 2021, Technologies.
[61] Jenna Wiens,et al. Mind the Performance Gap: Examining Dataset Shift During Prospective Validation , 2021, MLHC.
[62] Oriol Vinyals,et al. Highly accurate protein structure prediction with AlphaFold , 2021, Nature.
[63] Felix Bießmann,et al. A Benchmark for Data Imputation Methods , 2021, Frontiers in Big Data.
[64] Xia Hu,et al. Dirichlet Energy Constrained Learning for Deep Graph Neural Networks , 2021, NeurIPS.
[65] Diederik P. Kingma,et al. Variational Diffusion Models , 2021, ArXiv.
[66] Taghi M. Khoshgoftaar,et al. Text Data Augmentation for Deep Learning , 2021, Journal of Big Data.
[67] Weizhe Yuan,et al. BARTScore: Evaluating Generated Text as Text Generation , 2021, NeurIPS.
[68] J. Dowling,et al. A review of medical image data augmentation techniques for deep learning applications , 2021, Journal of medical imaging and radiation oncology.
[69] Xiangru Lian,et al. DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning , 2021, ICML.
[70] Xia Hu,et al. Simplifying Deep Reinforcement Learning via Self-Supervision , 2021, ArXiv.
[71] Matthias Boehm,et al. SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging , 2021, SIGMOD Conference.
[72] Hongfu Liu,et al. Fairness-Aware Unsupervised Feature Selection , 2021, CIKM.
[73] Dawn Song,et al. Scalability vs. Utility: Do We Have to Sacrifice One for the Other in Data Importance Quantification? , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[74] Pierre Blanchart,et al. An exact counterfactual-example-based approach to tree-ensemble models interpretability , 2021, ArXiv.
[75] David J. Fleet,et al. Cascaded Diffusion Models for High Fidelity Image Generation , 2021, J. Mach. Learn. Res..
[76] Eduard Hovy,et al. A Survey of Data Augmentation Approaches for NLP , 2021, FINDINGS.
[77] Praveen K. Paritosh,et al. “Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI , 2021, CHI.
[78] Kang Min Yoo,et al. GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation , 2021, EMNLP.
[79] Haifeng Jin,et al. AutoOD: Neural Architecture Search for Outlier Detection , 2021, 2021 IEEE 37th International Conference on Data Engineering (ICDE).
[80] A. Globerson,et al. BERTese: Learning to Speak to BERT , 2021, EACL.
[81] Miguel Á. Carreira-Perpiñán,et al. Counterfactual Explanations for Oblique Decision Trees: Exact, Efficient Algorithms , 2021, AAAI.
[82] Hao Guan,et al. Domain Adaptation for Medical Image Analysis: A Survey , 2021, IEEE Transactions on Biomedical Engineering.
[83] Chuizheng Meng,et al. MIMIC-IF: Interpretability and Fairness Evaluation of Deep Learning Models on MIMIC-IV Dataset , 2021, ArXiv.
[84] Xia Hu,et al. Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments , 2021, ICLR.
[85] Danqi Chen,et al. Making Pre-trained Language Models Better Few-shot Learners , 2021, ACL.
[86] Hinrich Schütze,et al. Few-Shot Text Generation with Pattern-Exploiting Training , 2020, ArXiv.
[87] Pang Wei Koh,et al. WILDS: A Benchmark of in-the-Wild Distribution Shifts , 2020, ICML.
[88] Sivan Sabato,et al. Active Feature Selection for the Mutual Information Criterion , 2020, AAAI.
[89] Eric Xing,et al. Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling , 2020, ICLR.
[90] Christopher Ré,et al. No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems , 2020, NeurIPS.
[91] Wei Cao,et al. MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler , 2020, NeurIPS.
[92] Hamid R. Arabnia,et al. A Brief Review of Domain Adaptation , 2020, Advances in Data Science and Information Engineering.
[93] Yanwen Chong,et al. Graph-based semi-supervised learning: A review , 2020, Neurocomputing.
[94] Mucahid Kutlu,et al. Annotator Rationales for Labeling Tasks in Crowdsourcing , 2020, J. Artif. Intell. Res..
[95] Diego Martinez,et al. TODS: An Automated Time Series Outlier Detection System , 2020, AAAI.
[96] Xia Hu,et al. Meta-AAD: Active Anomaly Detection with Deep Reinforcement Learning , 2020, 2020 IEEE International Conference on Data Mining (ICDM).
[97] Hinrich Schütze,et al. It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners , 2020, NAACL.
[98] Yanjun Qi,et al. Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples , 2020, BLACKBOXNLP.
[99] Zhihui Li,et al. A Survey of Deep Active Learning , 2020, ACM Comput. Surv..
[100] Brian Kenji Iwana,et al. An empirical survey of data augmentation for time series classification with neural networks , 2020, PloS one.
[101] Xia Hu,et al. RLCard: A Platform for Reinforcement Learning in Card Games , 2020, IJCAI.
[102] Qingquan Song,et al. Multi-Channel Graph Neural Networks , 2020, IJCAI.
[103] Sameep Mehta,et al. Overview and Importance of Data Quality for Machine Learning Tasks , 2020, KDD.
[104] Hiroki Arimura,et al. DACE: Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear Optimization , 2020, IJCAI.
[105] Xia Hu,et al. Policy-GNN: Aggregation Optimization for Graph Neural Networks , 2020, KDD.
[106] Alexander van Renen,et al. Benchmarking learned indexes , 2020, Proc. VLDB Endow..
[107] Pieter Abbeel,et al. Denoising Diffusion Probabilistic Models , 2020, NeurIPS.
[108] Xiao Huang,et al. Towards Deeper Graph Neural Networks with Differentiable Group Normalization , 2020, NeurIPS.
[109] Quoc V. Le,et al. Rethinking Pre-training and Self-training , 2020, NeurIPS.
[110] Marta Indulska,et al. Building Data Curation Processes with Crowd Intelligence , 2020, CAiSE Forum.
[111] Marcin Blachnik,et al. Comparison of Instance Selection and Construction Methods with Various Classifiers , 2020, Applied Sciences.
[112] Hang Su,et al. Benchmarking Adversarial Robustness on Image Classification , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[113] Xia Hu,et al. Dual Policy Distillation , 2020, IJCAI.
[114] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[115] Sainyam Galhotra,et al. Adaptive Rule Discovery for Labeling Text Data , 2020, SIGMOD Conference.
[116] Xiao Zhang,et al. Active Incremental Feature Selection Using a Fuzzy-Rough-Set-Based Information Entropy , 2020, IEEE Transactions on Fuzzy Systems.
[117] Diyi Yang,et al. MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification , 2020, ACL.
[118] Bernd Bischl,et al. Multi-Objective Counterfactual Explanations , 2020, PPSN.
[119] Norman W. Paton,et al. Dataset Discovery in Data Lakes , 2020, 2020 IEEE 36th International Conference on Data Engineering (ICDE).
[120] Kyung-Ah Sohn,et al. Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a New Strategy , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[121] Le Gruenwald,et al. Online Index Selection Using Deep Reinforcement Learning for a Cluster Database , 2020, 2020 IEEE 36th International Conference on Data Engineering Workshops (ICDEW).
[122] Prithviraj Sen,et al. A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching , 2020, SIGMOD Conference.
[123] Xiaomin Song,et al. Time Series Data Augmentation for Deep Learning: A Survey , 2020, IJCAI.
[124] James Zou,et al. A Distributional Framework for Data Valuation , 2020, ICML.
[125] Bernhard Schölkopf,et al. Algorithmic Recourse: from Counterfactual Explanations to Interventions , 2020, FAccT.
[126] Ahmet Murat Ozbayoglu,et al. Deep Learning for Financial Applications: A Survey , 2020, Appl. Soft Comput..
[127] Timo Schick,et al. Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference , 2020, EACL.
[128] Patrícia J. Bota,et al. TSFEL: Time Series Feature Extraction Library , 2020, SoftwareX.
[129] Frank F. Xu,et al. How Can We Know What Language Models Know? , 2019, Transactions of the Association for Computational Linguistics.
[130] M. de Rijke,et al. FOCUS: Flexible Optimizable Counterfactual Explanations for Tree Ensembles , 2019, AAAI.
[131] Lior Rokach,et al. DeepLine: AutoML Tool for Pipelines Generation using Deep Reinforcement Learning and Hierarchical Actions Filtering , 2019, KDD.
[132] Daochen Zha,et al. PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning , 2019, WWW.
[133] Ekaba Bisong,et al. Building Machine Learning and Deep Learning Models on Google Cloud Platform: A Comprehensive Guide for Beginners , 2019 .
[134] Andreas Kerren,et al. Toward a Quantitative Survey of Dimension Reduction Techniques , 2019, IEEE Transactions on Visualization and Computer Graphics.
[135] Peter A. Flach,et al. FACE: Feasible and Actionable Counterfactual Explanations , 2019, AIES.
[136] Kristina Lerman,et al. A Survey on Bias and Fairness in Machine Learning , 2019, ACM Comput. Surv..
[137] Sameer Singh,et al. Universal Adversarial Triggers for Attacking and Analyzing NLP , 2019, EMNLP.
[138] Denis Gracanin,et al. A Comparison of Radial and Linear Charts for Visualizing Daily Patterns , 2019, IEEE Transactions on Visualization and Computer Graphics.
[139] Taghi M. Khoshgoftaar,et al. A survey on Image Data Augmentation for Deep Learning , 2019, Journal of Big Data.
[140] Daochen Zha,et al. Experience Replay Optimization , 2019, IJCAI.
[141] Matias Barenstein,et al. ProPublica's COMPAS Data Revisited , 2019, ArXiv.
[142] B. Recht,et al. Do Image Classifiers Generalize Across Time? , 2019, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[143] Guoliang Li,et al. An End-to-End Learning-based Cost Estimator , 2019, Proc. VLDB Endow..
[144] Amit Dhurandhar,et al. Model Agnostic Contrastive Explanations for Structured Data , 2019, ArXiv.
[145] Joydeep Ghosh,et al. CERTIFAI: Counterfactual Explanations for Robustness, Transparency, Interpretability, and Fairness of Artificial Intelligence models , 2019, ArXiv.
[146] Marco F. Huber,et al. Benchmark and Survey of Automated Machine Learning Frameworks , 2019, J. Artif. Intell. Res..
[147] Sanjay Krishnan,et al. AlphaClean: Automatic Generation of Data Cleaning Pipelines , 2019, ArXiv.
[148] Quoc V. Le,et al. Using Videos to Evaluate Image Model Robustness , 2019, ArXiv.
[149] Yue Zhang,et al. CleanML: A Benchmark for Joint Data Cleaning and Machine Learning [Experiments and Analysis] , 2019, ArXiv.
[150] James Y. Zou,et al. Data Shapley: Equitable Valuation of Data for Machine Learning , 2019, ICML.
[151] Kamyar Azizzadenesheli,et al. Regularized Learning for Domain Adaptation under Label Shifts , 2019, ICLR.
[152] Ayodeji Olalekan Salau,et al. Feature Extraction: A Survey of the Types, Techniques, Applications , 2019, 2019 International Conference on Signal Processing and Communication (ICSC).
[153] Antonio Carlos de Francisco,et al. Data Mining and Machine Learning to Promote Smart Cities: A Systematic Review from 2000 to 2018 , 2019, Sustainability.
[154] Xue Ying,et al. An Overview of Overfitting and its Solutions , 2019, Journal of Physics: Conference Series.
[155] H. V. Jagadish,et al. Bridging the Semantic Gap with SQL Query Logs in Natural Language Interfaces to Databases , 2019, 2019 IEEE 35th International Conference on Data Engineering (ICDE).
[156] Zhiyuan Liu,et al. Graph Neural Networks: A Review of Methods and Applications , 2018, AI Open.
[157] Fei Wang,et al. Deep learning for healthcare: review, opportunities and challenges , 2018, Briefings Bioinform..
[158] Felix Bießmann,et al. Automating Large-Scale Data Quality Verification , 2018, Proc. VLDB Endow..
[159] Tim Kraska,et al. Slice Finder: Automated Data Slicing for Model Validation , 2018, 2019 IEEE 35th International Conference on Data Engineering (ICDE).
[160] Thomas G. Dietterich,et al. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations , 2018, ICLR.
[161] Júlio C. Nievola,et al. An Adaptive Approach for Index Tuning with Learning Classifier Systems on Hybrid Storage Environments , 2018, HAIS.
[162] Marie-Jeanne Lesot,et al. Comparison-Based Inverse Classification for Interpretability in Machine Learning , 2018, IPMU.
[163] Atul Prakash,et al. Robust Physical-World Attacks on Deep Learning Visual Classification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[164] James Y. Zou,et al. Multiaccuracy: Black-Box Post-Processing for Fairness in Classification , 2018, AIES.
[165] Quoc V. Le,et al. AutoAugment: Learning Augmentation Policies from Data , 2018, ArXiv.
[166] Han Zhang,et al. Self-Attention Generative Adversarial Networks , 2018, ICML.
[167] Munther A. Dahleh,et al. A Marketplace for Data: An Algorithmic Solution , 2018, EC.
[168] Michael Stonebraker,et al. Aurum: A Data Discovery System , 2018, 2018 IEEE 34th International Conference on Data Engineering (ICDE).
[169] Tudor Dumitras,et al. Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks , 2018, NeurIPS.
[170] Guoliang Li,et al. DeepEye: Towards Automatic Data Visualization , 2018, 2018 IEEE 34th International Conference on Data Engineering (ICDE).
[171] Renée J. Miller,et al. Table Union Search on Open Data , 2018, Proc. VLDB Endow..
[172] Sebastian Link,et al. Data Quality: The Role of Empiricism , 2018, SGMD.
[173] Alexander J. Smola,et al. Detecting and Correcting for Label Shift with Black Box Predictors , 2018, ICML.
[174] Nikolaos Doulamis,et al. Deep Learning for Computer Vision: A Brief Review , 2018, Comput. Intell. Neurosci..
[175] Sarah Webb. Deep learning for biology , 2018, Nature.
[176] Timnit Gebru,et al. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification , 2018, FAT.
[177] Hayit Greenspan,et al. Synthetic data augmentation using GAN for improved liver lesion classification , 2018, 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018).
[178] D. Zha,et al. Multi-label dataless text classification with topic modeling , 2017, Knowledge and Information Systems.
[179] Christopher Ré,et al. Snorkel: Rapid Training Data Creation with Weak Supervision , 2017, Proc. VLDB Endow..
[180] Georges G. Grinstein,et al. Benchmark Development for the Evaluation of Visualization for Data Mining , 2017 .
[181] Hongyi Zhang,et al. mixup: Beyond Empirical Risk Minimization , 2017, ICLR.
[182] Chris Russell,et al. Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR , 2017, ArXiv.
[183] Cherukuri Aswani Kumar,et al. Intrusion detection model using fusion of chi-square feature selection and multi class SVM , 2017, J. King Saud Univ. Comput. Inf. Sci..
[184] Deepak S. Turaga,et al. Feature Engineering for Predictive Modeling using Reinforcement Learning , 2017, AAAI.
[185] Lucila Ohno-Machado,et al. A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge , 2017, Database J. Biol. Databases Curation.
[186] Jinfeng Yi,et al. ZOO: Zeroth Order Optimization Based Black-box Attacks to Deep Neural Networks without Training Substitute Models , 2017, AISec@CCS.
[187] Xin Zhang,et al. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform , 2017, KDD.
[188] Yu Zhang,et al. Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[189] Aleksander Madry,et al. Towards Deep Learning Models Resistant to Adversarial Attacks , 2017, ICLR.
[190] Shane Legg,et al. Deep Reinforcement Learning from Human Preferences , 2017, NIPS.
[191] Geoffrey J. Gordon,et al. Automatic Database Management System Tuning Through Large-scale Machine Learning , 2017, SIGMOD Conference.
[192] Fisher Yu,et al. Scribbler: Controlling Deep Image Synthesis with Sketch and Color , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[193] Maria Jesus Martin,et al. Uniclust databases of clustered and deeply annotated protein sequences and alignments , 2016, Nucleic Acids Res..
[194] Marco Loog,et al. A benchmark and comparison of active learning for logistic regression , 2016, Pattern Recognit..
[195] Tim Oates,et al. Time series classification from scratch with deep neural networks: A strong baseline , 2016, 2017 International Joint Conference on Neural Networks (IJCNN).
[196] Zhongheng Zhang,et al. Missing data imputation: focusing on single imputation , 2016, Annals of translational medicine.
[197] Hua Ouyang,et al. Learning to Rewrite Queries , 2016, CIKM.
[198] Samy Bengio,et al. Adversarial examples in the physical world , 2016, ICLR.
[199] Jeffrey F. Naughton,et al. To Join or Not to Join?: Thinking Twice about Joins before Feature Selection , 2016, SIGMOD Conference.
[200] Christopher De Sa,et al. Data Programming: Creating Large Training Sets, Quickly , 2016, NIPS.
[201] Ananthram Swami,et al. Practical Black-Box Attacks against Machine Learning , 2016, AsiaCCS.
[202] Kanit Wongsuphasawat,et al. Voyager: Exploratory Analysis via Faceted Browsing of Visualization Recommendations , 2016, IEEE Transactions on Visualization and Computer Graphics.
[203] Aaron Klein,et al. Efficient and Robust Automated Machine Learning , 2015, NIPS.
[204] Seyed-Mohsen Moosavi-Dezfooli,et al. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[205] Xiang Zhang,et al. Character-level Convolutional Networks for Text Classification , 2015, NIPS.
[206] Taghi M. Khoshgoftaar,et al. Using Random Undersampling to Alleviate Class Imbalance on Tweet Sentiment Data , 2015, 2015 IEEE International Conference on Information Reuse and Integration.
[207] Sanja Fidler,et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[208] David Zhang,et al. Feature selection and analysis on correlated gas sensor data with recursive feature elimination , 2015 .
[209] Carsten Binnig,et al. RODI: A Benchmark for Automatic Mapping Generation in Relational-to-Ontology Data Integration , 2015, ESWC.
[210] Huan Liu,et al. Embedded Unsupervised Feature Selection , 2015, AAAI.
[211] Felix Naumann,et al. Estimating the Number and Sizes of Fuzzy-Duplicate Clusters , 2014, CIKM.
[212] Aditya G. Parameswaran,et al. DataHub: Collaborative Data Science & Dataset Version Management at Scale , 2014, CIDR.
[213] Tilmann Rabl,et al. TPC-DI: The First Industry Benchmark for Data Integration , 2014, Proc. VLDB Endow..
[214] Zahir Tari,et al. A Survey of Clustering Algorithms for Big Data: Taxonomy and Empirical Analysis , 2014, IEEE Transactions on Emerging Topics in Computing.
[215] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[216] Hanspeter Pfister,et al. What Makes a Visualization Memorable? , 2013, IEEE Transactions on Visualization and Computer Graphics.
[217] Fabio Roli,et al. Evasion Attacks against Machine Learning at Test Time , 2013, ECML/PKDD.
[218] Paolo Papotti,et al. Discovering Denial Constraints , 2013, Proc. VLDB Endow..
[219] Raúl A. Santelices,et al. Quantitative program slicing: Separating statements by relevance , 2013, 2013 35th International Conference on Software Engineering (ICSE).
[220] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[221] Tim Kraska,et al. CrowdER: Crowdsourcing Entity Resolution , 2012, Proc. VLDB Endow..
[222] Kwong-Sak Leung,et al. A Survey of Crowdsourcing Systems , 2011, 2011 IEEE Third Int'l Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third Int'l Conference on Social Computing.
[223] S. Sudarshan,et al. DBridge: A program rewrite tool for set-oriented query execution , 2011, 2011 IEEE 27th International Conference on Data Engineering.
[224] Trevor Darrell,et al. Adapting Visual Category Models to New Domains , 2010, ECCV.
[225] Zheng Shao,et al. Data warehousing and analytics infrastructure at facebook , 2010, SIGMOD Conference.
[226] Daniel Jurafsky,et al. Distant supervision for relation extraction without labeled data , 2009, ACL.
[227] Shivnath Babu,et al. Tuning Database Configuration Parameters with iTuned , 2009, Proc. VLDB Endow..
[228] Carlo Batini,et al. Methodologies for data quality assessment and improvement , 2009, CSUR.
[229] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[230] Ohad Shamir,et al. Vox Populi: Collecting High-Quality Labels from a Crowd , 2009, COLT.
[231] Tom White,et al. Hadoop: The Definitive Guide , 2009 .
[232] Karsten M. Borgwardt,et al. Covariate Shift by Kernel Mean Matching , 2009, NIPS 2009.
[233] Praveen Paritosh,et al. Freebase: a collaboratively created graph database for structuring human knowledge , 2008, SIGMOD Conference.
[234] Haibo He,et al. ADASYN: Adaptive synthetic sampling approach for imbalanced learning , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).
[235] Klaus-Robert Müller,et al. Covariate Shift Adaptation by Importance Weighted Cross Validation , 2007, J. Mach. Learn. Res..
[236] Wenfei Fan,et al. Conditional Functional Dependencies for Data Cleaning , 2007, 2007 IEEE 23rd International Conference on Data Engineering.
[237] Hans-Peter Kriegel,et al. Integrating structured biological data by Kernel Maximum Mean Discrepancy , 2006, ISMB.
[238] Yan Zhou,et al. Democratic co-learning , 2004, 16th IEEE International Conference on Tools with Artificial Intelligence.
[239] Marek Grochowski,et al. Comparison of Instance Selection Algorithms II. Results and Comments , 2004, ICAISC.
[240] Miguel Toro,et al. Finding representative patterns with ordered projections , 2003, Pattern Recognit..
[241] Maurizio Lenzerini,et al. Data integration: a theoretical perspective , 2002, PODS.
[242] Richard Y. Wang,et al. Data quality assessment , 2002, CACM.
[243] Gilbert Saporta,et al. Data fusion and data grafting , 2002 .
[244] Taizo Shirai,et al. Data discovery system , 2001 .
[245] Daniel C. Zilio,et al. DB2 advisor: an optimizer smart enough to recommend its own indexes , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).
[246] Jim Gray,et al. Microsoft TerraServer: a spatial data warehouse , 1999, SIGMOD '00.
[247] H. Zeng,et al. Stratal slicing, Part II: Real 3-D seismic data , 1998 .
[248] Surajit Chaudhuri,et al. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server , 1997, VLDB.
[249] Robert P. Goldman,et al. Imputation of Missing Data Using Machine Learning Techniques , 1996, KDD.
[250] David A. Cohn,et al. Active Learning with Statistical Models , 1996, NIPS.
[251] Robert W. Blanning,et al. Discovering implicit integrity constraints in rule bases using metagraphs , 1995, Proceedings of the Twenty-Eighth Annual Hawaii International Conference on System Sciences.
[252] Jonathan D. Cryer,et al. Time Series Analysis , 1986 .
[253] Barbara J. Grosz,et al. Natural-Language Processing , 1982, Artif. Intell..
[254] Robert H. Riffenburgh,et al. Linear Discriminant Analysis , 1960 .
[255] Matthias Hirth,et al. Human-AI Collaboration for Improving the Identification of Cars for Autonomous Driving , 2022, CIKM Workshops.
[256] Rui Chen,et al. An Information Fusion Approach to Learning with Instance-Dependent Label Noise , 2022, ICLR.
[257] Fan Yang,et al. Generalized Demographic Parity for Group Fairness , 2022, ICLR.
[258] Yue Zhao,et al. Revisiting Time Series Outlier Detection: Definitions and Benchmarks , 2021, NeurIPS Datasets and Benchmarks.
[259] Peter Kellman,et al. Cut out the annotator, keep the cutout: better segmentation with weak supervision , 2021, ICLR.
[260] Xuanhe Zhou,et al. DBMind: A Self-Driving Platform in openGauss , 2021, Proc. VLDB Endow..
[261] Reynold Xin,et al. Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics , 2021, CIDR.
[262] Nadia Burkart,et al. A Step Towards Global Counterfactual Explanations: Approximating the Feature Space Through Hierarchical Division and Graph Search , 2021, Adv. Artif. Intell. Mach. Learn..
[263] M. Krasnyanskiy,et al. Quality Assessment Method for GAN Based on Modified Metrics Inception Score and Fréchet Inception Distance , 2020 .
[264] AnHai Doan,et al. Data Curation with Deep Learning , 2020, EDBT.
[265] Mitar Milutinovic. On Evaluation of AutoML Systems , 2020 .
[266] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[267] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[268] Ekaba Bisong,et al. Introduction to Scikit-learn , 2019, Building Machine Learning and Deep Learning Models on Google Cloud Platform.
[269] Michael Stonebraker,et al. Data Integration: The Current Status and the Way Forward , 2018, IEEE Data Eng. Bull..
[270] Alec Radford,et al. Improving Language Understanding by Generative Pre-Training , 2018 .
[271] Yi Tay,et al. Deep Learning based Recommender System: A Survey and New Perspectives , 2017, ArXiv.
[272] Paolo Papotti,et al. Benchmarking Data Curation Systems , 2016, IEEE Data Eng. Bull..
[273] M. Zaharia,et al. Apache Spark: a unified engine for big data processing , 2016, Commun. ACM.
[274] Antony Selvadoss Thanamani,et al. Feature Selection Based on Information Gain , 2013 .
[275] Michael Stonebraker,et al. Data Curation at Scale: The Data Tamer System , 2013, CIDR.
[276] Oliver J. Sutton,et al. Introduction to k Nearest Neighbour Classification and Condensed Nearest Neighbour Data Reduction , 2012 .
[277] Matthew Lease,et al. Semi-Supervised Consensus Labeling for Crowdsourcing , 2011 .
[278] Luc Desnoyers,et al. Toward a Taxonomy of Visuals in Science Communication , 2011 .
[279] Liang Dong,et al. Starfish: A Self-tuning System for Big Data Analytics , 2011, CIDR.
[280] Heng Tao Shen,et al. Principal Component Analysis , 2009, Encyclopedia of Biometrics.
[281] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..
[282] Nils J. Nilsson,et al. Artificial Intelligence , 1974, IFIP Congress.