Benchmarking, Measuring, and Optimizing: Second BenchCouncil International Symposium, Bench 2019, Denver, CO, USA, November 14–16, 2019, Revised Selected Papers
Elisa Bertino | Wanling Gao | Jianfeng Zhan | Xiaoyi Lu | Dan Stanzione | Geoffrey Fox
[1] Wanling Gao,et al. Data motifs: a lens towards fully understanding big data and AI workloads , 2018, PACT.
[2] Tao Wang,et al. Deep learning with COTS HPC systems , 2013, ICML.
[3] Katerina J. Argyraki,et al. How to Measure the Killer Microsecond , 2017, CCRV.
[4] Ameet Talwalkar,et al. MLlib: Machine Learning in Apache Spark , 2015, J. Mach. Learn. Res..
[5] Alexandru Iosup,et al. An Empirical Performance Evaluation of GPU-Enabled Graph-Processing Systems , 2015, 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[6] Rajeev Dehejia,et al. Propensity Score-Matching Methods for Nonexperimental Causal Studies , 2002, Review of Economics and Statistics.
[7] Osamu Watanabe,et al. Developing Efficient Implementations of Bellman-Ford and Forward-Backward Graph Algorithms for NEC SX-ACE , 2018, Supercomput. Front. Innov..
[8] Ruocheng Guo,et al. Learning Individual Treatment Effects from Networked Observational Data , 2019, IJCAI.
[9] Endong Wang,et al. Intel Math Kernel Library , 2014 .
[10] Lloyd N. Trefethen,et al. Fourth-Order Time-Stepping for Stiff PDEs , 2005, SIAM J. Sci. Comput..
[11] Guangli Li,et al. XDN: Towards Efficient Inference of Residual Neural Networks on Cambricon Chips , 2019, Bench.
[12] Gerhard Wellein,et al. LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments , 2010, 2010 39th International Conference on Parallel Processing Workshops.
[13] Randy H. Katz,et al. Heterogeneity and dynamicity of clouds at scale: Google trace analysis , 2012, SoCC '12.
[14] Zhuowen Tu,et al. Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Holger Karl,et al. DCT2Gen: A traffic generator for data centers , 2016, Comput. Commun..
[16] Andreas Hellander,et al. HarmonicIO: Scalable Data Stream Processing for Scientific Datasets , 2018, 2018 IEEE 11th International Conference on Cloud Computing (CLOUD).
[17] Michael A. Bender,et al. File Systems Fated for Senescence? Nonsense, Says Science! , 2017, FAST.
[18] Amin Vahdat,et al. Carousel: Scalable Traffic Shaping at End Hosts , 2017, SIGCOMM.
[19] Fan Zhang,et al. AIoT Bench: Towards Comprehensive Benchmarking Mobile and Embedded Device Intelligence , 2018, Bench.
[20] Eero Vainikko,et al. Petascale solvers for anisotropic PDEs in atmospheric modelling on GPU clusters , 2015, Parallel Comput..
[21] Kejiang Ye,et al. Imbalance in the cloud: An analysis on Alibaba cluster trace , 2017, 2017 IEEE International Conference on Big Data (Big Data).
[22] Arne-Jørgen Berre,et al. Evidence Based Big Data Benchmarking to Improve Business Performance , 2018 .
[23] Srihari Cadambi,et al. A dynamically configurable coprocessor for convolutional neural networks , 2010, ISCA.
[24] Samuel Williams,et al. An auto-tuning framework for parallel multicore stencil computations , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[25] Winfried Auzinger,et al. Practical splitting methods for the adaptive integration of nonlinear evolution equations. Part II: Comparisons of local error estimation and step-selection strategies for nonlinear Schrödinger and wave equations , 2019, Comput. Phys. Commun..
[26] Lovisa Lugnegård. Building a high throughput microscope simulator using the Apache Kafka streaming framework , 2018 .
[27] Adrian Schüpbach,et al. The multikernel: a new OS architecture for scalable multicore systems , 2009, SOSP '09.
[28] Yiying Tong,et al. FaceWarehouse: A 3D Facial Expression Database for Visual Computing , 2014, IEEE Transactions on Visualization and Computer Graphics.
[29] Yuchen Zhang,et al. HPC AI500: A Benchmark Suite for HPC AI Systems , 2018, Bench.
[30] Greg Linden,et al. Amazon.com Recommendations: Item-to-Item Collaborative Filtering , 2001 .
[31] George Karypis,et al. Item-based top-N recommendation algorithms , 2004, TOIS.
[32] Benson K. Muite,et al. A comparison of CPU and GPU performance for Fourier pseudospectral simulations of the Navier-Stokes, Cubic Nonlinear Schrodinger and Sine Gordon Equations , 2012 .
[33] R. Wollman,et al. High throughput microscopy: from raw images to discoveries , 2007, Journal of Cell Science.
[34] Dusan Markovic,et al. Benchmarking performance and energy efficiency of microprocessors for wireless sensor network applications , 2012, 2012 Proceedings of the 35th International Convention MIPRO.
[35] Archana Ganapathi,et al. The Case for Evaluating MapReduce Performance Using Workload Suites , 2011, 2011 IEEE 19th Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems.
[36] Hans De Sterck,et al. Algorithmic Acceleration of Parallel ALS for Collaborative Filtering: Speeding up Distributed Big Data Recommendation in Spark , 2015, 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS).
[37] Abhinandan Das,et al. Google news personalization: scalable online collaborative filtering , 2007, WWW '07.
[38] Chao Li,et al. Fuxi: a Fault-Tolerant Resource Management and Job Scheduling System at Internet Scale , 2014, Proc. VLDB Endow..
[39] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[41] Steven G. Johnson,et al. FFTW: an adaptive software architecture for the FFT , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[43] Joseph Gonzalez,et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs , 2012, OSDI.
[44] Tilmann Rabl,et al. Big Data Benchmark Compendium , 2015, TPCTC.
[45] Vadim D. Levchenko,et al. Performance Limits Study of Stencil Codes on Modern GPGPUs , 2019, Supercomput. Front. Innov..
[46] Stephen Bonner,et al. Causal embeddings for recommendation , 2017, RecSys.
[47] Ross B. Girshick,et al. Fast R-CNN , 2015, arXiv:1504.08083.
[48] David Flynn,et al. DFS: A file system for virtualized flash storage , 2010, TOS.
[49] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[50] Wenguang Chen,et al. Gemini: A Computation-Centric Distributed Graph Processing System , 2016, OSDI.
[51] Jack J. Dongarra,et al. The LINPACK Benchmark: past, present and future , 2003, Concurr. Comput. Pract. Exp..
[52] Liu Bingbing. CloudBM: a Benchmark for Cloud Data Management Systems , 2012 .
[53] Chen Yang,et al. AstroServ: Distributed Database for Serving Large-Scale Full Life-Cycle Astronomical Data , 2018, BigSDM.
[54] Luca Benini,et al. GAP-8: A RISC-V SoC for AI at the Edge of the IoT , 2018, 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[55] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[56] Franck Cappello,et al. Failure prediction for HPC systems and applications , 2013, Int. J. High Perform. Comput. Appl..
[57] Ramesh Radhakrishnan,et al. Demystifying the MLPerf Benchmark Suite , 2019, ArXiv.
[58] Yann LeCun,et al. Pedestrian Detection with Unsupervised Multi-stage Feature Learning , 2012, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[59] Mihaela van der Schaar,et al. GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets , 2018, ICLR.
[60] Christopher Torng,et al. The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips , 2018, IEEE Micro.
[61] Dennis M. Wilkinson,et al. Large-Scale Parallel Collaborative Filtering for the Netflix Prize , 2008, AAIM.
[62] Chuan Wu,et al. Deep Learning-based Job Placement in Distributed Machine Learning Clusters , 2019, IEEE INFOCOM 2019 - IEEE Conference on Computer Communications.
[63] Minghe Yu,et al. AIBench: An Industry Standard Internet Service AI Benchmark Suite , 2019, ArXiv.
[64] Michael J. Franklin,et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.
[65] Ian T. Foster,et al. Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing , 2001, 2001 Eighteenth IEEE Symposium on Mass Storage Systems and Technologies.
[66] Maosen Chen,et al. An Efficient Implementation of the ALS-WR Algorithm on x86 CPUs , 2019, Bench.
[67] Ruocheng Guo,et al. Causal Learning in Question Quality Improvement , 2019, Bench.
[68] Bernhard Schölkopf,et al. Distinguishing Cause from Effect Using Observational Data: Methods and Benchmarks , 2014, J. Mach. Learn. Res..
[69] Frederico Pratas,et al. Cache-aware Roofline model: Upgrading the loft , 2014, IEEE Computer Architecture Letters.
[70] Nathan R. Tallent,et al. HPCTOOLKIT: tools for performance analysis of optimized parallel programs , 2010, Concurr. Comput. Pract. Exp..
[71] Aaron Halfaker,et al. Identifying Semantic Edit Intentions from Revisions in Wikipedia , 2017, EMNLP.
[72] B. Scheers,et al. Column Store for GWAC: A High-cadence, High-density, Large-scale Astronomical Light Curve Pipeline and Distributed Shared-nothing Database , 2016 .
[73] Jeffrey A. Smith,et al. Does Matching Overcome Lalonde's Critique of Nonexperimental Estimators? , 2000 .
[74] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[75] Ross B. Girshick,et al. Mask R-CNN , 2017, arXiv:1703.06870.
[76] Fan Zhang,et al. AIBench: Towards Scalable and Comprehensive Datacenter AI Benchmarking , 2018, Bench.
[77] Li Zhang,et al. GPU-accelerated Large-Scale Non-negative Matrix Factorization Using Spark , 2018, CollaborateCom.
[78] Wanling Gao,et al. DCMIX: Generating Mixed Workloads for the Cloud Data Center , 2018, Bench.
[79] Rajeev Balasubramonian,et al. Managing DRAM Latency Divergence in Irregular GPGPU Applications , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[80] Tianshu Hao,et al. The Implementation and Optimization of Matrix Decomposition Based Collaborative Filtering Task on X86 Platform , 2019, Bench.
[81] Xiao Wang,et al. AutoFFT: a template-based FFT codes auto-generation framework for ARM and X86 CPUs , 2019, SC.
[82] Shiguang Shan,et al. Improving 2D Face Recognition via Discriminative Face Depth Estimation , 2018, 2018 International Conference on Biometrics (ICB).
[83] Jie Huang,et al. The HiBench benchmark suite: Characterization of the MapReduce-based data analysis , 2010, 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010).
[84] Thorsten Kurth,et al. Hierarchical Roofline analysis for GPUs: Accelerating performance optimization for the NERSC‐9 Perlmutter system , 2020, Concurr. Comput. Pract. Exp..
[85] Hari Sundar,et al. FFT, FMM, or Multigrid? A comparative Study of State-Of-the-Art Poisson Solvers for Uniform and Nonuniform Grids in the Unit Cube , 2014, SIAM J. Sci. Comput..
[86] Herodotos Herodotou,et al. MapReduce programming and cost-based optimization? , 2011, Proc. VLDB Endow..
[87] Jennifer L. Hill,et al. Bayesian Nonparametric Modeling for Causal Inference , 2011 .
[88] F. Krogh,et al. Solving Ordinary Differential Equations , 2019, Programming for Computations - Python.
[89] Jay Kreps,et al. Kafka: a Distributed Messaging System for Log Processing , 2011 .
[90] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[91] Zhengming Ding,et al. Latent Tensor Transfer Learning for RGB-D Action Recognition , 2014, ACM Multimedia.
[92] Johan Karlsson,et al. Adapting the Secretary Hiring Problem for Optimal Hot-Cold Tier Placement Under Top-K Workloads , 2019, 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID).
[93] G. Duncan,et al. Economic deprivation and early childhood development. , 1994, Child development.
[94] Ruocheng Guo,et al. A Practical Data Repository for Causal Learning with Big Data , 2019, Bench.
[95] Rishabh Mehrotra,et al. The Music Streaming Sessions Dataset , 2018, WWW.
[96] Ruocheng Guo,et al. Diffusion in Social Networks , 2015, SpringerBriefs in Computer Science.
[97] Michael Stonebraker,et al. A comparison of approaches to large-scale data analysis , 2009, SIGMOD Conference.
[98] Nicolas Gillis,et al. Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization , 2011, Neural Computation.
[99] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[100] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[101] Kaiming He,et al. Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[102] Thorsten Joachims,et al. Counterfactual Risk Minimization: Learning from Logged Bandit Feedback , 2015, ICML.
[103] Babak Falsafi,et al. Clearing the clouds: a study of emerging scale-out workloads on modern hardware , 2012, ASPLOS XVII.
[104] James Philbin,et al. FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[105] D. Rubin. [On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9.] Comment: Neyman (1923) and Causal Inference in Experiments and Observational Studies , 1990 .
[106] Max Welling,et al. Causal Effect Inference with Deep Latent-Variable Models , 2017, NIPS 2017.
[107] Xu Wen,et al. Improving RGB-D Face Recognition via Transfer Learning from a Pretrained 2D Network , 2019, Bench.
[108] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[109] Ruocheng Guo,et al. Robust Cyberbullying Detection with Causal Interpretation , 2019, WWW.
[110] Chao Yang,et al. 10M-Core Scalable Fully-Implicit Solver for Nonhydrostatic Atmospheric Dynamics , 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[111] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[112] Gu-Yeon Wei,et al. Fathom: reference workloads for modern deep learning methods , 2016, 2016 IEEE International Symposium on Workload Characterization (IISWC).
[113] Wei Cao,et al. DT-CGRA: Dual-track coarse-grained reconfigurable architecture for stream applications , 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[114] Ninghui Sun,et al. DianNao family , 2016, Commun. ACM.
[115] Yoshua Bengio,et al. Practical Recommendations for Gradient-Based Training of Deep Architectures , 2012, Neural Networks: Tricks of the Trade.
[116] Rafal Zdunek,et al. Distributed Nonnegative Matrix Factorization with HALS Algorithm on Apache Spark , 2018, ICAISC.
[117] John Langford,et al. The offset tree for learning with partial labels , 2008, KDD.
[118] Steve B. Jiang,et al. Intelligent Parameter Tuning in Optimization-Based Iterative CT Reconstruction via Deep Reinforcement Learning , 2017, IEEE Transactions on Medical Imaging.
[119] Tianshi Chen,et al. Cambricon-F: Machine Learning Computers with Fractal von Neumann Architecture , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[120] Samuel Williams,et al. Roofline Scaling Trajectories: A Method for Parallel Application and Architectural Performance Analysis , 2018, 2018 International Conference on High Performance Computing & Simulation (HPCS).
[121] Houman Homayoun,et al. Hadoop Workloads Characterization for Performance and Energy Efficiency Optimizations on Microservers , 2018, IEEE Transactions on Multi-Scale Computing Systems.
[122] Ami Marowka,et al. On Performance Analysis of a Multithreaded Application Parallelized by Different Programming Models Using Intel VTune , 2011, PaCT.
[123] Reinhold Weicker,et al. Dhrystone: a synthetic systems programming benchmark , 1984, CACM.
[124] Ryen W. White,et al. Clarifications and question specificity in synchronous social Q&A , 2013, CHI Extended Abstracts.
[125] Jure Leskovec,et al. Inferring Networks of Substitutable and Complementary Products , 2015, KDD.
[126] D. Almond,et al. The Costs of Low Birth Weight , 2004 .
[127] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[128] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[129] Lei Li,et al. CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling , 2018, AAAI.
[130] Minyi Guo,et al. PSL: Exploiting Parallelism, Sparsity and Locality to Accelerate Matrix Factorization on x86 Platforms , 2019, Bench.
[131] Zhibin Yu,et al. The Elasticity and Plasticity in Semi-Containerized Co-locating Cloud Workload: a View from Alibaba Trace , 2018, SoCC.
[132] Chih-Jen Lin,et al. A Practical Guide to Support Vector Classification , 2008 .
[133] Ran El-Yaniv,et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations , 2016, J. Mach. Learn. Res..
[134] Ernst Hairer,et al. Simulating Hamiltonian dynamics , 2006, Math. Comput..
[135] Zihan Jiang,et al. Performance Analysis of Cambricon MLU100 , 2019, Bench.
[136] K.W. Bowyer,et al. Using a Multi-Instance Enrollment Representation to Improve 3D Face Recognition , 2007, 2007 First IEEE International Conference on Biometrics: Theory, Applications, and Systems.
[137] Fernando Ortega,et al. A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model , 2016, Knowl. Based Syst..
[138] Jim Webber,et al. A programmatic introduction to Neo4j , 2018, SPLASH '12.
[139] Li Fu,et al. Improve Image Classification by Convolutional Network on Cambricon , 2019, Bench.
[140] Zheng Zhang,et al. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems , 2015, ArXiv.
[141] Ola Spjuth,et al. SNIC Science Cloud (SSC): A National-Scale Cloud Infrastructure for Swedish Academia , 2017, 2017 IEEE 13th International Conference on e-Science (e-Science).
[142] Kunle Olukotun,et al. DAWNBench: An End-to-End Deep Learning Benchmark and Competition , 2017 .
[143] Junwei Han,et al. CNNs-Based RGB-D Saliency Detection via Cross-View Transfer and Multiview Fusion. , 2018, IEEE transactions on cybernetics.
[144] Nikolai Joukov,et al. A nine year study of file system and storage benchmarking , 2008, TOS.
[145] Torsten Hoefler,et al. Using automated performance modeling to find scalability bugs in complex codes , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[146] Karline Soetaert,et al. Solving Ordinary Differential Equations in R , 2012 .
[147] Ching-Yung Lin,et al. GraphBIG: understanding graph computing in the context of industrial solutions , 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[148] Trevor Darrell,et al. Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[149] Luca Maria Gambardella,et al. Flexible, High Performance Convolutional Neural Networks for Image Classification , 2011, IJCAI.
[150] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[151] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.
[152] Mikko H. Lipasti,et al. BenchNN: On the broad potential application scope of hardware neural network accelerators , 2012, 2012 IEEE International Symposium on Workload Characterization (IISWC).
[153] Matteo Parsani,et al. More efficient time integration for Fourier pseudospectral DNS of incompressible turbulence , 2018, International Journal for Numerical Methods in Fluids.
[154] Lei Zou,et al. gStore: Answering SPARQL Queries via Subgraph Matching , 2011, Proc. VLDB Endow..
[155] Andrea C. Arpaci-Dusseau,et al. Generating realistic impressions for file-system benchmarking , 2009, TOS.
[156] Ian T. Foster,et al. Jetstream: a self-provisioned, scalable science and engineering cloud environment , 2015, XSEDE.
[157] Andrew S. Cassidy,et al. A million spiking-neuron integrated circuit with a scalable communication network and interface , 2014, Science.
[159] Eunyoung Jeong,et al. mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems , 2014, NSDI.
[160] Rainer Gemulla,et al. Distributed Matrix Completion , 2012, 2012 IEEE 12th International Conference on Data Mining.
[161] Marwan Mattar,et al. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments , 2008 .
[162] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[163] Nan Jiang,et al. Doubly Robust Off-policy Value Evaluation for Reinforcement Learning , 2015, ICML.
[164] Ning Li,et al. Solving the Klein-Gordon equation using fourier spectral methods: a benchmark test for computer performance , 2015, SpringSim.
[165] Michael J. Freedman,et al. SLAQ: quality-driven scheduling for distributed machine learning , 2017, SoCC.
[166] Yee Whye Teh,et al. Causal Inference via Kernel Deviance Measures , 2018, NeurIPS.
[167] R. Lalonde. Evaluating the Econometric Evaluations of Training Programs with Experimental Data , 1984 .
[168] Brandon Lucia,et al. Combining Data Duplication and Graph Reordering to Accelerate Parallel Graph Processing , 2019, HPDC.
[169] Anil K. Jain,et al. Face recognition: Some challenges in forensics , 2011, Face and Gesture 2011.
[170] Fan Xia,et al. BSMA: A Benchmark for Analytical Queries over Social Media Data , 2014, Proc. VLDB Endow..
[171] Alexandros G. Dimakis,et al. Cost-Optimal Learning of Causal Graphs , 2017, ICML.
[172] Yanjun Wu,et al. RVTensor: A Light-Weight Neural Network Inference Framework Based on the RISC-V Architecture , 2019, Bench.
[173] Inderjit S. Dhillon,et al. Scalable Coordinate Descent Approaches to Parallel Matrix Factorization for Recommender Systems , 2012, 2012 IEEE 12th International Conference on Data Mining.
[174] David R. Kaeli,et al. DNNMark: A Deep Neural Network Benchmark Suite for GPUs , 2017, GPGPU@PPoPP.
[175] Bengt Fornberg,et al. A practical guide to pseudospectral methods: Introduction , 1996 .
[176] Dong Han,et al. Cambricon: An Instruction Set Architecture for Neural Networks , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[177] Yan Li,et al. CAPES: Unsupervised Storage Performance Tuning Using Neural Network-Based Deep Reinforcement Learning , 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.
[178] Reynold Xin,et al. GraphX: Graph Processing in a Distributed Dataflow Framework , 2014, OSDI.
[179] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[180] Ruocheng Guo,et al. Linked Causal Variational Autoencoder for Inferring Paired Spillover Effects , 2018, CIKM.
[181] Alexandros G. Dimakis,et al. Learning Causal Graphs with Small Interventions , 2015, NIPS.
[182] Yoshua Bengio,et al. Exploring Strategies for Training Deep Neural Networks , 2009, J. Mach. Learn. Res..
[183] Mahadev Satyanarayanan,et al. OpenFace: A general-purpose face recognition library with mobile applications , 2016 .
[184] Stefanos Zafeiriou,et al. Statistical non-rigid ICP algorithm and its application to 3D face alignment , 2017, Image Vis. Comput..
[185] Uri Shalit,et al. Estimating individual treatment effect: generalization bounds and algorithms , 2016, ICML.
[186] Tianshi Chen,et al. ShiDianNao: Shifting vision processing closer to the sensor , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[187] Zhuo Liu,et al. Benchmarking Streaming Computation Engines: Storm, Flink and Spark Streaming , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[188] Gregory R. Ganger,et al. Geriatrix: Aging what you see and what you don't see. A file system aging approach for modern storage systems , 2018, USENIX Annual Technical Conference.
[189] Nancy Wilkins-Diehr,et al. XSEDE: Accelerating Scientific Discovery , 2014, Computing in Science & Engineering.
[190] H. Chipman,et al. BART: Bayesian Additive Regression Trees , 2008, arXiv:0806.3286.
[191] Walter Karlen,et al. Perfect Match: A Simple Method for Learning Representations For Counterfactual Inference With Neural Networks , 2018, ArXiv.
[192] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[193] Ross B. Girshick,et al. Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[194] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[195] Dhanya Sridhar,et al. Using Text Embeddings for Causal Inference , 2019, ArXiv.
[196] Binyu Zang,et al. PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs , 2019, TOPC.
[197] Weitong Chen,et al. Enhancing recommendation on extremely sparse data with blocks-coupled non-negative matrix factorization , 2018, Neurocomputing.
[198] Ashish Sureka,et al. Chaff from the wheat: characterization and modeling of deleted questions on stack overflow , 2014, WWW.
[199] Yu Wang,et al. A Survey of FPGA-Based Neural Network Inference Accelerator , 2019 .
[200] Ruocheng Guo,et al. Adaptive Unsupervised Feature Selection on Attributed Networks , 2019, KDD.
[201] Constantin F. Aliferis,et al. The max-min hill-climbing Bayesian network structure learning algorithm , 2006, Machine Learning.
[202] Ralf Hartmut Güting,et al. GraphDB: Modeling and Querying Graphs in Databases , 1998 .
[203] Thorsten Joachims,et al. The Self-Normalized Estimator for Counterfactual Learning , 2015, NIPS.
[204] Timothy G. Armstrong,et al. LinkBench: a database benchmark based on the Facebook social graph , 2013, SIGMOD '13.
[205] Gary Bradski,et al. Computer Vision Face Tracking For Use in a Perceptual User Interface , 1998 .
[206] Carlo Curino,et al. Apache Hadoop YARN: yet another resource negotiator , 2013, SoCC.
[207] Kai Hwang,et al. Edge AIBench: Towards Comprehensive End-to-end Edge Computing Benchmarking , 2018, Bench.
[208] David H. Bailey,et al. The NAS Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..
[209] Stijn Eyerman,et al. Many-Core Graph Workload Analysis , 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[210] Daniel Raumer,et al. MoonGen: A Scriptable High-Speed Packet Generator , 2014, Internet Measurement Conference.
[211] S. McIntosh-Smith,et al. Scaling Results From the First Generation of Arm-based Supercomputers , 2019 .
[212] L Sirovich,et al. Low-dimensional procedure for the characterization of human faces. , 1987, Journal of the Optical Society of America. A, Optics and image science.
[213] Andreas Hellander,et al. BAMSI: a multi-cloud service for scalable distributed filtering of massive genome data , 2018, BMC Bioinform..
[214] John Shalf,et al. HPGMG 1.0: A Benchmark for Ranking High Performance Computing Systems , 2014 .
[215] Donald B. Rubin,et al. Bayesian Inference for Causal Effects: The Role of Randomization , 1978 .
[216] Jiming Liu,et al. Social Collaborative Filtering by Trust , 2013, IJCAI.
[217] Tommi S. Jaakkola,et al. Sequence to Better Sequence: Continuous Revision of Combinatorial Structures , 2017, ICML.
[218] Matthew G. Knepley,et al. A performance spectrum for parallel computational frameworks that solve PDEs , 2017, Concurr. Comput. Pract. Exp..
[219] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[220] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[221] Lizy Kurian John,et al. Benchmarking Big Data Systems: A Review , 2018, IEEE Transactions on Services Computing.
[222] Stacy Patterson,et al. EdgeBench: Benchmarking Edge Computing Platforms , 2018, 2018 IEEE/ACM International Conference on Utility and Cloud Computing Companion (UCC Companion).
[223] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[224] Dieter Kranzlmüller,et al. glogin - a multifunctional, interactive tunnel into the grid , 2004, Fifth IEEE/ACM International Workshop on Grid Computing.
[225] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[226] Jaewon Lee,et al. WSMeter: A Performance Evaluation Methodology for Google's Production Warehouse-Scale Computers , 2018, ASPLOS.
[227] Tao Tang,et al. Efficient and Portable ALS Matrix Factorization for Recommender Systems , 2017, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[228] Sunita Chandrasekaran,et al. NAS Parallel Benchmarks for GPGPUs Using a Directive-Based Programming Model , 2014, LCPC.
[229] Haichen Shen,et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning , 2018, OSDI.
[230] Daisuke Takahashi,et al. Reproducibility in Benchmarking Parallel Fast Fourier Transform based Applications , 2019, ICPE Companion.
[231] Bronis R. de Supinski,et al. The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems , 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[232] Jian Sun,et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[233] Yang Chen,et al. Data Management Challenges and Real-Time Processing Technologies in Astronomy , 2017 .
[234] Jack J. Dongarra,et al. Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy , 2008, TOMS.
[235] F. Maxwell Harper,et al. The MovieLens Datasets: History and Context , 2016, TIIS.
[236] Chunjie Luo,et al. BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking , 2013, WBDB.
[237] Allen D. Malony,et al. The Tau Parallel Performance System , 2006, Int. J. High Perform. Comput. Appl..
[238] Isabelle Guyon,et al. Design and Analysis of the Causation and Prediction Challenge , 2008, WCCI Causation and Prediction Challenge.
[239] Hassan Chafi,et al. The LDBC Social Network Benchmark: Interactive Workload , 2015, SIGMOD Conference.
[240] M. Turk,et al. Eigenfaces for Recognition , 1991, Journal of Cognitive Neuroscience.
[241] Adam Silberstein,et al. Benchmarking cloud serving systems with YCSB , 2010, SoCC '10.
[242] K. Sachs,et al. Causal Protein-Signaling Networks Derived from Multiparameter Single-Cell Data , 2005, Science.
[243] Raouf Boutaba,et al. Characterizing Task Usage Shapes in Google Compute Clusters , 2011 .
[244] Razvan Pascanu,et al. Theano: A CPU and GPU Math Compiler in Python , 2010, SciPy.
[245] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[246] Debajyoti Mukhopadhyay,et al. Matrix Factorization Model in Collaborative Filtering Algorithms: A Survey , 2015 .
[247] Jie Huang,et al. Benchmarking modern distributed streaming platforms , 2016, 2016 IEEE International Conference on Industrial Technology (ICIT).
[248] S. Giordano,et al. BRUNO: A high performance traffic generator for network processor , 2008, 2008 International Symposium on Performance Evaluation of Computer and Telecommunication Systems.
[249] Ali Anwar,et al. Characterizing Co-located Datacenter Workloads: An Alibaba Case Study , 2018, APSys.
[250] Reynold Xin,et al. Apache Spark , 2016 .
[251] K. Dosaka,et al. A 40GOPS 250mW massively parallel processor based on matrix architecture , 2006, 2006 IEEE International Solid State Circuits Conference - Digest of Technical Papers.
[252] Jure Leskovec,et al. Discovering value from community activity on focused question answering sites: a case study of stack overflow , 2012, KDD.
[253] Hoi-Jun Yoo,et al. A 201.4 GOPS 496 mW Real-Time Multi-Object Recognition Processor With Bio-Inspired Neural Perception Engine , 2009, IEEE Journal of Solid-State Circuits.
[254] David H. Bailey,et al. The NAS Parallel Benchmarks 2.0 , 2015 .
[255] David E. Keyes,et al. Efficiency of High Order Spectral Element Methods on Petascale Architectures , 2016, ISC.
[256] Omer Khan,et al. CRONO: A Benchmark Suite for Multithreaded Graph Algorithms Executing on Futuristic Multicores , 2015, 2015 IEEE International Symposium on Workload Characterization.
[257] Jack Dongarra,et al. A new metric for ranking high-performance computing systems , 2016, National Science Review.
[258] David A. Patterson,et al. A new golden age for computer architecture , 2019, Commun. ACM.
[259] Seif Haridi,et al. Apache Flink™: Stream and Batch Processing in a Single Engine , 2015, IEEE Data Eng. Bull..
[260] Raj Jain,et al. The art of computer systems performance analysis - techniques for experimental design, measurement, simulation, and modeling , 1991, Wiley professional computing.
[261] Philippe Owezarski,et al. OSNT: open source network tester , 2014, IEEE Network.
[262] Philippe Couvee,et al. Recurrent Neural Network for Classifying of HPC Applications , 2019, 2019 Spring Simulation Conference (SpringSim).
[263] Feiyi Wang,et al. Diving into petascale production file systems through large scale profiling and analysis , 2017, PDSW-DISCS@SC.
[264] Intel® Guide for Developing Multithreaded Applications Part 1: Application Threading and Synchronization Summary , 2010 .
[265] Krisztian Balog,et al. Identifying Unclear Questions in Community Question Answering Websites , 2019, ECIR.
[266] Joshua M. Stuart,et al. The Cancer Genome Atlas Pan-Cancer analysis project , 2013, Nature Genetics.
[267] Nectarios Koziris,et al. SparseX: A Library for High-Performance Sparse Matrix-Vector Multiplication on Multicore Platforms , 2018, ACM Trans. Math. Softw..
[268] Matteo Parsani,et al. Fully Implicit Time Stepping Can Be Efficient on Parallel Computers , 2019, Supercomput. Front. Innov..
[269] Chiara Francalanci,et al. Relating Big Data Business and Technical Performance Indicators , 2018 .
[270] Zheng Wang,et al. Adaptive Optimization of Sparse Matrix-Vector Multiplication on Emerging Many-Core Architectures , 2018, 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS).
[271] Y. Abdulkadir. Comparison of Finite Difference Schemes for the Wave Equation Based on Dispersion , 2015 .
[272] Manaal Faruqui,et al. Identifying Well-formed Natural Language Questions , 2018, EMNLP.
[273] María S. Pérez-Hernández,et al. Spark Versus Flink: Understanding Performance in Big Data Analytics Frameworks , 2016, 2016 IEEE International Conference on Cluster Computing (CLUSTER).
[274] Nicolas Gillis,et al. Accelerating Nonnegative Matrix Factorization Algorithms Using Extrapolation , 2018, Neural Computation.
[275] Mats Hamrud,et al. Accelerating Extreme-Scale Numerical Weather Prediction , 2015, PPAM.
[276] Raj Jain,et al. The art of computer systems performance analysis - techniques for experimental design, measurement, simulation, and modeling , 1991, Wiley professional computing.
[277] Alessandro Bozzon,et al. Asking the right question in collaborative Q&A systems , 2014, HT.
[278] Samuel Williams,et al. Optimization of geometric multigrid for emerging multi- and manycore processors , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[279] Benoît Meister,et al. Runnemede: An architecture for Ubiquitous High-Performance Computing , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[280] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[281] Sam Harbaugh,et al. Timing studies using a synthetic Whetstone benchmark , 1984, ALET.
[282] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011 .
[283] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[284] Uri Shalit,et al. Learning Representations for Counterfactual Inference , 2016, ICML.
[285] Huiqian Niu,et al. An Implementation of ResNet on the Classification of RGB-D Images , 2019, Bench.
[286] Jared S. Murray,et al. Atlantic Causal Inference Conference (ACIC) Data Analysis Challenge 2017 , 2019, 1905.09515.
[287] Berin Martini,et al. NeuFlow: A runtime reconfigurable dataflow processor for vision , 2011, CVPR 2011 WORKSHOPS.
[288] B. Fornberg. Generation of finite difference formulas on arbitrarily spaced grids , 1988 .