THE BERKELEY DATA ANALYSIS SYSTEM (BDAS): AN OPEN SOURCE PLATFORM FOR BIG DATA ANALYTICS
Randy H. Katz | Michael I. Jordan | David A. Patterson | Armando Fox | Scott Shenker | Ion Stoica | Anthony D. Joseph | Michael W. Mahoney | Michael J. Franklin
[1] Ion Stoica,et al. CellIQ: Real-Time Cellular Network Analytics at Scale , 2015, NSDI.
[2] Dimitris S. Papailiopoulos,et al. Perturbed Iterate Analysis for Asynchronous Stochastic Optimization , 2015, SIAM J. Optim.
[3] Liwen Sun,et al. Fine-grained partitioning for aggressive data skipping , 2014, SIGMOD Conference.
[4] Michael I. Jordan,et al. On the Convergence Rate of Decomposable Submodular Function Minimization , 2014, NIPS.
[5] Ali Ghodsi,et al. Scalable atomic visibility with RAMP transactions , 2014, SIGMOD Conference.
[6] Joseph K. Bradley,et al. Parallel Double Greedy Submodular Maximization , 2014, NIPS.
[7] Ihab F. Ilyas,et al. Data Cleaning: Overview and Emerging Challenges , 2016, SIGMOD Conference.
[8] Ion Stoica,et al. Efficient Coflow Scheduling Without Prior Knowledge , 2015, SIGCOMM.
[9] Scott Shenker,et al. Discretized streams: fault-tolerant streaming computation at scale , 2013, SOSP.
[10] Tim Kraska,et al. MLI: An API for Distributed Machine Learning , 2013, 2013 IEEE 13th International Conference on Data Mining.
[11] Adam Marcus,et al. Argonaut: Macrotask Crowdsourcing for Complex Data Processing , 2015, Proc. VLDB Endow.
[12] Sanjay Krishnan,et al. Wisteria: Nurturing Scalable Data Cleaning Infrastructure , 2015, Proc. VLDB Endow.
[13] Matei Zaharia,et al. Matrix Computations and Optimization in Apache Spark , 2015, KDD.
[14] Ion Stoica,et al. Coflow: An Application Layer Abstraction for Cluster Networking , 2012.
[15] Gautam Kumar,et al. pHost: distributed near-optimal datacenter transport over commodity network fabric , 2015, CoNEXT.
[16] Liwen Sun,et al. A Partitioning Framework for Aggressive Data Skipping , 2014, Proc. VLDB Endow.
[17] Scott Shenker,et al. The Case for Tiny Tasks in Compute Clusters , 2013, HotOS.
[18] Thomas Hofmann,et al. Communication-Efficient Distributed Dual Coordinate Ascent , 2014, NIPS.
[19] Anca D. Dragan,et al. Comparing human-centric and robot-centric sampling for robot deep learning from demonstrations , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[20] Ameet Talwalkar,et al. Knowing when you're wrong: building fast and reliable approximate query processing systems , 2014, SIGMOD Conference.
[21] Ion Stoica,et al. G-OLA: Generalized On-Line Aggregation for Interactive Analysis on Big Data , 2015, SIGMOD Conference.
[22] Ali Ghodsi,et al. Coordination Avoidance in Database Systems , 2014, Proc. VLDB Endow.
[23] Antti Jylhä,et al. How carat affects user behavior: implications for mobile battery awareness applications , 2014, CHI.
[24] Ali Ghodsi,et al. Eventual Consistency Today: Limitations, Extensions, and Beyond , 2013.
[25] Archana Ganapathi,et al. Analyzing Log Analysis: An Empirical Study of User Log Mining , 2014, LISA.
[26] Reynold Xin,et al. GraphX: Unifying Data-Parallel and Graph-Parallel Analytics , 2014, ArXiv.
[27] Randy H. Katz,et al. Faster Jobs in Distributed Data Processing using Multi-Task Learning , 2015, SDM.
[28] Michael I. Jordan,et al. SparkNet: Training Deep Networks in Spark , 2015, ICLR.
[29] Scott Shenker,et al. Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks , 2014, SoCC.
[30] Lalit Jain,et al. NEXT: A System for Real-World Development, Evaluation, and Application of Active Learning , 2015, NIPS.
[31] Ion Stoica,et al. Efficient coflow scheduling with Varys , 2015, SIGCOMM.
[32] Scott Shenker,et al. Making Sense of Performance in Data Analytics Frameworks , 2015, NSDI.
[33] Randy H. Katz,et al. Cake: enabling high-level SLOs on shared storage systems , 2012, SoCC '12.
[34] Ion Stoica,et al. BlinkDB: queries with bounded errors and bounded response times on very large data , 2012, EuroSys '13.
[35] Ion Stoica,et al. BlowFish: Dynamic Storage-Performance Tradeoff in Data Stores , 2016, NSDI.
[36] Tim Kraska,et al. CrowdQ: Crowdsourced Query Understanding , 2013, CIDR.
[37] Ameet Talwalkar,et al. MLlib: Machine Learning in Apache Spark , 2015, J. Mach. Learn. Res.
[38] Trevor Darrell,et al. TSC-DL: Unsupervised trajectory segmentation of multi-modal surgical demonstrations with Deep Learning , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[39] Zhao Zhang,et al. Scientific computing meets big data technology: An astronomy use case , 2015, 2015 IEEE International Conference on Big Data (Big Data).
[40] Ali Ghodsi,et al. The potential dangers of causal consistency and an explicit solution , 2012, SoCC '12.
[41] Eugene Wu,et al. CLAMShell: Speeding up Crowds for Low-latency Data Labeling , 2015, Proc. VLDB Endow.
[42] Rishabh K. Iyer,et al. Monotone Closure of Relaxed Constraints in Submodular Optimization: Connections Between Minimization and Maximization , 2014, UAI.
[43] Akshay Vij,et al. When is big data big enough? Implications of using GPS-based surveys for travel demand analysis , 2015.
[44] Mary Goldman,et al. Rapid and efficient analysis of 20,000 RNA-seq samples with Toil , 2016, bioRxiv.
[45] Ali Ghodsi,et al. Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity , 2015, SIGMOD Conference.
[46] Sanjay Krishnan,et al. A methodology for learning, analyzing, and mitigating social influence bias in recommender systems , 2014, RecSys '14.
[47] Sanjay Krishnan,et al. ActiveClean: Interactive Data Cleaning For Statistical Modeling , 2016, Proc. VLDB Endow.
[48] Randy H. Katz,et al. Heterogeneity and dynamicity of clouds at scale: Google trace analysis , 2012, SoCC '12.
[49] Ali Ghodsi,et al. HAT, Not CAP: Towards Highly Available Transactions , 2013, HotOS.
[50] Tim Kraska,et al. PLANET: making progress with commit processing in unpredictable environments , 2014, SIGMOD Conference.
[51] Scott Shenker,et al. Universal Packet Scheduling , 2015, NSDI.
[52] Ameet Talwalkar,et al. A general bootstrap performance diagnostic , 2013, KDD.
[53] Michael I. Jordan,et al. A General Analysis of the Convergence of ADMM , 2015, ICML.
[54] Paramvir Bahl,et al. Low Latency Geo-distributed Data Analytics , 2015, SIGCOMM.
[55] Ali Ghodsi,et al. Highly Available Transactions: Virtues and Limitations , 2013, Proc. VLDB Endow.
[56] Ion Stoica,et al. The Power of Choice in Data-Aware Cluster Scheduling , 2014, OSDI.
[57] Sasu Tarkoma,et al. Collaborative Energy Debugging for Mobile Devices , 2012, HotDep.
[58] Patrick Wendell,et al. Sparrow: distributed, low latency scheduling , 2013, SOSP.
[59] Ali Ghodsi,et al. Bolt-on causal consistency , 2013, SIGMOD '13.
[60] Tim Kraska,et al. A sample-and-clean framework for fast and accurate query processing on dirty data , 2014, SIGMOD Conference.
[61] Michael I. Jordan,et al. A Linearly-Convergent Stochastic L-BFGS Algorithm , 2015, AISTATS.
[62] Peter Bailis,et al. The network is reliable , 2014.
[63] Purnamrita Sarkar,et al. Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning , 2014, Proc. VLDB Endow.
[64] Tim Kraska,et al. MDCC: multi-data center consistency , 2012, EuroSys '13.
[65] Ion Stoica,et al. PBS at work: advancing data management with consistency metrics , 2013, SIGMOD '13.
[66] Xi Chen,et al. Spectral Methods Meet EM: A Provably Optimal Algorithm for Crowdsourcing , 2014, J. Mach. Learn. Res.
[67] Martin J. Wainwright,et al. Distributed Estimation of Generalized Matrix Rank: Efficient Algorithms and Lower Bounds , 2015, ICML.
[68] Tim Kraska,et al. Automating model search for large scale machine learning , 2015, SoCC.
[69] Dimitris S. Papailiopoulos,et al. Parallel Correlation Clustering on Big Graphs , 2015, NIPS.
[70] Zhao Zhang,et al. Rethinking Data-Intensive Science Using Scalable Analytics Systems , 2015, SIGMOD Conference.
[71] Michael I. Jordan,et al. The Missing Piece in Complex Analytics: Low Latency, Scalable Model Management and Serving with Velox , 2014, CIDR.
[72] Tim Kraska,et al. Stale View Cleaning: Getting Fresh Answers from Stale Materialized Views , 2015, Proc. VLDB Endow.
[73] Srikanth Kandula,et al. Leveraging endpoint flexibility in data-intensive clusters , 2013, SIGCOMM.
[74] Eemil Lagerspetz,et al. The company you keep: mobile malware infection rates and inexpensive risk indicators , 2013, WWW.
[75] Stefanie Jegelka,et al. Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets , 2014, NIPS.
[76] Lior Pachter,et al. The NIH BD2K center for big data in translational genomics , 2015, J. Am. Medical Informatics Assoc.
[77] Archana Ganapathi,et al. Building blocks for exploratory data analysis tools , 2013, IDEA@KDD.
[78] Dimitris S. Papailiopoulos,et al. Cyclades: Conflict-free Asynchronous Machine Learning , 2016, NIPS.
[79] Michael I. Jordan,et al. Adding vs. Averaging in Distributed Primal-Dual Optimization , 2015, ICML.
[80] Gregory D. Hager,et al. Transition State Clustering: Unsupervised Surgical Trajectory Segmentation for Robot Learning , 2017, ISRR.
[81] S. Alspaugh. Better Logging to Improve Interactive Data Analysis Tools , 2014.
[82] Ali Ghodsi,et al. FairRide: Near-Optimal, Fair Cache Sharing , 2016, NSDI.
[83] Reynold Xin,et al. GraphX: a resilient distributed graph system on Spark , 2013, GRADES.
[84] Randy H. Katz,et al. FastLane: making short flows shorter with agile drop notification , 2015, SoCC.
[85] Ion Stoica,et al. Time-evolving graph processing at scale , 2016, GRADES '16.
[86] Ion Stoica,et al. Succinct: Enabling Queries on Compressed Data , 2015, NSDI.
[87] Scott Shenker,et al. Shark: SQL and rich analytics at scale , 2012, SIGMOD '13.
[88] Reynold Xin,et al. GraphX: Graph Processing in a Distributed Dataflow Framework , 2014, OSDI.