Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks
María José del Jesús | Francisco Herrera | Alberto Fernández | José Manuel Benítez | Victoria López | Sara del Río | Abdullah Bawakid