Big Data: A Survey
Yunhao Liu | Shiwen Mao | Min Chen
[1] T. W. Anderson. An Introduction to Multivariate Statistical Analysis , 1959 .
[2] T. W. Anderson,et al. An Introduction to Multivariate Statistical Analysis , 1959 .
[3] Mahadev Satyanarayanan,et al. Scale and performance in a distributed file system , 1987, SOSP '87.
[4] David J. DeWitt,et al. Parallel database systems: the future of high performance database systems , 1992, CACM.
[5] David Konopnicki,et al. W3QS: A Query System for the World-Wide Web , 1995, VLDB.
[6] David R. Karger,et al. Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web , 1997, STOC '97.
[7] Sergey Brin,et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.
[8] J. Anderson,et al. IP over SONET , 1998 .
[9] Martin van den Berg,et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery , 1999, Comput. Networks.
[10] Soumen Chakrabarti,et al. Data mining for hypertext: a tutorial survey , 2000, SKDD.
[11] Andrian Marcus,et al. Data Cleansing: Beyond Integrity Analysis , 2000 .
[12] Nasir Ghani,et al. On IP-over-WDM integration , 2000, IEEE Commun. Mag..
[13] Andrian Marcus,et al. Data Cleansing: Beyond Integrity Analysis , 2000, IQ.
[14] Eric A. Brewer,et al. Towards robust distributed systems (abstract) , 2000, PODC '00.
[15] Maurizio Lenzerini,et al. Data integration: a theoretical perspective , 2002, PODS.
[16] Hinrich Schütze,et al. Book Reviews: Foundations of Statistical Natural Language Processing , 1999, CL.
[17] Sankar K. Pal,et al. Web mining in soft computing framework: relevance, state of the art and future directions , 2002, IEEE Trans. Neural Networks.
[18] Anuradha Bhamidipaty,et al. Interactive deduplication using active learning , 2002, KDD.
[19] Hector Garcia-Molina,et al. Parallel crawlers , 2002, WWW.
[20] Yannis Manolopoulos,et al. Indexing web access-logs for pattern queries , 2002, WIDM '02.
[21] Kenneth J. Christensen,et al. A first look at wired sensor networks for video surveillance systems , 2002, 27th Annual IEEE Conference on Local Computer Networks, 2002. Proceedings. LCN 2002..
[22] Nancy A. Lynch,et al. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services , 2002, SIGA.
[23] Duncan J. Watts,et al. Six Degrees: The Science of a Connected Age , 2003 .
[24] Sanjay Ghemawat,et al. The Google file system , 2003 .
[25] Rajesh Parekh,et al. Lessons and Challenges from Mining Retail E-Commerce Data , 2004, Machine Learning.
[26] Anupam Joshi,et al. On Using a Warehouse to Analyze Web Logs , 2003, Distributed and Parallel Databases.
[27] Elisa Bertino,et al. State-of-the-art in privacy preserving data mining , 2004, SGMD.
[28] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[29] Wei Hong,et al. A macroscope in the redwoods , 2005, SenSys '05.
[30] Shonali Krishnaswamy,et al. Mining data streams: a review , 2005, SGMD.
[31] J. E. Hirsch,et al. An index to quantify an individual's scientific research output , 2005, Proc. Natl. Acad. Sci. USA.
[32] Rob Pike,et al. Interpreting the data: Parallel analysis with Sawzall , 2005, Sci. Program..
[33] Alexander G. Gray,et al. On-line anomaly detection of deployed software: a statistical machine learning approach , 2006, SOQUA '06.
[34] John R. Smith,et al. Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.
[35] Nicu Sebe,et al. Content-based multimedia information retrieval: State of the art and challenges , 2006, TOMCCAP.
[36] Douglas Crockford,et al. The application/json Media Type for JavaScript Object Notation (JSON) , 2006, RFC.
[37] Brett D. Fleisch,et al. The Chubby lock service for loosely-coupled distributed systems , 2006, OSDI '06.
[38] Sukun Kim,et al. Health Monitoring of Civil Infrastructures Using Wireless Sensor Networks , 2007, 2007 6th International Symposium on Information Processing in Sensor Networks.
[39] Bin Wu,et al. Community detection in large-scale social networks , 2007, WebKDD/SNA-KDD '07.
[40] Katherine G. Herbert,et al. Biological data cleaning: a case study , 2007, Int. J. Inf. Qual..
[41] Werner Vogels,et al. Dynamo: amazon's highly available key-value store , 2007, SOSP.
[42] John A. Stankovic,et al. LUSTER: wireless sensor network for environmental research , 2007, SenSys '07.
[43] Philip S. Yu,et al. Top 10 algorithms in data mining , 2007, Knowledge and Information Systems.
[44] Yuan Yu,et al. Dryad: distributed data-parallel programs from sequential building blocks , 2007, EuroSys '07.
[45] Thorsten Meinl,et al. KNIME: The Konstanz Information Miner , 2007, GfKl.
[46] James Murty,et al. Programming Amazon web services - S3, EC2, SQS, FPS, and SimpleDB: outsource your infrastructure , 2008 .
[47] Mani B. Srivastava,et al. NAWMS: nonintrusive autonomous water monitoring system , 2008, SenSys '08.
[48] François Ingelrest,et al. SensorScope: Out-of-the-Box Environmental Monitoring , 2008, 2008 International Conference on Information Processing in Sensor Networks (ipsn 2008).
[49] Douglas Thain,et al. All-pairs: An abstraction for data-intensive cloud computing , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[50] Mohd Norzali Haji Mohd,et al. Data pre-processing on web server logs for generalized association rules mining algorithm , 2008 .
[51] Jingren Zhou,et al. SCOPE: easy and efficient parallel processing of massive data sets , 2008, Proc. VLDB Endow..
[52] Wilson C. Hsieh,et al. Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.
[53] Qiang Yang,et al. Translated Learning: Transfer Learning across Different Feature Spaces , 2008, NIPS.
[54] Michael Isard,et al. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language , 2008, OSDI.
[55] James Murty,et al. Programming amazon web services , 2008 .
[56] Dan Suciu,et al. Probabilistic Event Extraction from RFID Data , 2008, 2008 IEEE 24th International Conference on Data Engineering.
[57] Jae-Gil Lee,et al. Mining Massive RFID, Trajectory, and Traffic Data Sets , 2008, Knowledge Discovery and Data Mining.
[58] Alon Y. Halevy,et al. Data Integration for the Relational Web , 2009, Proc. VLDB Endow..
[59] Albert G. Greenberg,et al. VL2: a scalable and flexible data center network , 2009, SIGCOMM '09.
[60] Pete Wyckoff,et al. Hive - A Warehousing Solution Over a Map-Reduce Framework , 2009, Proc. VLDB Endow..
[61] Jeremy Ginsberg,et al. Detecting influenza epidemics using search engine query data , 2009, Nature.
[62] Sean Quinlan,et al. GFS: Evolution on Fast-forward , 2009, ACM Queue.
[63] Lise Getoor,et al. Co-evolution of social and affiliation networks , 2009, KDD.
[64] J. Armstrong,et al. OFDM for Optical Communications , 2009, Journal of Lightwave Technology.
[65] A. Jacobs. The Pathologies of Big Data , 2009, ACM Queue.
[66] Juyeon Lee,et al. A model of mobile community: designing user interfaces to support group interaction , 2009, INTR.
[67] Amy L. Murphy,et al. Monitoring heritage buildings with wireless sensor networks: The Torre Aquila deployment , 2009, 2009 International Conference on Information Processing in Sensor Networks.
[68] You-Jin Park,et al. Individual and group behavior-based customer profile model for personalized product recommendation , 2009, Expert Syst. Appl..
[69] H. Takara,et al. Dynamic optical mesh networks: Drivers, challenges and solutions for the future , 2009, 2009 35th European Conference on Optical Communication.
[70] Prashant Malik,et al. Cassandra: structured storage system on a P2P network , 2009, PODC '09.
[71] Jimeng Sun,et al. Social influence analysis in large-scale networks , 2009, KDD.
[72] Douglas Stott Parker,et al. Traverse: Simplified Indexing on Large Map-Reduce-Merge Clusters , 2009, DASFAA.
[73] Niklas Carlsson,et al. Evolution of an online social aggregation network: an empirical study , 2009, IMC '09.
[74] Luiz André Barroso,et al. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , 2009, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines.
[75] Tony Hey,et al. The Fourth Paradigm: Data-Intensive Scientific Discovery , 2009 .
[76] Elena Console,et al. Data Fusion , 2009, Encyclopedia of Database Systems.
[77] Haitao Wu,et al. BCube: a high performance, server-centric network architecture for modular data centers , 2009, SIGCOMM '09.
[78] Haixun Wang,et al. Leveraging spatio-temporal redundancy for RFID data cleansing , 2010, SIGMOD Conference.
[79] Geoffrey C. Fox,et al. Cloud computing paradigms for pleasingly parallel biomedical applications , 2010, HPDC '10.
[80] Geoffrey C. Fox,et al. Twister: a runtime for iterative MapReduce , 2010, HPDC '10.
[81] Juan C. Burguillo,et al. A hybrid content-based and item-based collaborative filtering approach to recommend TV programs enhanced with singular value decomposition , 2010, Inf. Sci..
[82] Kristina Chodorow,et al. MongoDB: The Definitive Guide , 2010 .
[83] Hong Liu,et al. Fiber optic communication technologies: What's needed for datacenter network operations , 2010, IEEE Communications Magazine.
[84] Sanjeev Kumar,et al. Finding a Needle in Haystack: Facebook's Photo Storage , 2010, OSDI.
[85] Antony I. T. Rowstron,et al. Symbiotic routing in future data centers , 2010, SIGCOMM '10.
[86] Aart J. C. Bik,et al. Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.
[87] Jure Leskovec,et al. Empirical comparison of algorithms for network community detection , 2010, WWW '10.
[88] Roberto Proietti,et al. DOS - A scalable optical switch for datacenters , 2010, 2010 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
[89] Konstantina Papagiannaki,et al. c-Through: part-time optics in data centers , 2010, SIGCOMM '10.
[90] Michael D. Ernst,et al. HaLoop , 2010, Proc. VLDB Endow..
[91] Deepak S. Turaga,et al. Multimodal analysis of body sensor network data streams for real-time healthcare , 2010, MIR '10.
[92] Rami G. Melhem,et al. Applying statistical machine learning to multicore voltage & frequency scaling , 2010, Conf. Computing Frontiers.
[93] Amin Vahdat,et al. Helios: a hybrid electrical/optical switch architecture for modular data centers , 2010, SIGCOMM '10.
[94] J. Chris Anderson,et al. CouchDB: The Definitive Guide , 2010 .
[95] Atul Singh,et al. Proteus: a topology malleable data center network , 2010, Hotnets-IX.
[96] Bill Hostmann,et al. Magic Quadrant for Business Intelligence Platforms , 2012 .
[97] Koji Eguchi,et al. Link prediction using probabilistic group models of network structure , 2010, SAC '10.
[98] Jignesh M. Patel,et al. A comparison of join algorithms for log processing in MapReduce , 2010, SIGMOD Conference.
[99] Paul Zikopoulos,et al. Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data , 2011 .
[100] Randal E. Bryant,et al. Data-Intensive Scalable Computing for Scientific Applications , 2011, Computing in Science & Engineering.
[101] Cecilia Mascolo,et al. Exploiting place features in link prediction on location-based social networks , 2011, KDD.
[102] Lars George,et al. HBase: The Definitive Guide , 2011 .
[103] Rick Cattell,et al. Scalable SQL and NoSQL data stores , 2011, SGMD.
[104] Predrag Tasevski. PASSWORD ATTACKS AND GENERATION STRATEGIES , 2011 .
[105] Charu C. Aggarwal,et al. An Introduction to Social Network Data Analytics , 2011, Social Network Data Analytics.
[106] B. S. Manjunath,et al. The iPlant Collaborative: Cyberinfrastructure for Plant Biology , 2011, Front. Plant Sci..
[107] Pramod Bhatotia,et al. Incoop: MapReduce for incremental computations , 2011, SoCC.
[108] Isabella Cerutti,et al. Energy-Efficient Design of a Scalable Optical Multiplane Interconnection Architecture , 2011, IEEE Journal of Selected Topics in Quantum Electronics.
[109] Erik Meijer. The world according to LINQ , 2011, CACM.
[110] Steven Hand,et al. CIEL: A Universal Execution Engine for Distributed Data-Flow Computing , 2011, NSDI.
[111] J. Manyika. Big data: The next frontier for innovation, competition, and productivity , 2011 .
[112] Avinash Karanth Kodi,et al. Energy-Efficient and Bandwidth-Reconfigurable Photonic Networks for High-Performance Computing (HPC) Systems , 2011, IEEE Journal of Selected Topics in Quantum Electronics.
[113] Li Li,et al. A Survey on Visual Content-Based Video Indexing and Retrieval , 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[114] Tamara G. Kolda,et al. Temporal Link Prediction Using Matrix and Tensor Factorizations , 2010, TKDD.
[115] Zi Huang,et al. Effective data co-reduction for multimedia similarity search , 2011, SIGMOD '11.
[116] Feng Wang,et al. Networked Wireless Sensor Data Collection: Issues, Challenges, and Approaches , 2011, IEEE Communications Surveys & Tutorials.
[117] Kwong-Sak Leung,et al. Data Mining on DNA Sequences of Hepatitis B Virus , 2011, IEEE/ACM Transactions on Computational Biology and Bioinformatics.
[118] Surajit Chaudhuri,et al. An overview of business intelligence technology , 2011, Commun. ACM.
[119] W Shieh,et al. OFDM for Flexible High-Speed Optical Networks , 2011, Journal of Lightwave Technology.
[120] Charu C. Aggarwal,et al. Social Network Data Analytics , 2011 .
[121] Cecilia Mascolo,et al. Evolution of a location-based online social network: analysis and models , 2012, IMC '12.
[122] Wil M.P. van der Aalst. Process Mining: Overview and Opportunities , 2012, TMIS.
[123] Joydeep Ghosh,et al. A probabilistic imputation framework for predictive analysis using variably aggregated, multi-source healthcare data , 2012, IHI '12.
[124] Ling Huang,et al. Evolution of social-attribute networks: measurements, modeling, and implications using google+ , 2012, Internet Measurement Conference.
[125] Arshdeep Bahga,et al. Analyzing Massive Machine Maintenance Data in a Computing Cloud , 2012, IEEE Transactions on Parallel and Distributed Systems.
[126] Gregor von Bochmann,et al. Crawling rich internet applications: the state of the art , 2012, CASCON.
[127] Kenneth A. De Jong,et al. An Evolutionary Algorithm Approach for Feature Generation from Sequence Data and Its Application to DNA Splice Site Prediction , 2012, IEEE/ACM Transactions on Computational Biology and Bioinformatics.
[128] Michael J. Franklin,et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.
[129] Florian Metze,et al. Beyond audio and video retrieval: towards multimedia summarization , 2012, ICMR.
[130] Wilfred Ng,et al. A model-based approach for RFID data stream cleansing , 2012, CIKM.
[131] Alexandros Labrinidis,et al. Challenges and Opportunities with Big Data , 2012, Proc. VLDB Endow..
[132] Nicu Sebe,et al. Knowledge adaptation for ad hoc multimedia event detection with few exemplars , 2012, ACM Multimedia.
[133] Tsung-Han Tsai,et al. Exploring Contextual Redundancy in Improving Object-Based Video Coding for Video Sensor Networks Surveillance , 2012, IEEE Transactions on Multimedia.
[134] Ben Y. Zhao,et al. Mirror mirror on the ceiling: flexible wireless links for data centers , 2012, SIGCOMM '12.
[135] Bingbing Ni,et al. Assistive tagging: A survey of multimedia tagging with human-computer joint exploration , 2012, CSUR.
[136] Min Chen,et al. FAR: A fault-avoidance routing method for data center networks with regular topology , 2013, Architectures for Networking and Communications Systems.
[137] Viktor Mayer-Schönberger,et al. Big Data: A Revolution That Will Transform How We Live, Work, and Think , 2013 .
[138] Wei Chen,et al. Influence diffusion dynamics and influence maximization in social networks with friend and foe relationships , 2011, WSDM.
[139] Olha Buchel,et al. Big Data: A Revolution That Will Transform How We Live, Work, and Think , 2015 .
[140] Ck Cheng,et al. The Age of Big Data , 2015 .
[141] Eric Gossett,et al. Big Data: A Revolution That Will Transform How We Live, Work, and Think , 2015 .
[142] A Special Report on Managing Information , 2022 .