What next?: a half-dozen data management research goals for big data and the cloud

In this short paper, I describe six data management research challenges relevant to Big Data and the Cloud. Although some of these problems are not new, their importance is amplified by Big Data and Cloud Computing.
