Think big about data: Archaeology and the Big Data challenge

Usually defined as high volume, high velocity and/or high variety data, Big Data make it possible, through the use of software, hardware and algorithms, to study historical processes that cannot be understood from smaller amounts of data. Big Data call for a new archaeological approach: a willingness to use masses of data, to take on messy and heterogeneous data, and to accept correlation in place of causality. Can the incompleteness of archaeological data prevent such an approach? Or are archaeological data positively predestined for it, precisely because they are messy and unstructured? Archaeology normally deals with large and complex bodies of data, often fragmentary, often drawn from different sources and disciplines, and rarely available in the same format or at the same scale. Is archaeology ready to work more with data-driven methods and to accept predictive and probabilistic techniques? Big Data do not explain but inform; they offer a model for archaeological interpretation and are both a resource and a tool: data mining, data visualisation, image processing and quantitative methods can together help us to understand complex archaeological information. However seductive Big Data may be, the problems should not be denied: there is a risk of treating data as absolute truth, and there are questions of intellectual rights and ethics. We can adopt this technology, but we should recognise its strengths and its limits.
