NLP Community Perspectives on Replicability

Recent efforts, for example at COLING 2018 and various LREC workshops, have drawn attention to the task of replicating and/or reproducing results. This raises the question of how the NLP community views replicability in general. Using a survey of members of the NLP community, we investigate how the community perceives this topic, its relevance, and options for improvement. Based on over two hundred participants, the survey results confirm earlier observations that successful reproducibility requires more than access to code and data. Additionally, the results show that the topic has to be tackled from the authors’, reviewers’, and community’s side.
