Using blog‐like documents to investigate software practice: Benefits, challenges, and research directions

Background: An emerging body of research is using grey literature to investigate software practice. One frequently occurring type of grey literature is the blog post. Whilst there are prospective benefits to using grey literature and blog posts to investigate software practice, there are also concerns about the quality of such material.
