TechLand: Assisting Technology Landscape Inquiries with Insights from Stack Overflow

Understanding the technology landscape is crucial for the success of a software engineering project or organization. However, it can be difficult, even for experienced developers, due to the proliferation of similar technologies, the complex and often implicit dependencies among technologies, and the rapid pace at which the technology landscape evolves. Developers currently rely on online documents such as tutorials and blogs to find out the best available technologies, technology correlations, and technology trends. Although helpful, online documents often lack an objective, consistent summary of the technology landscape. In this paper, we present the TechLand system for assisting technology landscape inquiries with categorical, relational, and trending knowledge of technologies that is aggregated from millions of Stack Overflow questions mentioning the relevant technologies. We implement the TechLand system and evaluate its usefulness against the community answers to 100 technology questions on Stack Overflow, and through a field deployment and a lab study. Our evaluation shows that the TechLand system can assist developers in technology landscape inquiries by providing direct, objective, and aggregated information about available technologies, technology correlations, and technology trends.
