A Generic Framework for Concept-Based Exploration of Semi-Structured Software Engineering Data

Software engineering meta-data (SE data), such as revision control data, GitHub project data, or test reports, is typically semi-structured: it comprises a mixture of formatted and free-text fields and is often self-describing. Because it lacks a rigid structure, semi-structured SE data cannot readily be queried in a SQL-like manner. Consequently, a variety of customized tools have been built to analyze specific datasets, but these do not generalize. We propose to develop a generic framework for the exploration and querying of semi-structured SE data. Our approach investigates the use of a formal concept lattice as a universal data structure and a tag cloud as an intuitive interface to support data exploration.
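To make the two central notions concrete, the following is a minimal sketch, not the framework's implementation. It assumes a toy formal context in which commits are the objects and tags extracted from their fields are the attributes; the commit identifiers, the tag names, and the helper functions `extent`, `intent`, `concepts`, and `tag_cloud` are all hypothetical. Concepts are enumerated by brute-force closure of every object subset, which is exponential and only suitable for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical formal context: each commit (object) maps to its tags (attributes).
CONTEXT = {
    "c1": {"author:alice", "file:parser.c", "word:fix"},
    "c2": {"author:alice", "file:lexer.c", "word:fix"},
    "c3": {"author:bob", "file:parser.c", "word:refactor"},
}


def extent(attrs, context):
    """Return the objects that carry every attribute in attrs."""
    return frozenset(o for o, tags in context.items() if attrs <= tags)


def intent(objs, context):
    """Return the attributes shared by every object in objs."""
    if not objs:
        # By convention, the empty object set shares all attributes.
        return frozenset(set().union(*context.values()))
    return frozenset(set.intersection(*(context[o] for o in objs)))


def concepts(context):
    """Enumerate all formal concepts (extent, intent) by closing each object subset."""
    objs = list(context)
    found = set()
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            shared = intent(subset, context)
            found.add((extent(shared, context), shared))
    return found


def tag_cloud(selected, context):
    """Weight each tag by how many objects matching the selected tags carry it."""
    matching = extent(frozenset(selected), context)
    return Counter(tag for o in matching for tag in context[o])


if __name__ == "__main__":
    for ext, inte in sorted(concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(inte))
    print(tag_cloud({"word:fix"}, CONTEXT))
```

In this sketch, selecting a tag narrows the view to the extent of the corresponding concept, and the tag weights are recomputed over that extent. This mirrors, in a simplified way, how a tag cloud could serve as an interface to a concept lattice.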
