Using contexts similarity to predict relationships between tasks

We can predict task relationships by comparing the contexts of the tasks. The more related the tasks, the better context similarity predicts the relationship. Comparing contexts is roughly as accurate as mining text descriptions of tasks. Depending on data availability, context similarity complements content similarity.

Developers' tasks are often interrelated. A task might succeed, precede, block, or depend on another task. Or, two tasks might simply have a similar aim or require similar expertise. When working on tasks, developers interact with artifacts and tools, which constitute the contexts of the tasks. This work investigates the extent to which the similarity of the contexts predicts whether and how the respective tasks are related. The underlying assumption is simple: if the same artifacts are touched or similar interactions are observed during two tasks, the tasks might be interrelated. We define a task context as the set of all of a developer's interactions with artifacts during the task. We then apply the Jaccard index, a popular similarity measure, to compare two contexts. Instead of only counting the artifacts in the intersection and union of the contexts, as Jaccard does, we scale the artifacts by their relevance to the task. For this, we suggest a simple heuristic based on the Frequency, Duration, and Age of the interactions with the artifacts (FDA). Alternatively, artifact relevance can be estimated by the Degree-of-Interest (DOI) model used in task-focused programming. To compare the accuracy of the context similarity models for predicting task relationships, we conducted a field study with professionals, analyzed data from the open-source task repository Bugzilla, and ran an experiment with students. We studied two types of relationships useful for work coordination (dependsOn and blocks) and two types useful for personal work management (isNextTo and isSimilarTo).
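The relevance-weighted Jaccard comparison described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the exact FDA formula, the half-life decay, and the artifact names are assumptions made for the example.

```python
import math

def fda_weight(frequency: int, duration_s: float, age_s: float,
               half_life_s: float = 3600.0) -> float:
    """Illustrative relevance weight from Frequency, Duration, and Age:
    more and longer interactions raise the weight, older ones decay it.
    (The paper defines its own FDA heuristic; this formula is a sketch.)"""
    recency = 0.5 ** (age_s / half_life_s)  # exponential decay with age
    return frequency * math.log1p(duration_s) * recency

def weighted_jaccard(ctx_a: dict, ctx_b: dict) -> float:
    """Weighted Jaccard similarity of two task contexts, where each context
    maps an artifact to its relevance weight. With all weights equal to 1
    this reduces to the standard Jaccard index |A ∩ B| / |A ∪ B|."""
    artifacts = set(ctx_a) | set(ctx_b)
    inter = sum(min(ctx_a.get(a, 0.0), ctx_b.get(a, 0.0)) for a in artifacts)
    union = sum(max(ctx_a.get(a, 0.0), ctx_b.get(a, 0.0)) for a in artifacts)
    return inter / union if union else 0.0

# Two hypothetical task contexts: artifact -> FDA relevance weight
task1 = {"Parser.java": fda_weight(5, 600, 300),
         "Lexer.java":  fda_weight(2, 120, 7200)}
task2 = {"Parser.java": fda_weight(4, 480, 600),
         "Tests.java":  fda_weight(1, 60, 100)}
score = weighted_jaccard(task1, task2)  # in [0, 1]; higher = more similar contexts
```

A high score suggests the two tasks touched overlapping, highly relevant artifacts and may therefore be related.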
We found that context similarity models clearly outperform a random prediction for all studied task relationships. We also found evidence that the more interrelated the tasks are, the more accurate the context similarity predictions are. Our results show that context similarity is roughly as accurate at predicting task relationships as comparing the textual content of the task descriptions. Context and content similarity models might thus be complementary in practice, depending on the availability of text descriptions or context data. We discuss several use cases for this research, e.g., assisting developers in choosing the next task or recommending other tasks they should be aware of.
