Falessi, D., Roll, J., Guo, J., Cleland-Huang, J. (2020). Leveraging Historical Associations between Requirements and Source Code to Identify Impacted Classes. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 46(4), 420-441 [10.1109/TSE.2018.2861735].

Leveraging Historical Associations between Requirements and Source Code to Identify Impacted Classes

Falessi, D;
2020-01-01

Abstract

As new requirements are introduced and implemented in a software system, developers must identify the set of source code classes that need to be changed. Past effort has therefore focused on predicting the set of classes impacted by a requirement. In this paper, we introduce and evaluate a new type of information based on the intuition that the requirements associated with historical changes to a specific class are likely to be semantically similar to new requirements that impact that class. This new Requirements to Requirements Set (R2RS) family of metrics captures the semantic similarity between a new requirement and the set of existing requirements previously associated with a class. The aim of this paper is to present R2RS metrics and evaluate their usefulness in predicting the set of classes impacted by a requirement. We consider 18 different R2RS metrics, obtained by combining six natural language processing techniques for measuring the semantic similarity among texts (e.g., VSM) with three distribution scores for computing overall similarity (e.g., the average of the similarity scores). We evaluate whether R2RS is useful for predicting impacted classes, both in combination with and in comparison against four other families of metrics that are based on temporal locality of changes, direct similarity to code, complexity metrics, and code smells. Our evaluation features five classifiers and 78 releases belonging to four large open-source projects, which result in over 700,000 candidate impacted classes. Experimental results show that leveraging R2RS information increases the accuracy of predicting impacted classes by an average of more than 60 percent across the various classifiers and projects.
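To make the R2RS idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a TF-IDF vector space model with cosine similarity as the text-similarity technique and offers two aggregation options: the average (named in the abstract) and the maximum (a hypothetical choice added here for illustration). The function names, the whitespace tokenizer, and the sample requirements are all assumptions; the abstract does not specify preprocessing, the other five NLP techniques, or the remaining distribution scores.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so terms that occur in every document keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def r2rs(new_requirement, past_requirements, aggregate="average"):
    """Score a class by the similarity between a new requirement and the
    requirements historically associated with that class.

    Assumption: IDF is computed over the new requirement plus this class's
    past requirements only; a real study would likely use a larger corpus.
    """
    if not past_requirements:
        return 0.0  # no history for this class, so no evidence either way
    docs = [tokenize(new_requirement)] + [tokenize(r) for r in past_requirements]
    vectors = tfidf_vectors(docs)
    sims = [cosine(vectors[0], v) for v in vectors[1:]]
    if aggregate == "average":
        return sum(sims) / len(sims)
    if aggregate == "max":  # hypothetical alternative distribution score
        return max(sims)
    raise ValueError(f"unknown aggregate: {aggregate}")


# Hypothetical history: class name -> requirements that touched it in the past.
history = {
    "AuthService": [
        "Allow users to log in with a password",
        "Lock account after failed login attempts",
    ],
    "ReportExporter": ["Export monthly report as PDF"],
}
new_req = "Users reset password after failed login"

# Rank candidate classes by their R2RS score for the new requirement.
ranking = sorted(history, key=lambda c: r2rs(new_req, history[c]), reverse=True)
print(ranking)
```

In the study described above, scores of this kind serve as features for classifiers alongside the other metric families; the ranking here is only a simplified way to inspect a single score.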
2020
Published
International relevance
Editorial
Scientific committee
Sector ING-INF/05 - Information Processing Systems
English
Measurement
Semantics
Natural language processing
Complexity theory
Open source software
Task analysis
Impact analysis
mining software repositories
traceability
Falessi, D; Roll, J; Guo, J; Cleland-Huang, J
Journal article
Files in this record:
1808.06359.pdf (authorized users only)
Type: Pre-print document
License: Not specified
Size: 5.13 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/256954
Citations
  • Scopus 20
  • Web of Science 10