Context Advances in defect prediction models, aka classifiers, have been validated via accuracy metrics. Effort-aware metrics (EAMs) relate to benefits provided by a classifier in accurately ranking defective entities such as classes or methods. PofB is an EAM that relates to a user that follows a ranking of the probability that an entity is defective, provided by the classifier. Despite the importance of EAMs, there is no study investigating EAMs trends and validity. Aim The aim of this paper is twofold: 1) we reveal issues in EAMs usage, and 2) we propose and evaluate a normalization of PofBs (aka NPofBs), which is based on ranking defective entities by predicted defect density. Method We perform a systematic mapping study featuring 152 primary studies in major journals and an empirical study featuring 10 EAMs, 10 classifiers, two industrial, and 12 open-source projects. Results Our systematic mapping study reveals that most studies using EAMs use only a single EAM (e.g., PofB20) and that some studies mismatched EAMs names. The main result of our empirical study is that NPofBs are statistically and by orders of magnitude higher than PofBs. Conclusions In conclusion, the proposed normalization of PofBs: (i) increases the realism of results as it relates to a better use of classifiers, and (ii) promotes the practical adoption of prediction models in industry as it shows higher benefits. Finally, we provide a tool to compute EAMs to support researchers in avoiding past issues in using EAMs.
Çarka, J., Esposito, M., Falessi, D. (2022). On effort-aware metrics for defect prediction. EMPIRICAL SOFTWARE ENGINEERING, 27(6) [10.1007/s10664-022-10186-7].
On effort-aware metrics for defect prediction
Davide Falessi
2022-01-01
Abstract
Context Advances in defect prediction models, aka classifiers, have been validated via accuracy metrics. Effort-aware metrics (EAMs) relate to benefits provided by a classifier in accurately ranking defective entities such as classes or methods. PofB is an EAM that relates to a user that follows a ranking of the probability that an entity is defective, provided by the classifier. Despite the importance of EAMs, there is no study investigating EAMs trends and validity. Aim The aim of this paper is twofold: 1) we reveal issues in EAMs usage, and 2) we propose and evaluate a normalization of PofBs (aka NPofBs), which is based on ranking defective entities by predicted defect density. Method We perform a systematic mapping study featuring 152 primary studies in major journals and an empirical study featuring 10 EAMs, 10 classifiers, two industrial, and 12 open-source projects. Results Our systematic mapping study reveals that most studies using EAMs use only a single EAM (e.g., PofB20) and that some studies mismatched EAMs names. The main result of our empirical study is that NPofBs are statistically and by orders of magnitude higher than PofBs. Conclusions In conclusion, the proposed normalization of PofBs: (i) increases the realism of results as it relates to a better use of classifiers, and (ii) promotes the practical adoption of prediction models in industry as it shows higher benefits. Finally, we provide a tool to compute EAMs to support researchers in avoiding past issues in using EAMs.File | Dimensione | Formato | |
---|---|---|---|
s10664-022-10186-7.pdf
solo utenti autorizzati
Tipologia:
Versione Editoriale (PDF)
Licenza:
Copyright dell'editore
Dimensione
1.38 MB
Formato
Adobe PDF
|
1.38 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.