Identifying Design and Requirement Self-Admitted Technical Debt Using N-gram IDF


In software projects, technical debt takes place when a developer adopting a trivial solution containing quick and easy shortcuts to implement over a suitable solution that can take a longer time to solve a problem. This can cause major additional costs leading to negative impacts for software maintenance since those shortcuts might need to be reworked in the future. Detecting technical debt early can help a team cope with those risks. In this paper, we focus on Self-Admitted Technical Debt (SATD) that is a debt intentionally produced by developers. We propose an automated model to identify two most common types of self-admitted technical debt, requirement and design debt, from source code comments. We combine N-gram IDF and auto-sklearn machine learning to build the model. With the empirical evaluation on ten projects, our approach outperform the baseline method by improving the performance over 20% when identifying requirement self-admitted technical debt and achieving an average F1-score of 64% when identifying design self-admitted technical debt.

2018 9th International Workshop on Empirical Software Engineering in Practice