In-depth exploration of software defects and self-admitted technical debt through cutting-edge deep learning techniques.

Journal: PLOS ONE
Published Date:

Abstract

Most previous research focuses on either finding Self-Admitted Technical Debt (SATD) or detecting bugs, rather than identifying both together. These studies identify and classify only SATD or faults, without identifying or categorising bugs based on SATD. Furthermore, most current methodologies do not incorporate contemporary deep learning techniques. This work presents a method that uses deep learning to detect and classify SATD and to find defects in software comments associated with SATD. The proposed approach detects and classifies SATD and improves the understanding and localization of defects. The methodology involves developing a deep learning model using diverse data from repositories, including Apache, Mozilla Firefox, and Eclipse. The chosen data set comprises projects, designated SATD examples, and bug instances, supporting thorough model training and evaluation. The methodology comprises data analysis, preprocessing, and model training using deep learning architectures such as LSTM, Bi-LSTM, GRU, and Bi-GRU, together with Transformer models such as BERT and GPT-3 and conventional machine learning methods. The performance metrics, namely precision, recall, accuracy, and F1 score, illustrate the efficacy of the proposed method. Comparative assessment against existing methodologies shows notable improvements, and cross-validation is used to confirm model robustness. All deep learning models achieved an accuracy and precision of 0.98, and the transformer models achieved slightly higher metrics. GPT-3 achieved an overall accuracy of 0.984.
With the transfer learning approach, the transformer model (GPT-3) outperformed the others, achieving an overall accuracy, precision, recall, and F1 score of 0.96. The deep learning models (LSTM, GRU) also performed well, although their accuracy was slightly lower than that of the baseline model (Naive Bayes). The research has significant implications for software engineering, providing a comprehensive method for software quality assessment and maintenance. It improves knowledge of Self-Admitted Technical Debt (SATD) and bugs, and supports prioritization and resource allocation for software maintenance and evolution. The implications go beyond academia: the work has a direct impact on business practices and makes it easier to create software systems that are reliable and long-lasting.
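
As an illustration of the four evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, and F1 score for a binary SATD classifier. It is not the authors' evaluation code. The function name `binary_metrics`, the label encoding (1 for an SATD comment, 0 for a non-SATD comment), and the toy labels are assumptions made for this example and do not come from the paper's data set.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; labels are assumed to be
    1 (SATD comment) and 0 (non-SATD comment).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four confusion-matrix cells.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels invented for this example (not results from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

When precision and recall are equal, as in this toy case, F1 takes the same value, which is consistent with the abstract reporting identical precision, recall, and F1 scores for GPT-3.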

Authors

  • Sajid Ullah
    Plant Sciences Core Facility, CEITEC-Central European Institute of Technology, Masaryk University, 60200 Brno, Czech Republic.
  • M Irfan Uddin
    Institute of Computing, Kohat University of Science and Technology, Kohat 26000, Pakistan.
  • Muhammad Adnan
    Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, 54770, Pakistan.
  • Ala Abdulsalam Alarood
    College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia.
  • Abdulkream Alsulami
    Department of Information Technology at Al-kamil, University of Jeddah, Jeddah, Saudi Arabia.
  • Safa Habibullah
    Department of Information Systems and Technology, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia.