A1 Journal article (refereed)
Investigating Novice Developers’ Code Commenting Trends Using Machine Learning Techniques (2023)

Niazi, T., Das, T., Ahmed, G., Waqas, S. M., Khan, S., Khan, S., Abdelatif, A. A., & Wasi, S. (2023). Investigating Novice Developers’ Code Commenting Trends Using Machine Learning Techniques. Algorithms, 16(1), Article 53. https://doi.org/10.3390/a16010053

JYU authors or editors

Das, Teerath

Publication details

All authors or editors: Niazi, Tahira; Das, Teerath; Ahmed, Ghufran; Waqas, Syed Muhammad; Khan, Sumra; Khan, Suleman; Abdelatif, Ahmed Abdelaziz; Wasi, Shaukat

Journal or series: Algorithms

eISSN: 1999-4893

Publication year: 2023

Publication date: 12/01/2023

Volume: 16

Issue number: 1

Article number: 53

Publisher: MDPI AG

Publication country: Switzerland

Publication language: English

DOI: https://doi.org/10.3390/a16010053

Publication open access: Openly available

Publication channel open access: Open Access channel

Publication is parallel published (JYX): https://jyx.jyu.fi/handle/123456789/85762

Abstract

Code comments are considered an efficient way to document the functionality of a particular block of code. Developers commonly comment their code to explain its purpose and thereby improve comprehension and readability. Researchers have investigated the effect of code comments on software development tasks and demonstrated their usefulness for maintenance, reusability, bug detection, and other activities. Given the importance of code comments, it is vital for novice developers to improve their code commenting skills. In this study, we first investigated what types of comments novice students write in their source code and then categorized those comments using a machine learning approach. The work involves an initial manual classification of code comments, followed by building a machine learning model to classify student code comments automatically. Our findings revealed that novice developers’ (students’) comments fall mainly into the Literal (26.66%) and Insufficient (26.66%) categories. Further, we extended the taxonomy of such source code comments by adding more categories, i.e., License (5.18%), Profile (4.80%), Irrelevant (4.80%), Commented Code (4.44%), Autogenerated (1.48%), and Improper (1.10%). Moreover, we assessed our approach with three different machine learning classifiers, among which Decision Tree achieved the highest overall accuracy, i.e., 85%. This study helps in predicting the type of code comments written by a novice developer using a machine learning approach that can be implemented to generate automated feedback for students, thus saving teachers the time required for manual one-on-one feedback.

Keywords: software development; software developers; beginners; programming; source codes; classification; machine learning

Free keywords: source code comments; classification; machine learning techniques

Fields of science:

113 Computer and information sciences (Natural sciences)

Contributing organizations

JYU units:

Faculty of Information Technology

Ministry reporting: Yes

VIRTA submission year: 2023

JUFO rating: 1
