Institutional Repository

Towards an automated analysis of the quality of source code comments

dc.contributor.author Haddad, Mireille J.
dc.date.accessioned 2022-02-07T09:55:24Z
dc.date.available 2022-02-07T09:55:24Z
dc.date.issued 2017
dc.identifier.citation Haddad, M. J. (2017). Towards an automated analysis of the quality of source code comments (Master's thesis, Notre Dame University-Louaize, Zouk Mosbeh, Lebanon). Retrieved from http://ir.ndu.edu.lb/123456789/1460
dc.identifier.uri http://ir.ndu.edu.lb/123456789/1460
dc.description M.S. -- Faculty of Natural and Applied Sciences, Department of Computer Science, Notre Dame University, Louaize, 2017; "A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science"; Includes bibliographical references (leaves 61-64).
dc.description.abstract Maintenance is the most costly phase of the software life cycle. The maintenance cost of a program is estimated to be over 80% of its total life cycle costs (Erlikh, 2000). Since most maintenance time is devoted to understanding the program itself, program comprehension is essential. Often, a large fraction of that time is spent reading code to understand which functionality of the program it implements. Insufficiently documented source code can be challenging for developers to understand and maintain, whereas clear and concise documentation can help developers inspect and understand their programs. Unfortunately, one of the major problems developers face during maintenance is that documentation is often unavailable or not useful. This thesis provides a heuristic approach for the automatic analysis and assessment of source-code comments, parsing the source code with a parser generator tool called ANTLR. The approach measures the semantic similarity between the content of a comment and the name of its corresponding entity identifier. An algorithm was developed that splits identifiers into component terms and computes the similarity percentage between the useful content of the comment and the identifier. The approach categorizes comments as follows: scary noise, noise, normal with minor similarity, probably meaningful, empty, and TODO. A study was carried out to evaluate the ability of the proposed approach to adequately assess source-code comments. In this study, the source code of the Eclipse open-source Integrated Development Environment (IDE) was parsed. The results showed that more than 50% of the comments fall into the empty category, and that these are spread over 62% of the project's files. Only 18% of the comments were of high quality, and around 20% of the files contain noise comments. Most class and interface identifiers have comments, while more than 50% of the methods lack comments. en_US
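The identifier-splitting and similarity step described in the abstract could be sketched roughly as follows. This is an illustrative approximation only, not the thesis's algorithm: the function names, the stop-word list, and the definition of "similarity percentage" (the share of useful comment words that also occur among the identifier's terms) are all assumptions made here, and plain word overlap stands in for the semantic similarity the thesis measures.

```python
import re

# Hypothetical stop-word list; the abstract does not say which words
# the thesis treats as "not useful" comment content.
STOP_WORDS = {"a", "an", "the", "of", "to", "for", "in", "on",
              "and", "or", "is", "this", "that"}


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lower-case terms."""
    terms = []
    # First break on underscores, dollar signs and other non-word characters.
    for chunk in re.split(r"[_$\W]+", identifier):
        # Then pick out acronyms (runs of capitals), ordinary words, digit runs.
        terms += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk)
    return [t.lower() for t in terms]


def comment_terms(comment):
    """Return the comment's alphabetic words, minus the assumed stop words."""
    words = re.findall(r"[A-Za-z]+", comment.lower())
    return [w for w in words if w not in STOP_WORDS]


def similarity_percentage(comment, identifier):
    """Percentage of useful comment words that also appear in the identifier.

    Assumed definition: exact word overlap, with no stemming or synonyms.
    """
    useful = comment_terms(comment)
    if not useful:
        return 0.0
    id_terms = set(split_identifier(identifier))
    shared = sum(1 for w in useful if w in id_terms)
    return 100.0 * shared / len(useful)
```

Under these assumptions, a comment such as "Returns the name" on a method called `getName` scores 50%, because only "name" is shared. A score near 100% would suggest that the comment merely restates the identifier, which is presumably what the noise categories capture; the thresholds separating the categories are not given in the abstract and are not guessed at here.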
dc.format.extent xii, 80 leaves : illustrations
dc.language.iso en en_US
dc.publisher Notre Dame University-Louaize en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject.lcsh Software maintenance
dc.subject.lcsh Source code (Computer science)
dc.title Towards an automated analysis of the quality of source code comments en_US
dc.type Thesis en_US
dc.rights.license This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 United States License. (CC BY-NC 3.0 US)
dc.contributor.supervisor Akiki, Pierre A., Ph.D. en_US
dc.contributor.department Notre Dame University-Louaize. Department of Computer Sciences en_US


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States.
