Code quality in pull requests: an empirical study
Nikkola, Vili (2019)
Nikkola, Vili
2019
Tietotekniikan DI-ohjelma - Degree Programme in Information Technology
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Hyväksymispäivämäärä
2019-10-10
Julkaisun pysyvä osoite on
https://urn.fi/URN:NBN:fi:tuni-201908252998
https://urn.fi/URN:NBN:fi:tuni-201908252998
Tiivistelmä
Pull requests are a common practice for contributing and reviewing contributions, and are employed both in open-source and industrial contexts. Compared to the traditional code review process adopted in the 1970s and 1980s, pull requests allow a more lightweight reviewing approach. One of the main goals of code reviews is to find defects in the code, allowing project maintainers to easily integrate external contributions into a project and discuss the code contributions. The goal of this work is to understand whether code quality is actually considered when pull requests are accepted. Specifically, we aim at understanding whether code quality issues such as code smells, antipatterns, and coding style violations in the pull request code affect the chance of its acceptance when reviewed by a maintainer of the project. We conducted a case study among 28 Java open-source projects, analyzing the presence of 4.7 M code quality issues in 36 K pull requests. We analyzed further correlations by applying Logistic Regression and seven machine learning techniques (Decision Tree, Random Forest, Extremely Randomized Trees, AdaBoost, Gradient Boosting, XGBoost). Unexpectedly, code quality turned out not to affect the acceptance of a pull request at all. As suggested by other works, other factors such as the reputation of the maintainer and the importance of the feature delivered might be more important than code quality in terms of pull request acceptance.