Implementation of a Smart Test Execution System for Continuous Integration Pipelines
Dalla Rizza, Federico (2024)
Master's Programme in Information Technology
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2024-09-18
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202407097570
Abstract
In contemporary software development, the necessity for robust and reproducible testing throughout the entire lifecycle of a project is undeniable. However, executing tests at every phase of development can be both time-consuming and resource-intensive, particularly in environments with long and complex CI pipelines. To address this challenge, leveraging Artificial Intelligence (AI) presents a promising solution. By employing AI to manage tests selectively, developers can attain fast feedback, minimizing waiting times and enhancing overall productivity.
This thesis designs and implements a proof-of-concept for an AI-assisted system tailored for Smart Test Execution within CI pipelines. The system aims to optimize test execution by bypassing unnecessary tests, thereby reducing the testing overhead associated with each code change. Test case selection is performed by an AI system that uses a Machine Learning (ML) model trained to classify test cases based on their relevance to specific code changes.
The initial phase of the research involved developing a data gathering tool to collect metadata from popular open-source GitLab pipelines. This data was stored in a Cassandra database hosted on Kubernetes, so that storage can scale and remain available as the dataset grows. Subsequent phases focused on using this data to identify the features that could predict the outcome of test cases for a given pipeline. Several ML techniques were employed to develop a prediction model trained on the gathered data, taking as input code changes, pipeline descriptions, and test information. The model outputs the probability that a test case is required in a pipeline execution, which informs test case selection and prioritization.
Through iterative refinement, the developed system aims to decrease the overall time required to execute a pipeline by selectively executing the tests that are predicted to fail and that take the least time to run. This approach offers a novel and efficient means of managing tests within CI pipelines, ultimately enhancing the efficiency and effectiveness of software development processes.
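As an illustration of the selection and prioritization idea described above, the following is a minimal sketch and not the implementation used in the thesis. It assumes a trained model has already produced a failure probability for each test, and that a historical mean duration is available. The names `Candidate` and `prioritize`, the `threshold` value, and the ranking rule (failure probability per second of run time) are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """One test case with hypothetical model output and timing data."""
    name: str
    fail_probability: float  # assumed model output in [0, 1]
    expected_seconds: float  # assumed historical mean duration


def prioritize(candidates: List[Candidate], threshold: float = 0.2) -> List[Candidate]:
    """Keep tests likely enough to fail, then run cheap, likely failures first.

    The threshold and the probability-per-second ranking are illustrative
    assumptions, not values or rules taken from the thesis.
    """
    # Skip tests whose predicted failure probability is below the threshold.
    selected = [c for c in candidates if c.fail_probability >= threshold]
    # Rank by expected failures found per second of execution time.
    return sorted(
        selected,
        key=lambda c: c.fail_probability / max(c.expected_seconds, 1e-9),
        reverse=True,
    )


candidates = [
    Candidate("test_login", 0.60, 30.0),
    Candidate("test_export", 0.05, 5.0),
    Candidate("test_parser", 0.30, 3.0),
]
order = [c.name for c in prioritize(candidates)]
print(order)
```

With these made-up inputs, `test_export` is skipped because its predicted failure probability is below the threshold, and `test_parser` is scheduled before `test_login` because it is expected to surface a failure sooner per second of run time.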