Improving Code Quality Using Fine-tuning Large Language Models
Nguyen, Quang Duc (2024)
Tieto- ja sähkötekniikan kandidaattiohjelma - Bachelor's Programme in Computing and Electrical Engineering
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Acceptance date
2024-11-19
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-2024111210150
Abstract
Large language models (LLMs) have demonstrated significant capabilities in solving real-life problems by generating human-like responses to text input in a similar format. However, these generic models exhibit weaknesses when the context is not fully provided, resulting in a need for model customization to enhance their performance in specialized fields. Fine-tuning is implemented using techniques inspired by machine learning methods. The two best-known fine-tuning methods are Supervised Fine-Tuning (SFT), based on the concept of supervised learning, and Reinforcement Learning from Human Feedback (RLHF), introduced by OpenAI. The resource-intensive nature of RLHF makes it challenging to replicate the results achieved by Chat Generative Pre-trained Transformer (ChatGPT) models. Therefore, a new technique named Direct Preference Optimization (DPO) has recently been introduced, incorporating concepts similar to RLHF while reducing the required time and labor.
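To make the contrast with RLHF concrete, the following Python/PyTorch sketch shows the standard DPO objective computed directly from per-sequence log-probabilities. It is an illustration of the general technique, not the thesis code: the function name, tensor shapes, beta value and dummy numbers are assumptions made for this example.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Each argument is a 1-D tensor of summed log-probabilities, one entry per
    # (prompt, completion) pair; "chosen" is the preferred completion and
    # "rejected" the dispreferred one. beta (assumed value) controls how far
    # the fine-tuned policy may drift from the frozen reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # DPO maximizes the log-sigmoid margin between chosen and rejected rewards,
    # removing the separate reward model and sampling loop that RLHF requires.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Example with dummy log-probabilities for two preference pairs.
print(dpo_loss(torch.tensor([-4.0, -3.5]), torch.tensor([-5.0, -4.0]),
               torch.tensor([-4.2, -3.6]), torch.tensor([-4.8, -4.1])))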
Since existing studies have not addressed how much DPO improves on other fine-tuning methods or how it interacts with them, this thesis employs SFT, DPO and their combination to fine-tune the base Llama 3.1 model, which has 8 billion parameters, in order to improve the quality of code generated by LLMs. The fine-tuned models are then evaluated using the Bilingual Evaluation Understudy (BLEU) and Bidirectional Encoder Representations from Transformers (BERT) metrics, along with their code-specific derivatives. Although the metric outcomes cannot determine the most effective fine-tuning technique, they clearly highlight the significant influence of the dataset utilized during the fine-tuning process. The repository containing the code and data for the research can be accessed on the GitHub platform.
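As an illustration of the evaluation setup described above, the snippet below scores a generated code snippet against a reference with BLEU and a BERT-based metric. It is a minimal sketch under assumptions rather than the thesis pipeline: it presumes the Hugging Face evaluate library with its "bleu" and "bertscore" modules, and the prediction/reference strings are placeholders.

import evaluate

bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

# Placeholder model output and reference solution; the thesis uses its own dataset.
predictions = ["def add(a, b):\n    return a + b"]
references = [["def add(x, y):\n    return x + y"]]

# BLEU measures n-gram overlap between generated and reference code.
print(bleu.compute(predictions=predictions, references=references)["bleu"])

# BERTScore compares contextual embeddings, so it is more tolerant of renamed identifiers.
print(bertscore.compute(predictions=predictions,
                        references=[r[0] for r in references],
                        lang="en")["f1"])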
Collections
- Bachelor's theses [8918]