Trepo
Fine-tuning Open-source Large Language Model Using a Custom Dataset

Saeed, Mubashir (2024)

 
Open file: SaeedMubashir.pdf (2.801 MB)


Master's Programme in Computing Sciences
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
Acceptance date
2024-06-10
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-202405216160
Abstract
Artificial intelligence has gained significant popularity in recent years. These systems began with basic algorithms that predicted data points from structured training data. Modern artificially intelligent systems, especially generative AI models, have come a long way and now allow users to generate unique content such as text, images, and music, with quality that resembles human-level work. Specializing the output of AI models is increasingly sought after, both by technology users and in fields related to the arts and business. Making AI-based products accessible to general users is becoming common; however, specializing the output of these models, particularly text-based ones, poses significant challenges.

This thesis examines one method of specializing the output of Large Language Models (LLMs): fine-tuning. It extensively surveys the current landscape of commercially available LLMs, and also studies open-source LLMs to understand their capabilities and applications after fine-tuning. The final goal of the thesis is to fine-tune an open-source LLM so that it answers technical queries in the WordPress domain, following a specific pattern drawn from the finalized dataset.

To reach this goal, the thesis conducted four research rounds following the Design Science Research methodology. This approach built up the required knowledge of tools and techniques incrementally, enabling efficient fine-tuning of the open-source LLM. The first half of the research fine-tunes commercially available solutions using an open-source dataset and a synthetic dataset. Building on the results of the first two research rounds, the thesis then reproduces similar results with an open-source LLM. Finally, the thesis uses a custom dataset of WordPress-based questions and answers to introduce specific patterns into the output generated by the open-source LLM.
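The abstract does not include the thesis's dataset or code. As a minimal, hypothetical sketch of the dataset-preparation step it describes, the snippet below converts WordPress question-answer pairs into the chat-style JSON Lines format commonly accepted by supervised fine-tuning tools (for example, Hugging Face TRL's SFTTrainer). The function names, the system prompt, and the example pair are illustrative, not taken from the thesis.

```python
import json

def to_instruction_record(question, answer,
                          system_prompt="You are a WordPress support assistant."):
    """Wrap one Q&A pair in a chat-style record (system/user/assistant turns),
    the layout many fine-tuning libraries consume directly."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }

def build_dataset(pairs, path):
    """Write Q&A pairs to a JSON Lines file, one record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for question, answer in pairs:
            f.write(json.dumps(to_instruction_record(question, answer)) + "\n")

# Illustrative WordPress-domain pair, mirroring the kind of data the
# thesis's custom dataset contains.
pairs = [
    ("How do I reset a WordPress admin password?",
     "Use the 'Lost your password?' link on the login page, or reset it "
     "from the database or with a management tool if you have server access."),
]
build_dataset(pairs, "wordpress_qa.jsonl")
```

Formatting every example with the same role structure is what lets fine-tuning impose a consistent answer pattern, which is the effect the thesis pursues with its finalized dataset.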
Collections
  • Theses - master's degree [41749]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Privacy notice | Accessibility statement
 

 
