Analyzing the Recommendation Capabilities of ChatGPT: A Comparison with Traditional Recommender Systems
Yuan, Wenqi (2024)
Master's Programme in Computing Sciences
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2024-11-19
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202410229432
Abstract
In recent decades, recommendation systems have advanced significantly, becoming prevalent tools across various sectors, including social media, e-commerce, and entertainment. These systems enhance user experiences by providing personalized suggestions, thereby increasing engagement and driving business growth. Traditional recommendation systems rely primarily on collaborative filtering, content-based filtering, and hybrid approaches to predict user preferences based on historical data. However, the emergence of advanced machine learning models, particularly those based on large language models (LLMs) such as ChatGPT, introduces a new paradigm in recommendation capabilities. By leveraging natural language processing, ChatGPT can potentially provide more detailed and personalized recommendations.
This thesis conducts a comprehensive analysis of ChatGPT's effectiveness as a recommender system, comparing its performance with traditional approaches like UserKNN, ItemKNN, and TF-IDF-based similarity models. Using two real-world datasets, MovieLens 10M and GoodBooks 10k, this thesis evaluates ChatGPT’s capabilities across three key recommendation tasks: ranking, rating prediction, and cold-start scenarios. The evaluation is structured around metrics such as nDCG, RMSE, and MAE to assess the quality of recommendations and predictions.
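To make the evaluation metrics named above concrete, the following is a minimal Python sketch of nDCG, RMSE, and MAE. It is an illustration under stated assumptions, not the thesis's evaluation code: the function names are hypothetical, and DCG is computed here with linear gain and a log2 rank discount, which may differ from the variant used in the thesis.

```python
import math


def dcg(relevances):
    # Discounted cumulative gain with linear gain (assumed variant):
    # the item at 0-based position i contributes rel / log2(i + 2).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg_at_k(relevances, k):
    # `relevances` lists the true relevance of each item in the order
    # the recommender ranked them. The ideal ordering sorts them descending.
    ideal_dcg = dcg(sorted(relevances, reverse=True)[:k])
    if ideal_dcg == 0:
        return 0.0  # no relevant items at all
    return dcg(relevances[:k]) / ideal_dcg


def rmse(predicted, actual):
    # Root mean squared error between predicted and true ratings.
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, actual):
    # Mean absolute error between predicted and true ratings.
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(errors) / len(errors)
```

A higher nDCG means relevant items sit nearer the top of the list, while lower RMSE and MAE mean predicted ratings are closer to the true ratings. RMSE penalizes large individual errors more heavily than MAE does.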
The findings show that ChatGPT performs particularly well in ranking tasks, consistently placing relevant items at the top of recommendation lists. In rating prediction tasks, ChatGPT demonstrates strong performance in few-shot scenarios where limited data is available. However, in zero-shot settings, while ChatGPT is still effective, its performance declines slightly, especially compared to traditional methods like ItemKNN, which excel in cold-start conditions. The analysis also reveals limitations in ChatGPT's recommendation capabilities, such as dependency on contextual information, limited scalability with increasing data, lack of transparency, and the potential risk of overfitting.
Overall, this thesis concludes that while ChatGPT shows substantial promise in recommendation systems, it faces specific challenges that traditional methods manage more effectively. The results suggest that hybrid approaches, combining ChatGPT's deep learning capabilities with the reliability and interpretability of traditional recommendation systems, could address these limitations. Future research should explore the integration of ChatGPT with advanced recommender system (RS) approaches and newer GPT models to enhance its performance in real-world applications.