Interpreting user behavior in MyBioethics app using explainable AI methods
Saxén, Heikki (2025)
Bachelor's Programme in Computing and Electrical Engineering
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2025-06-10
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202506096994
Abstract
The aim of this thesis is to study whether training a machine learning model, combined with interpretive tools to uncover the model’s reasoning, can meaningfully contribute to analyzing users’ behavior in the MyBioethics mobile application.
MyBioethics is an educational and interactive app, particularly designed to engage its users through bioethical dilemmas that relate to real-life case studies. The app also contains surveys that are aimed at helping users explore their personal worldviews and reasoning processes. These dilemmas and surveys constitute the dataset that is analyzed in this research.
In essence, the thesis develops a practical machine learning pipeline that trains a model on the MyBioethics dataset. The inner logic of this model, i.e., what it has learned from the dataset, is then explored by applying SHapley Additive exPlanations (SHAP), a method that attributes a model's predictions to its individual input features. Furthermore, modern large language models (LLMs) are used to support the interpretation of the SHAP results.
The first evaluation criterion for the feasibility of this approach is the basic technical functionality of the pipeline. The second and more significant goal is to determine whether such an approach yields novel and empirically meaningful insights into the users’ thinking, potentially driving a broader understanding of bioethics.
The findings confirm the viability of constructing and interpreting such a machine learning pipeline. The SHAP analysis, in particular, proves effective in clarifying the internal mechanisms of the trained model. Furthermore, applying modern LLMs to interpret the SHAP output significantly enhances the interpretive power of the analysis. Overall, despite the inherent limitations of research that aims to explore rather than fully explain the dataset, the thesis finds this combined methodological approach effective and insightful.
The thesis thus demonstrates that integrating machine learning, SHAP analysis, and modern language models represents a novel and promising methodological approach. This combination successfully provides empirically driven insights into the users’ bioethical reasoning. Moreover, it highlights the potential of merging technological advancements with humanities research, especially when addressing complex and meaningful datasets.
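The attribution step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's pipeline and does not use the SHAP library: it computes exact Shapley values by brute force for a hypothetical three-feature toy model, replacing "absent" features with a single baseline value. The model, the feature values, and the baseline are all assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    # Hypothetical stand-in for a trained model (assumption, not from the thesis):
    # one additive term plus one interaction term.
    return 2.0 * x[0] + x[1] * x[2]


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a single baseline point.

    A feature that is "absent" from a coalition takes its baseline value.
    The cost is exponential in the number of features, so this is only
    practical for very small inputs.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi


x = [1.0, 2.0, 3.0]          # illustrative input (assumption)
baseline = [0.0, 0.0, 0.0]   # illustrative reference point (assumption)
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the model output at `x` and at the baseline, and the two features that only act through the interaction term receive equal shares of it. Libraries such as SHAP approximate the same quantity efficiently for real models and background datasets.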
Collections
- Bachelor's theses [10016]