Advances in Privacy-Preserving Machine Learning: Techniques, Challenges, and Applications
Khan, Tanveer (2025)
Tampere University
Doctoral Programme in Computing and Electrical Engineering
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Defence date
2025-09-29
The permanent address of this publication is
https://urn.fi/URN:ISBN:978-952-03-4107-7
Abstract
With the widespread use of digital services, enormous amounts of data are constantly generated and collected. Extracting meaningful insights from such data is both challenging and time-consuming. Machine Learning (ML), particularly Deep Learning (DL), has gained significant attention for its ability to uncover valuable patterns and perform well in tasks such as traffic analysis and image classification. These models typically involve a training phase, in which the model learns from data, and a testing phase, in which it predicts outputs for unseen inputs. However, training DL models often requires large datasets gathered from multiple sources, creating additional challenges in data management, privacy, and infrastructure.
Machine-Learning-as-a-Service (MLaaS) offers a solution by automating the development and deployment of ML models, making it appealing to data-driven companies. However, MLaaS raises privacy concerns, especially in sensitive domains such as healthcare, where data security is crucial under the GDPR. This creates a need for frameworks that balance data privacy with innovation.
This dissertation addresses that need by exploring Privacy-preserving Techniques (PPTs) and applying them to MLaaS. Our first contribution introduces Learning in the Dark, a model that uses Homomorphic Encryption (HE) to allow predictions on encrypted data. Our second contribution, A More Secure Split, combines Split Learning (SL) and HE to mitigate privacy leakage in SL. Building on this, our third contribution, Split without a Leak, addresses privacy leakage during backpropagation in SL. This method enhances the security of the SL framework and marks a significant advance in the field of Privacy-Preserving Machine Learning (PPML).
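The key property that lets a model predict on encrypted data is that certain ciphertexts can be combined arithmetically without decryption. The dissertation's systems use modern HE schemes; as a minimal, purely illustrative sketch, textbook RSA is multiplicatively homomorphic, so a server can multiply two ciphertexts and the client recovers the product of the plaintexts (toy parameters, not the actual construction used in Learning in the Dark):

```python
# Toy illustration of a homomorphic property: computing on ciphertexts
# without decrypting. Textbook RSA is multiplicatively homomorphic;
# real PPML systems use lattice-based schemes (e.g. BFV/CKKS).
p, q = 61, 53            # toy primes -- insecure, illustration only
n = p * q                # public modulus (3233)
e = 17                   # public exponent
d = 2753                 # private exponent, inverse of e mod lcm(p-1, q-1)

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:
    return pow(c, d, n)

a, b = 12, 7
c_prod = (encrypt(a) * encrypt(b)) % n   # server multiplies ciphertexts only
assert decrypt(c_prod) == a * b          # client decrypts the product: 84
```

The server never sees `a`, `b`, or their product in the clear; it manipulates ciphertexts alone, which is the principle behind encrypted inference.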
Our fourth contribution, Wildest Dreams, presents a comprehensive Systematization of Knowledge (SoK) of HE- and Secure Multi-party Computation (SMPC)-based approaches in PPML. This SoK is a valuable step towards deepening the understanding and improving the effectiveness of PPML techniques. To further enhance privacy in SL, we propose Make Split, not Hijack, which introduces Function Secret Sharing (FSS), an SMPC technique that allows two parties to evaluate a private, secret-shared function on a public input. We demonstrate that FSS not only eliminates privacy leakage in SL but also protects data from a range of attacks. Additionally, we propose GuardML, which employs Hybrid Homomorphic Encryption (HHE), a technique that combines symmetric cryptography with HE to securely offload expensive computation to the cloud. This work paves the way for practical and secure solutions to real-world privacy challenges.
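FSS builds on the standard SMPC idea of secret sharing: each party holds a uniformly random-looking share, and secrets (or linear functions of them) are recovered only when shares are combined. The following is a toy two-party additive secret-sharing sketch of that underlying building block, not FSS itself; the field prime and function names are illustrative, not from the dissertation's artifacts:

```python
# Toy 2-party additive secret sharing over a prime field -- the basic
# SMPC building block that FSS-style protocols rest on. Each party's
# share is uniformly random on its own, so neither party learns the secret.
import secrets

P = 2**61 - 1  # Mersenne prime defining the field (illustrative choice)

def share(x: int) -> tuple[int, int]:
    r = secrets.randbelow(P)      # party 0's share: uniform random mask
    return r, (x - r) % P         # party 1's share: secret minus the mask

def reconstruct(s0: int, s1: int) -> int:
    return (s0 + s1) % P

x0, x1 = share(123)
y0, y1 = share(456)
# Each party adds its own shares locally -- no communication, nothing leaked.
z0, z1 = (x0 + y0) % P, (x1 + y1) % P
assert reconstruct(z0, z1) == 579   # the sum 123 + 456, revealed only jointly
```

Additions happen share-by-share with no interaction, which is why linear operations are essentially free in such protocols; FSS extends this idea from shared values to shared functions.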
Collections
- Doctoral dissertations [5232]
