Personalized Support Vector Machines for Privacy-Preserving Federated Learning
Ponomarenko-Timofeev, Aleksei (2025)
Tampere University
2025
Tieto- ja sähkötekniikan tohtoriohjelma - Doctoral Programme in Computing and Electrical Engineering
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of defence
2025-12-12
Permanent address of the publication:
https://urn.fi/URN:ISBN:978-952-03-4255-5
Abstract
The pervasive digitalization of the environment enables monitoring of real-world systems by collecting data from multiple diverse sources. Such data may serve tasks such as network intrusion detection or predicting the state of a wireless channel. While machine learning (ML)-based solutions are increasingly employed in these tasks, centralized learning approaches often fall short in terms of scalability and privacy. A more practical alternative is offered by federated learning (FL), which enables collaborative model training without disclosing raw data. However, FL brings a new set of challenges, such as heterogeneous data distributions, diverse computational resources, and privacy requirements, which are rarely considered together in algorithm design.
This dissertation addresses the heterogeneity of data structures and distributions, the heterogeneity of computational resources, and the requirements for privacy preservation. Motivated by the fact that these factors may be present simultaneously, it focuses on developing lightweight, privacy-preserving, personalized FL algorithms. The work aims to bridge the gap in the FL field by (i) formulating a set of algorithms for learning classification and regression models, (ii) proposing policies for operating in highly heterogeneous systems and assessing their applicability in different scenarios, (iii) assessing the applicability of conventional privacy-preservation measures, and (iv) proposing novel privacy-preserving mechanisms and evaluating their applicability.
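As a rough illustration of the kind of federated training of classification models described above, the sketch below shows FedAvg-style averaging of linear SVM weights across two clients holding differently distributed (non-IID) data. All function names, data, and hyperparameters here are illustrative assumptions, not the dissertation's actual algorithms.

```python
import numpy as np

rng = np.random.default_rng(1)

def local_svm_step(w, X, y, lam=0.01, lr=0.1, epochs=5):
    """One client's local update: subgradient descent on the
    L2-regularized hinge loss of a linear SVM (illustrative)."""
    for _ in range(epochs):
        margins = y * (X @ w)
        mask = margins < 1  # points violating the margin
        if mask.any():
            # Subgradient of hinge loss: -y_i * x_i for violating points.
            grad = lam * w - (y[mask, None] * X[mask]).mean(axis=0)
        else:
            grad = lam * w
        w = w - lr * grad
    return w

# Two hypothetical clients with non-IID data distributions.
X1 = rng.normal(0.0, 1.0, (50, 3)); y1 = np.sign(X1[:, 0] + 0.1)
X2 = rng.normal(1.0, 1.0, (50, 3)); y2 = np.sign(X2[:, 1] - 0.9)

w_global = np.zeros(3)
for _ in range(10):  # communication rounds
    w1 = local_svm_step(w_global.copy(), X1, y1)
    w2 = local_svm_step(w_global.copy(), X2, y2)
    w_global = (w1 + w2) / 2  # FedAvg-style aggregation of SVM weights
```

Only the weight vectors cross the network here; the raw data `X1`/`X2` never leave the clients, which is the core privacy motivation for FL stated in the abstract. Personalization variants would additionally keep client-specific components alongside the shared `w_global`.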
Assessments are based on results collected from extensive experiments across a broad range of scenarios and data types, and the numerical results demonstrate the effectiveness of the proposed methods. In systems with heterogeneous computational resources and data, the proposed algorithms consistently perform on par with or better than the state of the art, with training delays observed to be up to 20 % lower than those of state-of-the-art personalized FL algorithms. Moreover, the proposed policies for operating in highly heterogeneous systems retain fairness and leverage heterogeneity to improve the quality of the learned models; their positive effect was observed through faster model convergence and stable fairness. It was also determined that conventional privacy-preservation measures, such as differential-privacy mechanisms, can be applied to systems that use the proposed algorithms, and the utility of the models learned under these measures was assessed. The proposed privacy-preservation mechanisms, based on multiplicative noise, were found to provide better privacy/utility trade-offs than the conventional Gaussian mechanism within a certain operational region. These regions of higher utility were explored for diverse datasets (e.g., image-based and time-series data). Multiple approaches to improving the utility of the learned model while retaining the privacy guarantees were studied based on the properties of the algorithms, and their advantages are demonstrated through both qualitative insights and quantitative evidence. Overall, this thesis addresses a key gap in the FL literature by providing lightweight, privacy-preserving, and personalized algorithms designed for heterogeneous wireless systems.
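The abstract contrasts multiplicative-noise mechanisms with the conventional additive Gaussian mechanism of differential privacy. The sketch below illustrates that contrast on a model weight vector; the Gaussian noise scale follows the classical (ε, δ) calibration, while the multiplicative variant is only a generic log-normal scaling, not the dissertation's calibrated mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical local model weights (e.g., a linear SVM weight vector).
w = np.array([0.8, -1.2, 0.5, 2.0])

def gaussian_mechanism(w, sensitivity=1.0, epsilon=1.0, delta=1e-5):
    """Conventional additive Gaussian mechanism for (epsilon, delta)-DP,
    with the classical calibration
    sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon."""
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return w + rng.normal(0.0, sigma, size=w.shape)

def multiplicative_noise(w, sigma=0.1):
    """Illustrative multiplicative perturbation: each weight is scaled by
    an independent positive log-normal factor instead of being shifted,
    so perturbations are proportional to the weight magnitudes."""
    return w * rng.lognormal(mean=0.0, sigma=sigma, size=w.shape)

w_add = gaussian_mechanism(w)   # additive: noise magnitude independent of w
w_mul = multiplicative_noise(w) # multiplicative: noise scales with |w|, signs kept
```

One qualitative difference visible even in this toy version: the additive mechanism can flip the sign of small weights, while the positive multiplicative factors preserve signs and scale the perturbation with the weight magnitude, which is one intuition for why a different privacy/utility operating region can emerge.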
Collections
- Väitöskirjat (Doctoral dissertations) [5189]
