Privacy Preserving Machine Learning Workflow: from Anonymization to Personalized Differential Privacy Budgets in Federated Learning

Judith Sáinz-Pardo Díaz, Álvaro López García

cs.CR cs.AI

Published

May 4, 2026

Updated

May 4, 2026

Abstract

The growing development of artificial-intelligence-based solutions, together with privacy legislation, has driven the rise of so-called privacy-preserving machine learning architectures, such as federated learning. Federated learning enables model training on decentralized data without sharing or centralizing that data, but it still faces several challenges related to data integrity and privacy. This paper presents a comprehensive privacy-preserving federated learning workflow for sensitive tabular data, including anonymization and differential privacy techniques. We also introduce a formal definition of client drift, together with ways of detecting it in order to mitigate poisoning attacks. We then detail a complete methodology for assigning personalized privacy budgets for global differential privacy to the clients participating in the network, based on a re-identification risk metric. The proposed methodology is presented and tested on an openly available dataset of medical records. In the experimental setup, the approach based on personalized budgets achieves better model performance on two error metrics than the architecture that applies global differential privacy with a fixed privacy budget.
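The abstract does not state how the re-identification risk is computed or how it is mapped to a privacy budget, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes, hypothetically, that risk is the average of 1 / (equivalence-class size) over a client's quasi-identifier tuples, and that a higher risk receives a smaller epsilon by linear interpolation between two made-up bounds, `eps_min` and `eps_max`. The function names and default values are likewise assumptions.

```python
from collections import Counter


def reidentification_risk(records):
    """Assumed risk metric: mean of 1 / (equivalence-class size).

    `records` is a list of tuples of quasi-identifier values. A value of
    1.0 means every record is unique; lower values mean larger groups.
    """
    if not records:
        raise ValueError("records must be non-empty")
    class_sizes = Counter(records)
    return sum(1.0 / class_sizes[r] for r in records) / len(records)


def personalized_budgets(risks, eps_min=0.5, eps_max=5.0):
    """Assumed mapping: linear interpolation, higher risk -> smaller epsilon.

    `risks` maps a client id to its risk score. The bounds `eps_min` and
    `eps_max` are hypothetical and not taken from the paper.
    """
    lowest, highest = min(risks.values()), max(risks.values())
    if highest == lowest:
        # All clients are equally risky: give everyone the midpoint budget.
        return {client: (eps_min + eps_max) / 2 for client in risks}
    span = highest - lowest
    return {
        client: eps_max - (risk - lowest) / span * (eps_max - eps_min)
        for client, risk in risks.items()
    }


# Toy clients with (age band, sex) as quasi-identifiers.
clients = {
    "A": [("30-40", "F")] * 4,                                   # one class of 4
    "B": [("20-30", "F"), ("30-40", "M"),
          ("40-50", "F"), ("50-60", "M")],                       # all unique
    "C": [("30-40", "F")] * 2 + [("40-50", "M")] * 2,            # two classes of 2
}
risks = {cid: reidentification_risk(recs) for cid, recs in clients.items()}
budgets = personalized_budgets(risks)
```

In this toy setup the client whose records are all unique receives the smallest budget (the most noise) and the client with a single large equivalence class receives the largest. Other mappings, for example non-linear ones, would be equally compatible with the abstract's description.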

Metadata

Comment: Accepted at the 2nd International Conference on Federated Learning and Intelligent Computing Systems (FLICS2026)