
Differentially Private Federated Learning

Full text available here: Nikolaos Tatarakis - Differentially Private Federated Learning.

Overview

Federated learning is a decentralized way of training models. The idea is that multiple clients/devices can participate in training without actually sharing their data (i.e. data stays local). Instead, only local model updates (model parameters) are transferred to a server, which in turn aggregates them to improve the global model and then redistributes it to every client (Federated Learning/Averaging, McMahan et al.).

[Figures: Fed 1, Fed 2]
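
To make the aggregation step concrete, here is a minimal FedAvg-style sketch in PyTorch. This is not the repository's implementation; the function names (`fedavg_round`, `local_train_fn`) and the data-size weighting are assumptions for illustration.

    import copy

    def fedavg_round(global_model, client_loaders, local_train_fn):
        """One illustrative FedAvg round: each client trains a copy of the global
        model locally, and the server averages the resulting parameters."""
        client_states, client_sizes = [], []
        for loader in client_loaders:
            local_model = copy.deepcopy(global_model)
            local_train_fn(local_model, loader)          # local SGD; only parameters leave the client
            client_states.append(local_model.state_dict())
            client_sizes.append(len(loader.dataset))

        # Weighted average of parameters (assumes all state entries are float tensors).
        total = sum(client_sizes)
        new_state = {
            key: sum((n / total) * state[key] for state, n in zip(client_states, client_sizes))
            for key in client_states[0]
        }
        global_model.load_state_dict(new_state)          # redistributed to the clients next round
        return global_model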

Despite not sharing raw data directly, federated learning isn't a perfect privacy solution. The model updates can still leak information through various attack vectors (e.g. gradient inversion attacks, model inversion attacks, etc.). Combining it with Differential Privacy (Dwork, 2006) can provide quantifiable privacy guarantees.

Essentially, differential privacy is a mathematical framework that adds controlled noise (via a mechanism $M$) to protect individual data while maintaining useful statistical properties. It ensures that the outcome of a computation (e.g. a population statistic) doesn't change much whether or not any single individual's data is included. The privacy guarantee is controlled by the parameter epsilon ($ϵ$). A smaller $ϵ$ means stronger privacy but more noise in the output; a larger $ϵ$ means weaker privacy but a more accurate result. So there are trade-offs to consider.


[Figures: Dp 1, Dp 2]
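
As a small illustration of this trade-off, the sketch below applies the classical Gaussian-mechanism calibration ($σ = Δ \sqrt{2 \ln(1.25/δ)} / ϵ$, Dwork & Roth) to a single statistic. The statistic value and the sensitivity of 1.0 are made-up numbers for demonstration only.

    import numpy as np

    rng = np.random.default_rng(0)

    def gaussian_mechanism(value, sensitivity, epsilon, delta):
        # Classical (eps, delta)-DP Gaussian mechanism noise scale (valid for eps <= 1).
        sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
        return value + rng.normal(0.0, sigma)

    true_mean = 0.42                       # hypothetical population statistic
    for eps in (0.1, 0.5, 1.0):
        noisy = gaussian_mechanism(true_mean, sensitivity=1.0, epsilon=eps, delta=1e-5)
        print(f"eps={eps}: noisy answer = {noisy:.3f}")   # smaller eps -> more noise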

We built upon these two notions and proposed a new algorithm that scales the standard deviation ($σ$) down directly in proportion to the local mini-batch size of each client, and indirectly by the number of clients in the system and the dataset size ($σ$ is used in the Gaussian mechanism $M$ to control the noise). Roughly speaking, this gives the algorithm the advantage of injecting noise directly into the gradients of the clients' local models (as part of their training procedure), which has two immediate effects (a simplified sketch of this noise injection follows the list below):

  1. It provides strong privacy guarantees at the data-point level.
  2. It allows us to re-account for this noise to provide an additional layer of privacy guarantees at the client level, without explicitly adding more noise to the system.
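
The sketch below shows the general flavor of this kind of per-sample noise injection in PyTorch: clip each sample's gradient to a threshold C, add Gaussian noise scaled by σ·C, then take an SGD step. It is a generic DP-SGD-style illustration, not the thesis's Algorithm 4/5, and it does not reproduce the σ scaling described above.

    import torch

    def noisy_local_step(model, loss_fn, batch_x, batch_y, lr, clip_c, sigma):
        """Illustrative DP-SGD-style step: clip per-sample gradients, add noise, update."""
        summed = [torch.zeros_like(p) for p in model.parameters()]
        for x, y in zip(batch_x, batch_y):               # one sample at a time (simple but slow)
            model.zero_grad()
            loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
            norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))
            scale = min(1.0, clip_c / (norm.item() + 1e-12))
            for acc, p in zip(summed, model.parameters()):
                acc += scale * p.grad                    # accumulate the clipped gradient
        with torch.no_grad():
            for acc, p in zip(summed, model.parameters()):
                noise = torch.normal(0.0, sigma * clip_c, size=acc.shape)
                p -= lr * (acc + noise) / len(batch_x)   # noisy averaged gradient step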

➡️ For privacy calculations we assume:

  • The same fraction of clients participates in each communication round.
  • All clients have the same amount of data and the same batch size.
  • All clients perform the same number of updates/steps per communication round.

For $ϵ, δ$ budgeting we used the generalized version of the Moments Accountant (Abadi et al.), which works under the notion of Rényi Differential Privacy (Mironov, 2017), namely the RDP Accountant.
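
As a toy version of such accounting, the sketch below composes the plain (non-subsampled) Gaussian mechanism over a number of steps and converts RDP to $(ϵ, δ)$ with Mironov's bound $ϵ = \mathrm{RDP}(α) + \log(1/δ)/(α-1)$. The repository's offline_accounting scripts handle the subsampled case, which yields much smaller $ϵ$ for the same $σ$; the numbers here are only illustrative.

    import numpy as np

    def epsilon_from_rdp(sigma, steps, delta, orders=np.arange(2, 256)):
        # RDP of the Gaussian mechanism at order alpha is alpha / (2 sigma^2) per step;
        # composition over `steps` is additive, then convert to (eps, delta)-DP.
        rdp = steps * orders / (2.0 * sigma ** 2)
        eps = rdp + np.log(1.0 / delta) / (orders - 1.0)
        best = eps.argmin()
        return eps[best], orders[best]

    eps, alpha = epsilon_from_rdp(sigma=4.0, steps=100, delta=1e-5)
    print(f"eps ~ {eps:.2f} at alpha = {alpha}")         # no subsampling amplification here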

This repository implements Algorithm 4 and Algorithm 5, which are the main findings of the thesis.

General Requirements

❗ Legacy code information: this code was written years ago; this is a slightly refactored version of it.

Tested with the following setup:

  • Python == 3.11.3
  • PyTorch == 2.1.2
  • Torchvision == 0.16.0
  • SciPy == 1.11.1
  • NumPy == 1.25.1
  • Matplotlib == 3.7.2

Usage

  1. Clone the repository:
    git clone https://github.com/ntat/Differentially_Private_Federated_Learning.git
  2. Install dependencies via pip:
    pip install -r requirements.txt
  3. Set config.ini according to your training and privacy requirements. Before training, it's advisable to look into the offline_accounting folder to precompute your privacy budget for the settings in the config and confirm you stay within acceptable privacy levels over the course of training.
📁 Example config.ini:
[Hyperparams]
Clients = 100
Shards = 200
comm_rounds = 635
local_epochs = 1
learning_rate = 0.02
tr_batch_size = 100

[Privacy]
C = 0.10
sigma = 4.0
target_ep = 1.31
target_ep_client = 8.0
clipThreshold = 4

[Data]
iid = True
  4. Run the main.py training script with python:
    python main.py -c config.ini
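
For reference, the values above can be read with Python's standard configparser (a minimal sketch; the repository's own config handling may differ):

    import configparser

    config = configparser.ConfigParser()
    config.read("config.ini")

    clients     = config.getint("Hyperparams", "Clients")
    comm_rounds = config.getint("Hyperparams", "comm_rounds")
    sigma       = config.getfloat("Privacy", "sigma")
    target_ep   = config.getfloat("Privacy", "target_ep")
    iid         = config.getboolean("Data", "iid")

    print(f"{clients} clients, {comm_rounds} rounds, sigma={sigma}, "
          f"target eps={target_ep}, iid={iid}")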

Selected results

  • 10,000 clients: [results plot]

Discussion

Performing differential privacy for machine learning applications at the sample level is quite computationally inefficient, mainly because of how auto-differentiation tools are structured. In our approach we use the 'trick' described in (Goodfellow / Technical Report) for accessing the individual gradients. Although this is limited to linear layers, it's still relatively efficient. One way to work around this for other types of layers (e.g. LSTMs, ConvNets, etc.) is microbatching (i.e. going through the samples in the batch one by one) and then doing the backward pass and clipping manually for each microbatch (very inefficient). Libraries like Opacus from Meta provide efficient tools for machine learning with differential privacy.
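
For intuition, here is a minimal sketch of that per-example gradient 'trick' for a single linear layer: cache the layer input on the forward pass, grab the gradient with respect to the layer output on the backward pass, and take their per-example outer product. The PerSampleLinear class and the sum-reduction loss are assumptions for illustration, not the repository's code.

    import torch
    import torch.nn as nn

    class PerSampleLinear(nn.Linear):
        # Caches the layer input so per-example weight gradients can be reconstructed.
        def forward(self, x):
            self._inputs = x.detach()
            return super().forward(x)

    layer = PerSampleLinear(10, 3)
    x = torch.randn(8, 10)                       # batch of 8 samples
    out = layer(x)
    out.retain_grad()                            # keep grad w.r.t. the layer output
    loss = out.pow(2).sum()                      # sum reduction keeps per-example grads unscaled
    loss.backward()

    # Per-example weight gradients: outer product of output-grad and cached input, shape (8, 3, 10).
    per_sample_grads = torch.einsum("bo,bi->boi", out.grad, layer._inputs)

    # Sanity check: summing over the batch recovers the usual aggregated gradient.
    assert torch.allclose(per_sample_grads.sum(0), layer.weight.grad, atol=1e-5)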
