https://drive.google.com/file/d/1PUYqaUE64Obi3mcRvlwEkQ7RgDWV0twP/view?usp=sharing
Mitigating Execution Time Leakage in Differential Privacy
This repository contains code and materials for implementing privacy-preserving data analysis techniques using differential privacy in a database context. The project focuses on reducing execution time leakage in private SQL query processing by applying padding techniques, allowing tunable trade-offs between privacy and utility.
The project integrates OpenDP with PostgreSQL to enable privacy-safe data queries on cloud databases. It demonstrates methods to:
- Mask query execution time to limit timing side-channel attacks.
- Implement padding techniques that balance privacy guarantees and query performance.
- Visualize and evaluate privacy-utility trade-offs.
- Conventional data privacy methods have significant limitations and assumptions.
- Differential privacy strengthens privacy guarantees by adding controlled noise that protects individuals.
- However, differentially private systems remain vulnerable to execution time leakage attacks, which this project aims to mitigate.
- Manual Time Delay: Introduces fixed delays but is largely ineffective in practice.
- Data Padding: Adds dummy data to mask query sizes and execution times.
- Full Padding: Offers strong privacy by fully hiding execution time patterns but at the cost of reduced utility and slower runtimes.
- Partial Padding (Differential Privacy Padding): Provides a balanced trade-off, improving efficiency while still providing a user-chosen level of privacy.
- Padding strategies to reduce timing leakage in differential privacy.
- Integration of PostgreSQL with Python for executing and managing private queries.
- Visualization of query performance and padding effectiveness using R.
- Reproducible Jupyter notebooks and scripts for experimentation.
- Python 3.x
- PostgreSQL
- R (for visualization)
- Python libraries: opendp, psycopg2, pandas, matplotlib
- R packages: ggplot2
- Set up PostgreSQL and create a test database.
- Install Python dependencies:
  pip install opendp psycopg2 pandas matplotlib
- Configure database connection parameters in the scripts.
- Run Python scripts to execute private queries with padding.
- Use the R scripts/notebooks to visualize performance and privacy trade-offs.
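To illustrate what running and timing a query might look like, here is a minimal sketch, not the repository's actual runner. The helper names `timed_query` and `slowdown` are hypothetical. The sketch assumes only a DB-API 2.0 connection, which is what `psycopg2.connect` returns, and it measures wall-clock time around a single query.

```python
import time


def timed_query(conn, sql):
    """Run `sql` on a DB-API 2.0 connection and return (rows, elapsed_seconds).

    `conn` is assumed to be an open connection, e.g. one created with
    psycopg2.connect(...) using your own connection parameters.
    """
    cur = conn.cursor()
    start = time.perf_counter()
    cur.execute(sql)
    rows = cur.fetchall()  # include fetch time: the client observes it too
    elapsed = time.perf_counter() - start
    cur.close()
    return rows, elapsed


def slowdown(padded_seconds, baseline_seconds):
    """Slowdown factor of a padded query relative to the unpadded baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return padded_seconds / baseline_seconds
```

Timing the same query with and without padding and passing both durations to `slowdown` gives a factor comparable in spirit to the "×" figures reported in the results. A single measurement is noisy, so repeating each query and taking a median would be a reasonable refinement.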
- padding_techniques.py: Implements padding methods to mask execution time.
- query_runner.py: Executes private SQL queries using OpenDP and padding.
- visualize_performance.R: R script to generate performance plots.
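The difference between full and partial (differentially private) padding can be sketched as follows. This is an illustrative assumption about how such padding is commonly built, not the contents of padding_techniques.py. The function names, the `shift` parameter, and the choice of Laplace noise are all hypothetical.

```python
import math
import random


def full_padding_size(true_size, max_size):
    """Full padding: always pad up to a public upper bound on the result size."""
    if true_size > max_size:
        raise ValueError("true_size exceeds the public upper bound max_size")
    return max_size


def sample_laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_padding_size(true_size, epsilon, shift, max_size, rng):
    """Partial padding: add a noisy, non-negative number of dummy rows.

    Assumptions (not taken from the project): the noise is Laplace with
    scale 1/epsilon, centred on `shift` so that the dummy count is rarely
    clamped at zero. Clamping weakens the guarantee to an approximate one,
    and floating-point Laplace sampling is not hardened, so treat this as a
    sketch rather than a vetted mechanism.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    noise = sample_laplace(1.0 / epsilon, rng)
    dummies = max(0, math.ceil(shift + noise))
    # Never exceed the public bound; never drop below the true size.
    return min(true_size + dummies, max_size)


rng = random.Random(0)  # seeded only to make the example repeatable
print(full_padding_size(120, 10_000))
print(dp_padding_size(120, epsilon=0.5, shift=50, max_size=10_000, rng=rng))
```

Full padding makes every result the same size, so it is the slowest option. The noisy variant usually adds far fewer dummy rows, and a smaller `epsilon` means more noise and therefore more padding on average. That is the privacy-utility dial described above.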
The padding technique reduced query slowdown from 93–1054× to 54–471× while maintaining tunable privacy-utility trade-offs, demonstrating significant performance improvement in privacy-preserving queries.