What is Differential Privacy?
Differential Privacy is a mathematical definition of privacy that provides provable guarantees for individuals in a dataset. It works by adding carefully calibrated random noise to query results or data, so that the output reveals almost nothing about whether any single individual's record is present.
Core Concept
Definition: A randomized mechanism M satisfies ε-differential privacy if, for any two databases D and D' differing in a single record, and for every set of possible outputs S:
P[M(D) ∈ S] ≤ e^ε × P[M(D') ∈ S]
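As a concrete check of this inequality, consider randomized response, a classic mechanism that satisfies ε-DP with ε = ln 3: flip a fair coin, answer truthfully on heads, answer uniformly at random on tails. A minimal sketch (the function name is illustrative, not from any particular library):

```python
import math
import random

def randomized_response(true_answer: bool) -> bool:
    """Report the truth with probability 3/4 and a random answer otherwise.
    Each individual's report satisfies eps-DP with eps = ln(3)."""
    if random.random() < 0.5:
        return true_answer          # first coin heads: answer truthfully
    return random.random() < 0.5    # first coin tails: answer at random

# Check the DP inequality directly:
# P[report True | truth True] = 3/4, P[report True | truth False] = 1/4,
# and 3/4 <= e^ln(3) * 1/4 holds (with equality).
eps = math.log(3)
assert 0.75 <= math.exp(eps) * 0.25
```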
Key Parameters
Epsilon (ε)
- Privacy loss parameter: bounds how much any one record can shift the output distribution
- Lower ε = stronger privacy
- Trades off against accuracy: stronger privacy requires more noise (see the sketch after this section)
Delta (δ)
- Probability that the pure ε guarantee is allowed to fail
- Usually very small (e.g., less than 1/n for a dataset of n records)
- Gives the relaxed (ε, δ)-DP definition used by the Gaussian mechanism below
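To see the ε/accuracy trade-off numerically: the Laplace mechanism (introduced below) adds noise with scale b = Δf/ε, so halving ε doubles the noise. A small sketch, assuming a counting query with sensitivity Δf = 1:

```python
import math

# Laplace noise scale b = sensitivity / epsilon; its std dev is sqrt(2) * b.
# A counting query ("how many rows match?") has sensitivity 1:
# adding or removing one record changes the count by at most 1.
sensitivity = 1.0
for eps in (2.0, 1.0, 0.5, 0.1):
    b = sensitivity / eps
    print(f"eps={eps}: noise scale b={b:5.1f}, noise std dev={math.sqrt(2) * b:6.2f}")
```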
Noise Mechanisms
Laplace Mechanism: for numeric queries; adds noise drawn from a Laplace distribution with scale Δf/ε, where Δf is the query's sensitivity (the maximum change one record can cause).
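A minimal sketch of the Laplace mechanism, assuming a counting query with sensitivity 1 (the function name is illustrative, not from any particular library):

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release true_value under eps-DP by adding Laplace(0, sensitivity/epsilon) noise."""
    scale = sensitivity / epsilon
    return true_value + np.random.laplace(loc=0.0, scale=scale)

# Example: a count of 1000 matching records, released with eps = 0.5.
noisy_count = laplace_mechanism(1000.0, sensitivity=1.0, epsilon=0.5)
```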
Gaussian Mechanism: for (ε, δ)-DP; adds Gaussian noise with standard deviation calibrated to ε, δ, and the query's sensitivity.
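A sketch using the classic calibration σ = sqrt(2 ln(1.25/δ)) · Δf/ε, which is valid for ε < 1; the function name and example parameters are assumptions for illustration:

```python
import math
import numpy as np

def gaussian_mechanism(true_value: float, sensitivity: float,
                       epsilon: float, delta: float) -> float:
    """Release true_value under (eps, delta)-DP with the classic calibration
    sigma = sqrt(2 * ln(1.25/delta)) * sensitivity / epsilon (valid for eps < 1)."""
    sigma = math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon
    return true_value + np.random.normal(loc=0.0, scale=sigma)

# Example: a bounded mean with sensitivity 0.01, released with eps = 0.5, delta = 1e-6.
noisy_mean = gaussian_mechanism(42.0, sensitivity=0.01, epsilon=0.5, delta=1e-6)
```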
Exponential Mechanism: for categorical (non-numeric) outputs; selects a candidate with probability proportional to exp(ε · score / (2Δu)), favoring higher-scoring outputs.
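A sketch of the exponential mechanism for privately selecting the most popular item; the vote counts and function names are hypothetical, and a vote count has sensitivity 1 since one person changes any count by at most 1:

```python
import math
import random

def exponential_mechanism(candidates, score, sensitivity: float, epsilon: float):
    """Pick a candidate with probability proportional to
    exp(epsilon * score(candidate) / (2 * sensitivity))."""
    weights = [math.exp(epsilon * score(c) / (2 * sensitivity)) for c in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]

# Example: privately choose the most-voted color.
votes = {"red": 30, "blue": 25, "green": 10}
winner = exponential_mechanism(list(votes), lambda c: votes[c],
                               sensitivity=1.0, epsilon=0.5)
```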
Applications
- Census data release
- Machine learning
- Location data
- Health analytics
- Search queries
Benefits
- Provable guarantees
- Composition theorems: cumulative privacy loss across multiple queries can be quantified (see the sketch after this list)
- Flexible implementation
- Regulatory compliance
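Basic sequential composition is the simplest of these theorems: running k mechanisms with budgets ε₁, …, εₖ on the same data satisfies (ε₁ + … + εₖ)-DP. A minimal budget-accountant sketch (the class is illustrative, not from any particular library):

```python
class PrivacyBudget:
    """Track cumulative privacy loss under basic sequential composition:
    the epsilons of mechanisms run on the same data simply add up."""
    def __init__(self, total_epsilon: float):
        self.total = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon: float) -> None:
        if self.spent + epsilon > self.total:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon

budget = PrivacyBudget(total_epsilon=1.0)
budget.charge(0.5)   # first query
budget.charge(0.5)   # second query; a third 0.5 charge would raise
```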
Challenges
- Accuracy trade-offs: noise can overwhelm small subgroups and rare values
- Parameter selection: choosing ε and δ is difficult and domain-specific
- Complex to implement correctly (e.g., naive floating-point noise sampling can leak information)
- Computational cost
Implementations
- Google's differential-privacy library
- Apple (on-device analytics and telemetry)
- US Census Bureau (disclosure avoidance for the 2020 Census)
- OpenDP (open-source library and community)