Differentially private synthetic data generation 

Yiyun He (UC Irvine)
Monday, April 1, 2024 - 2:30pm to 3:20pm
CLK 219
We present a highly effective algorithmic approach, PMM, for generating \epsilon-differentially private synthetic data in a bounded metric space with near-optimal utility guarantees under the 1-Wasserstein distance. In particular, for a dataset in the hypercube [0,1]^d, our algorithm generates a synthetic dataset such that the expected 1-Wasserstein distance between the empirical measures of the true and synthetic datasets is O(n^{-1/d}) for d>1. This accuracy guarantee is optimal up to a constant factor for d>1, and up to a logarithmic factor for d=1. PMM is also time-efficient, with a running time of O(\epsilon d n). Building on the PMM algorithm, further variants of the synthetic data publishing problem can be studied under different settings.
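
To make the utility measure in the abstract concrete, here is a minimal sketch of the setting for d=1. It is not the PMM algorithm: it swaps in a plain Laplace-noised histogram as a baseline \epsilon-differentially private generator, then computes the 1-Wasserstein distance between the empirical measures of the true and synthetic datasets. The function names, the bin count, and the Beta-distributed toy data are all assumptions made for illustration.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. Exp(1/scale) variables is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_histogram_synth(data, epsilon, bins, m, rng):
    """Baseline generator (NOT PMM): noisy histogram on [0,1], then resample."""
    counts = [0] * bins
    for x in data:
        counts[min(int(x * bins), bins - 1)] += 1
    # Replacing one record changes at most two counts by 1 each, so the
    # L1 sensitivity is 2 and Laplace noise of scale 2/epsilon suffices.
    noisy = [max(c + laplace(2.0 / epsilon, rng), 0.0) for c in counts]
    if sum(noisy) == 0:
        noisy = [1.0] * bins  # degenerate case: fall back to uniform weights
    # Clamping and sampling are post-processing, so privacy is preserved.
    idx = rng.choices(range(bins), weights=noisy, k=m)
    return [(i + rng.random()) / bins for i in idx]


def wasserstein1_1d(xs, ys):
    """W1 between empirical measures of two equal-size samples on the line."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


rng = random.Random(0)
data = [rng.betavariate(2, 5) for _ in range(2000)]  # toy data in [0,1]
# The dataset size n is treated as public here (an assumption of this sketch).
synth = private_histogram_synth(data, epsilon=1.0, bins=32, m=len(data), rng=rng)
w1 = wasserstein1_1d(data, synth)
print(f"W1(true, synthetic) = {w1:.4f}")
```

In one dimension, the 1-Wasserstein distance between two equal-size empirical measures reduces to the mean absolute difference of the sorted samples, which is why no optimal-transport solver is needed here. For d>1 this shortcut no longer applies.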

