A Programming Language for Data Privacy with Accuracy Estimations
Journal article, 2021
Differential privacy offers a formal framework for reasoning about the privacy and accuracy of computations on private data. It also offers a rich set of building blocks for constructing private data analyses. When carefully calibrated, these analyses simultaneously guarantee the privacy of the individuals contributing their data, and the accuracy of the data analysis results, inferring useful properties about the population. The compositional nature of differential privacy has motivated the design and implementation of several programming languages to ease the implementation of differentially private analyses. Even though these programming languages provide support for reasoning about privacy, most of them disregard reasoning about the accuracy of data analyses. To overcome this limitation, we present DPella, a programming framework providing data analysts with support for reasoning about privacy, accuracy, and their trade-offs. The distinguishing feature of DPella is a novel component that statically tracks the accuracy of different data analyses. To provide tight accuracy estimations, this component leverages taint analysis for automatically inferring statistical independence of the different noise quantities added for guaranteeing privacy. We evaluate our approach by implementing several classical queries from the literature and showing how data analysts can calibrate the privacy parameters to meet the accuracy requirements, and vice versa.