Differential Privacy - A Balancing Act
Doctoral thesis, 2021

Data privacy is an ever-important aspect of data analysis. Historically, a plethora of privacy techniques have been introduced to protect data, but few have stood the test of time. From investigating the overlap between big data research and security and privacy research, I have found that differential privacy presents itself as a promising defender of data privacy.

Differential privacy is a rigorous, mathematical notion of privacy. Nevertheless, privacy comes at a cost. In order to achieve differential privacy, we need to introduce some form of inaccuracy (i.e. error) to our analyses. Hence, practitioners need to engage in a balancing act between accuracy and privacy when adopting differential privacy. As a consequence, understanding this accuracy/privacy trade-off is vital to being able to use differential privacy in real data analyses.

In this thesis, I aim to bridge the gap between differential privacy in theory and differential privacy in practice. Most notably, I aim to convey a better understanding of the accuracy/privacy trade-off, by 1) implementing tools to tweak accuracy/privacy in a real use case, 2) presenting a methodology for empirically predicting error, and 3) systematizing and analyzing known accuracy improvement techniques for differentially private algorithms. Additionally, I put differential privacy into context by investigating how it can be applied in the automotive domain. Using the automotive domain as an example, I introduce the main challenges that constitute the balancing act, and provide advice for moving forward.

privacy

big data

data privacy

vehicular data

accuracy

accuracy/privacy trade-off

utility

differential privacy

CSE EDIT 8103
Opponent: Associate Professor Michael Hay, Department of Computer Science, Colgate University, United States

Author

Boel Nelson

Chalmers, Computer Science and Engineering (Chalmers), Information Security

Security and Privacy for Big Data: A Systematic Literature Review

2016 IEEE International Conference on Big Data (Big Data) (2016), p. 3693-3702

Paper in proceeding

Introducing Differential Privacy to the Automotive Domain: Opportunities and Challenges

Vehicular Technology Conference (VTC-Fall), 2017 IEEE 86th, Vol. 2017 (2018), p. 1-7

Paper in proceeding

Joint Subjective and Objective Data Capture and Analytics for Automotive Applications

2017 IEEE 86th Vehicular Technology Conference (VTC-Fall) (2018)

Paper in proceeding

Nelson, B. Randori: Local Differential Privacy for All

Nelson, B. Efficient Error Prediction for Differentially Private Algorithms

SoK: Chasing Accuracy and Privacy, and Catching Both in Differentially Private Histogram Publication

Transactions on Data Privacy, Vol. 13 (2020), p. 201-245

Journal article

How much privacy is enough?

In an age of data, privacy emerges as an important factor in data analysis. Still, we can ask ourselves: what is adequate privacy? How much privacy do we need? And how can we achieve a given level of privacy? In this thesis, I investigate a specific definition of privacy that essentially specifies how much information can be leaked from data. This definition of privacy is called differential privacy.

In essence, differential privacy constrains how accurate the data we release can be. Suppose that we want to calculate the average salary in Sweden, and assume that the true average is 28 000 SEK. Instead of reporting the true average, we may report that the average salary is 28 367 SEK. Notice that we do not disclose the true answer, yet the answer we give is close enough to the true answer to still be useful. That is, we can achieve privacy by reporting almost the real answer.
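The salary example can be sketched in code. The following is a minimal illustration, assuming the Laplace mechanism (a standard way to achieve differential privacy for numeric queries, not necessarily the mechanism used in the thesis), salaries clipped to a publicly known range, and a publicly known number of records. The function names `laplace_noise` and `private_mean`, the salary bounds, and the value of epsilon are hypothetical choices made for illustration only.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng=None):
    """Return a noisy mean of `values` (illustrative sketch, not hardened).

    Assumptions: every value is clipped to [lower, upper], the number of
    records is public, and neighboring datasets differ in one record's value.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not values or lower >= upper:
        raise ValueError("need at least one value and lower < upper")
    rng = rng or random.Random()

    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    true_mean = sum(clipped) / n

    # Changing one record moves the clipped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: 1 000 people who all earn 28 000 SEK.
    salaries = [28_000.0] * 1_000
    print(private_mean(salaries, lower=0.0, upper=100_000.0, epsilon=0.5))
```

Each run reports a slightly different value near the true average, which mirrors the idea of reporting almost the real answer. A production implementation would also need to address floating-point subtleties in noise sampling, which this sketch ignores.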

Consequently, differential privacy in practice boils down to a balancing act between accuracy and privacy. We need to introduce inaccuracy to preserve privacy, but if we make results too inaccurate they become useless. Hence, to facilitate the use of differential privacy in practice, we need to understand the balancing act at its core. This thesis is dedicated to exploring the balancing act in different settings, for example in the automotive domain. To further aid in understanding the accuracy/privacy trade-off, I provide tools and methods for reasoning about it.
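The balancing act can be made concrete with a small calculation. As a hedged sketch, assume a mean query answered with Laplace noise over values bounded in a known range (an illustrative setting, not the thesis's own methodology). For Laplace noise, the expected absolute error equals the noise scale, so the error grows as the privacy parameter epsilon shrinks. The function name `expected_abs_error` and all numbers below are hypothetical.

```python
def expected_abs_error(lower: float, upper: float, n: int, epsilon: float) -> float:
    """Expected absolute error of a Laplace-noised mean over n bounded values.

    The Laplace scale is sensitivity / epsilon, with sensitivity
    (upper - lower) / n, and E|Laplace(0, b)| = b.
    """
    if epsilon <= 0 or n <= 0 or lower >= upper:
        raise ValueError("need epsilon > 0, n > 0 and lower < upper")
    return (upper - lower) / n / epsilon


if __name__ == "__main__":
    # Hypothetical setting: 1 000 salaries bounded in [0, 100 000] SEK.
    for epsilon in (0.01, 0.1, 1.0, 10.0):
        err = expected_abs_error(0.0, 100_000.0, 1_000, epsilon)
        print(f"epsilon={epsilon}: expected absolute error of about {err:.1f} SEK")
```

Smaller values of epsilon give stronger privacy but larger expected error, while more records reduce the error for a fixed epsilon.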

BAuD II: Storskalig insamling och analys av data för kunskapsdriven produktutveckling (Large-scale collection and analysis of data for knowledge-driven product development)

VINNOVA (2014-03935), 2015-01-01 -- 2017-12-31.

WebSec: Securing Web-driven Systems

Swedish Foundation for Strategic Research (SSF) (RIT17-0011), 2018-03-01 -- 2023-02-28.

Areas of Advance

Information and Communication Technology

Transport

Subject Categories

Computer and Information Science

Computer Science

ISBN

978-91-7905-487-8

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 4954

Publisher

Chalmers University of Technology

Online

More information

Latest update

5/25/2021