Differential Privacy - A Balancing Act
Doctoral thesis, 2021
Data privacy is an ever-important aspect of data analysis. Historically, a plethora of privacy techniques have been introduced to protect data, but few have stood the test of time. From investigating the overlap between big data research and security and privacy research, I have found that differential privacy presents itself as a promising defender of data privacy.
Differential privacy is a rigorous, mathematical notion of privacy. Nevertheless, privacy comes at a cost. In order to achieve differential privacy, we need to introduce some form of inaccuracy (i.e. error) to our analyses. Hence, practitioners need to engage in a balancing act between accuracy and privacy when adopting differential privacy. As a consequence, understanding this accuracy/privacy trade-off is vital to being able to use differential privacy in real data analyses.
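The abstract does not spell out the definition it refers to. For orientation, the standard formulation of pure ε-differential privacy is usually written as below; the symbols (the mechanism M, the neighboring datasets D and D', the output set S) are notation assumed here, not taken from the text above.

```latex
% A randomized mechanism M is \varepsilon-differentially private if,
% for all neighboring datasets D and D' (differing in one individual's record)
% and for every set S of possible outputs,
\[
  \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S].
\]
% A smaller \varepsilon gives a stronger privacy guarantee and generally
% requires more noise, which is the accuracy/privacy trade-off.
```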
In this thesis, I aim to bridge the gap between differential privacy in theory and differential privacy in practice. Most notably, I aim to convey a better understanding of the accuracy/privacy trade-off by 1) implementing tools to tweak accuracy/privacy in a real use case, 2) presenting a methodology for empirically predicting error, and 3) systematizing and analyzing known accuracy improvement techniques for differentially private algorithms. Additionally, I put differential privacy into context by investigating how it can be applied in the automotive domain. Using the automotive domain as an example, I introduce the main challenges that constitute the balancing act, and provide advice for moving forward.
vehicular data
utility
accuracy/privacy trade-off
privacy
accuracy
differential privacy
data privacy
big data
Author
Boel Nelson
Chalmers, Computer Science and Engineering (Chalmers), Information Security
Security and Privacy for Big Data: A Systematic Literature Review
2016 IEEE International Conference on Big Data (Big Data), (2016), p. 3693-3702
Paper in proceeding
Introducing Differential Privacy to the Automotive Domain: Opportunities and Challenges
IEEE Vehicular Technology Conference, Vol. 2017-September (2017), p. 1-7
Paper in proceeding
Joint Subjective and Objective Data Capture and Analytics for Automotive Applications
IEEE Vehicular Technology Conference, (2017)
Paper in proceeding
Nelson, B. Randori: Local Differential Privacy for All
Efficient Error Prediction for Differentially Private Algorithms
ACM International Conference Proceeding Series, (2021)
Paper in proceeding
SoK: Chasing Accuracy and Privacy, and Catching Both in Differentially Private Histogram Publication
Transactions on Data Privacy, Vol. 13 (2020), p. 201-245
Journal article
How much privacy is enough?
In an age of data, privacy emerges as an important factor in data analysis. Still, we can ask ourselves: what is adequate privacy? How much privacy do we need? And how can we achieve a given level of privacy? In this thesis I investigate a specific definition of privacy, that essentially specifies how much information can be leaked from data. This definition of privacy is called differential privacy.
Basically, differential privacy restricts how accurate the data we release can be. Suppose that we want to calculate the average salary in Sweden. Let's assume that the average salary is 28 000 SEK. Then, instead of reporting the true average, we may report that the average salary is 28 367 SEK. Notice how we don't disclose the true answer, but we are able to give an answer that is close enough to the true answer to still be useful. That is, we can achieve privacy by reporting almost the real answer.
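The salary example can be sketched with the Laplace mechanism, one common way to obtain differential privacy. This is a minimal illustration and not the tooling developed in the thesis: the function names, the salary bounds, the synthetic data, and the choice of ε values are all assumptions made for the example. It also assumes that the number of records is public and that neighboring datasets differ by replacing one record.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Noisy mean via the Laplace mechanism (hypothetical helper).

    Assumes len(values) is public and neighbors differ by replacing one record.
    """
    # Clip each value so that one record can shift the mean by a bounded amount.
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    sensitivity = (upper - lower) / n   # max change in the mean from one record
    scale = sensitivity / epsilon       # smaller epsilon -> more noise
    return sum(clipped) / n + laplace_noise(scale, rng)


# Synthetic salaries in SEK; both the data and the bounds are made up.
rng = random.Random(0)
salaries = [rng.gauss(28000, 5000) for _ in range(1000)]

for epsilon in (0.1, 1.0, 10.0):
    noisy = private_mean(salaries, 0.0, 100000.0, epsilon, rng)
    print(f"epsilon={epsilon}: reported average = {noisy:.0f} SEK")
```

Running the loop with different values of ε shows the trade-off directly: the noise scale is inversely proportional to ε, so stronger privacy (small ε) tends to move the reported average further from the true one.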
Consequently, differential privacy in practice boils down to a balancing act between accuracy and privacy. We need to introduce inaccuracy to preserve privacy, but if we make results too inaccurate they become useless. Hence, to facilitate the use of differential privacy in practice, we need to understand the balancing act at differential privacy's core. This thesis is dedicated to exploring the balancing act in different settings, for example in the automotive domain. To further aid in understanding the privacy/accuracy trade-off, I provide tools and methods for understanding the balancing act better.
BAuD II: Storskalig insamling och analys av data för kunskapsdriven produktutveckling
VINNOVA (2014-03935), 2015-01-01 -- 2017-12-31.
WebSec: Securing Web-driven Systems
Swedish Foundation for Strategic Research (SSF) (RIT17-0011), 2018-03-01 -- 2023-02-28.
Areas of Advance
Information and Communication Technology
Transport
Subject Categories
Computer and Information Science
Computer Science
ISBN
978-91-7905-487-8
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 4954
Publisher
Chalmers
CSE EDIT 8103
Opponent: Associate Professor Michael Hay, Department of Computer Science, Colgate University, United States