Differential Privacy - A Balancing Act
Doctoral thesis, 2021

Data privacy is an increasingly important aspect of data analysis. Historically, a plethora of privacy techniques have been introduced to protect data, but few have stood the test of time. By investigating the overlap between big data research and security and privacy research, I have found that differential privacy presents itself as a promising defender of data privacy.

Differential privacy is a rigorous, mathematical notion of privacy. Nevertheless, privacy comes at a cost. In order to achieve differential privacy, we need to introduce some form of inaccuracy (i.e., error) into our analyses. Hence, practitioners need to engage in a balancing act between accuracy and privacy when adopting differential privacy. As a consequence, understanding this accuracy/privacy trade-off is vital to being able to use differential privacy in real data analyses.
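
For reference, the rigorous notion referred to here is the standard ε-differential privacy definition due to Dwork et al.; the formulation below is the commonly used one and is not quoted from the thesis abstract itself.

```latex
% A randomized mechanism M satisfies \varepsilon-differential privacy if, for
% every pair of datasets D and D' differing in one individual's data, and for
% every set of outputs S:
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S]
```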

In this thesis, I aim to bridge the gap between differential privacy in theory and differential privacy in practice. Most notably, I aim to convey a better understanding of the accuracy/privacy trade-off by 1) implementing tools to tweak accuracy/privacy in a real use case, 2) presenting a methodology for empirically predicting error, and 3) systematizing and analyzing known accuracy improvement techniques for differentially private algorithms. Additionally, I put differential privacy into context by investigating how it can be applied in the automotive domain. Using the automotive domain as an example, I introduce the main challenges that constitute the balancing act, and provide advice for moving forward.

vehicular data

utility

accuracy/privacy trade-off

privacy

accuracy

differential privacy

data privacy

big data

Author

Boel Nelson

Chalmers, Computer Science and Engineering, Information Security

Security and Privacy for Big Data: A Systematic Literature Review

2016 IEEE International Conference on Big Data (Big Data), 2016, pp. 3693-3702

Paper in proceedings

Introducing Differential Privacy to the Automotive Domain: Opportunities and Challenges

IEEE Vehicular Technology Conference, Vol. 2017-September (2017), pp. 1-7

Paper in proceedings

Joint Subjective and Objective Data Capture and Analytics for Automotive Applications

IEEE Vehicular Technology Conference (2017)

Paper in proceedings

Nelson, B. Randori: Local Differential Privacy for All

Efficient Error Prediction for Differentially Private Algorithms

ACM International Conference Proceeding Series (2021)

Paper in proceedings

SoK: Chasing Accuracy and Privacy, and Catching Both in Differentially Private Histogram Publication

Transactions on Data Privacy, Vol. 13 (2020), pp. 201-245

Article in scientific journal

How much privacy is enough?

In an age of data, privacy emerges as an important factor in data analysis. Still, we can ask ourselves: what is adequate privacy? How much privacy do we need? And how can we achieve a given level of privacy? In this thesis I investigate a specific definition of privacy that essentially specifies how much information can be leaked from data. This definition of privacy is called differential privacy.

In essence, differential privacy restricts how accurately we are allowed to release data. Suppose that we want to calculate the average salary in Sweden. Let's assume that the average salary is 28 000 SEK. Then, instead of reporting the true average, we may report that the average salary is 28 367 SEK. Notice how we don't disclose the true answer, but we are able to give an answer that is close enough to the true answer to still be useful. That is, we can achieve privacy by reporting almost the real answer.
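
To make the salary example concrete, here is a minimal sketch of how such an almost-true answer could be produced with the Laplace mechanism, one of the standard building blocks of differential privacy. The salary bounds, the privacy parameter epsilon, and the generated data are illustrative assumptions and are not taken from the thesis.

```python
import numpy as np

def private_average(salaries, lower, upper, epsilon):
    """Return a differentially private estimate of the average salary.

    Each salary is clamped to [lower, upper], so one person's data can
    change the average by at most (upper - lower) / n. This bound (the
    sensitivity) determines how much Laplace noise is needed for a
    given privacy level epsilon.
    """
    clamped = np.clip(salaries, lower, upper)
    n = len(clamped)
    sensitivity = (upper - lower) / n          # sensitivity of the mean
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return clamped.mean() + noise

# Illustrative data: 1 000 salaries centred around 28 000 SEK (not real data).
rng = np.random.default_rng(0)
salaries = rng.normal(28_000, 4_000, size=1_000)

# Reports something close to, but never exactly, the true average.
print(private_average(salaries, lower=0, upper=100_000, epsilon=1.0))
```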

Consequently, differential privacy in practice boils down to a balancing act between accuracy and privacy. We need to introduce inaccuracy to preserve privacy, but if we make the results too inaccurate they become useless. Hence, to facilitate the use of differential privacy in practice, we need to understand the balancing act at differential privacy's core. This thesis is dedicated to exploring the balancing act in different settings, for example in the automotive domain. To further aid practitioners, I provide tools and methods for better understanding the accuracy/privacy trade-off.
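
As a rough illustration of why the balancing act arises (using the Laplace mechanism sketched above, and a standard property of that mechanism rather than a result specific to the thesis), the expected error grows as the privacy parameter ε shrinks:

```latex
% For the Laplace mechanism with sensitivity \Delta f and privacy parameter
% \varepsilon, the expected absolute error is
\mathbb{E}\big[\,|\text{noise}|\,\big] = \frac{\Delta f}{\varepsilon}
% so demanding twice as much privacy (halving \varepsilon) doubles the
% expected error.
```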

BAuD II: Large-scale collection and analysis of data for knowledge-driven product development

VINNOVA (2014-03935), 2015-01-01 -- 2017-12-31.

WebSec: Security in web-driven systems

Swedish Foundation for Strategic Research (SSF) (RIT17-0011), 2018-03-01 -- 2023-02-28.

Areas of Advance

Information and Communication Technology

Transport

Subject categories

Computer and Information Science

Computer Science

ISBN

978-91-7905-487-8

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 4954

Publisher

Chalmers

CSE EDIT 8103

Online

Opponent: Associate Professor Michael Hay, Department of Computer Science, Colgate University, United States

More information

Last updated

2023-11-08