Privacy in the Age of Artificial Intelligence
A growing number of people use the Internet in their daily lives. Indeed, more than 40% of the world's population has access to the Internet, while Facebook (one of the top social networks on the web) is actively used by more than 1.3 billion users each day (Statista 2017). This huge user base generates an abundance of data containing personal information. Such data are becoming valuable to companies, which use them in various ways to enrich the user experience or increase revenue.
This has led many citizens and politicians to worry about privacy on the Internet, to such an extent that the European Union issued a "Right to be Forgotten" ruling, reflecting the desire of many individuals to restrict the use of their information. As a result, many online companies pledged to collect or share user data only in anonymised form. However, anonymisation alone is often insufficient, because supposedly anonymous records can be re-identified by linking them with other data sources. For example, an MIT graduate was able to easily re-identify the private medical data of Governor William Weld of Massachusetts from supposedly anonymous records released by the Group Insurance Commission. All she did was link the insurance data with the publicly available voter registration list and some background knowledge (Ohm 2009).
These shortcomings have led to the development of a more rigorous mathematical framework for privacy: differential privacy. Its main characteristic is that it bounds the information an adversary can gain from released data, no matter what side information the adversary has available.
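To make the idea of bounded information gain concrete, the following is a minimal sketch of the Laplace mechanism, a standard way to release a statistic under epsilon-differential privacy. It is an illustration only and not one of the algorithms developed in this thesis. The function names (laplace_noise, private_mean), the clipping range [lower, upper] and the assumption that the number of records is public are all choices made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution.

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, epsilon, lower=0.0, upper=1.0, rng=None):
    """Release the mean of `values` with Laplace noise (illustrative sketch).

    Assumptions for this example: each value is clipped to [lower, upper],
    the number of records is public, and neighbouring datasets differ in
    the value of exactly one record.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    # Changing one record moves the clipped mean by at most this much.
    sensitivity = (upper - lower) / n
    true_mean = sum(clipped) / n
    # Noise scale grows as epsilon shrinks, i.e. as privacy gets stronger.
    return true_mean + laplace_noise(sensitivity / epsilon, rng)
```

A smaller epsilon gives a stronger privacy guarantee at the cost of a noisier answer, which is the trade-off between privacy loss and usefulness discussed in the rest of this work.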
In this thesis, we present differentially private algorithms for the multi-armed bandit problem. This is a well-known multi-round game that originally stemmed from clinical trial applications and is now a promising approach to enriching user experience in the booming fields of online advertising and recommendation systems. However, as recommendation systems are inherently based on user data, there is always some leakage of private information. In our work, we show how to minimise this privacy loss while maintaining the effectiveness of such algorithms. In addition, we show how one can take advantage of the correlation structure inherent in a user graph, such as the one arising from a social network.
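For readers unfamiliar with the multi-armed bandit game, the sketch below simulates it with the classical, non-private UCB1 strategy on Bernoulli arms. It only illustrates the round-by-round structure of the problem: it provides no privacy guarantee and is not one of the algorithms presented in this thesis. The function name ucb1, the parameter arm_means and the simulated rewards are assumptions made for this example.

```python
import math
import random


def ucb1(arm_means, horizon, rng=None):
    """Play a simulated Bernoulli bandit for `horizon` rounds with UCB1.

    `arm_means` holds the success probability of each arm; these are
    unknown to the learner and used here only to simulate rewards.
    Returns the number of pulls and the total reward for each arm.
    """
    rng = rng or random.Random()
    k = len(arm_means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            # Pull every arm once so that each empirical mean is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest optimistic estimate:
            # empirical mean plus an exploration bonus.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums
```

In a recommendation setting, each arm would correspond to an item and each reward to a user's response, which is why the running sums above can leak private information if released without protection.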