Automatic Annotation of Confidential Data in Java Code
Paper in proceedings, 2022

Confidential information leaks can be addressed with automatic tools that take a set of annotated inputs (the sources) and track their flow to public sinks. Unfortunately, manually annotating the code with labels that specify the secret sources is one of the main obstacles to the adoption of such trackers.
In this work, we present an approach for the automatic generation of labels for confidential data in Java programs. Our solution is based on a graph-based representation of Java methods: starting from a minimal set of known API calls, it propagates the labels both intra- and inter-procedurally until a fix-point is reached.
In our evaluation, we encode our synthesis and propagation algorithm in Datalog and assess the accuracy of our technique on seven previously annotated internal code bases, where we can reconstruct 75% of the preexisting manual annotations. In addition to this single data point, we also perform an assessment using samples from the SecuriBench-micro benchmark, and we provide additional sample programs that demonstrate the capabilities and the limitations of our approach.
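The propagation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper encodes its algorithm in Datalog, whereas the sketch below uses a plain Java worklist to show the same general idea of spreading a "confidential" label along data-flow edges until a fixpoint is reached. The class name, the string-based graph encoding, and all node names (such as `SecretsApi.fetch()`) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: nodes are plain strings standing in for
// API calls, return values, parameters, and local variables.
class LabelPropagation {

    // Spreads the "confidential" label from the seed nodes along the
    // flowsTo edges until no further node can be labeled (a fixpoint).
    static Set<String> propagate(Map<String, List<String>> flowsTo, Set<String> seeds) {
        Set<String> labeled = new HashSet<>(seeds);
        Deque<String> worklist = new ArrayDeque<>(seeds);
        while (!worklist.isEmpty()) {
            String node = worklist.pop();
            for (String successor : flowsTo.getOrDefault(node, List.of())) {
                // add() returns true only for newly labeled nodes, so each
                // node is enqueued at most once and cycles terminate.
                if (labeled.add(successor)) {
                    worklist.push(successor);
                }
            }
        }
        return labeled;
    }

    public static void main(String[] args) {
        // Assumed example graph: one edge crosses a method boundary
        // (loadKey's return value flows into sign's first parameter).
        Map<String, List<String>> flowsTo = Map.of(
            "SecretsApi.fetch()", List.of("loadKey.return"),
            "loadKey.return", List.of("sign.param0"),
            "sign.param0", List.of("sign.local.k"),
            "Config.getName()", List.of("log.param0"));

        Set<String> labeled = propagate(flowsTo, Set.of("SecretsApi.fetch()"));
        System.out.println(labeled.contains("sign.local.k")); // reachable from the seed
        System.out.println(labeled.contains("log.param0"));   // not reachable from the seed
    }
}
```

In this sketch a single edge relation covers both intra- and inter-procedural flow; a real analysis would derive those edges from the program's graph representation rather than listing them by hand.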

data security

Author

Iulia Bastys

Chalmers, Computer Science and Engineering (Chalmers), Information Security

Pauline Bolignano

Amazon

Franco Raimondi

Amazon

Middlesex University

Daniel Schoepe

Amazon

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

03029743 (ISSN) 16113349 (eISSN)

Vol. 13291 LNCS 146-161
9783031081460 (ISBN)

14th International Symposium on Foundations & Practice of Security
Paris, France

Areas of Advance

Information and Communication Technology

Subject Categories

Computer Science

DOI

10.1007/978-3-031-08147-7_10

More information

Latest update

9/17/2024