Adaptive Microarchitectural Optimizations to Improve Performance and Security of Multi-Core Architectures
Doctoral thesis, 2023

With the current technological barriers, microarchitectural optimizations are increasingly important to ensure performance scalability of computing systems. The shift to multi-core architectures increases the demands on the memory system and amplifies the role of microarchitectural optimizations in performance improvement. In a multi-core system, microarchitectural resources, such as the cache, are usually shared to maximize utilization, but sharing can also lead to contention and lower performance. This can be mitigated through partitioning of shared caches.

However, microarchitectural optimizations, which were long assumed to be fundamentally secure, can be used in side-channel attacks to leak secrets, such as cryptographic keys. Timing-based side-channels exploit predictable timing variations due to the interaction with microarchitectural optimizations during program execution. Going forward, there is a strong need to be able to leverage microarchitectural optimizations for performance without compromising security.

This thesis contributes three adaptive microarchitectural resource management optimizations to improve security and/or performance of multi-core architectures, and a systematization of knowledge of timing-based side-channel attacks.

We observe that to achieve high-performance cache partitioning in a multi-core system, three requirements need to be met: i) fine granularity of partitions, ii) locality-aware placement and iii) frequent changes. These requirements lead to high overheads for current centralized partitioning solutions, especially as the number of cores in the system increases. To address this problem, we present an adaptive and scalable cache partitioning solution (DELTA) using a distributed and asynchronous allocation algorithm. The allocations occur through core-to-core challenges, where the application with the larger performance benefit gains cache capacity. The solution is implementable in hardware, due to its low computational complexity, and can scale to large core counts.
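The idea of a core-to-core challenge can be illustrated with a small sketch. It is not the DELTA algorithm itself, which the abstract does not specify in detail. The sketch assumes that each core knows a hypothetical miss curve (misses as a function of cache ways owned) and that a challenge moves one way from the defender to the challenger only when the challenger gains more than the defender loses. The names `Core`, `miss_curve`, `gain`, `loss` and `challenge` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Core:
    name: str
    ways: int                # cache ways currently owned
    miss_curve: List[float]  # assumption: miss_curve[w] = misses with w ways

    def gain(self) -> float:
        """Misses avoided if this core received one more way."""
        if self.ways + 1 >= len(self.miss_curve):
            return 0.0
        return self.miss_curve[self.ways] - self.miss_curve[self.ways + 1]

    def loss(self) -> float:
        """Extra misses if this core gave up one way."""
        if self.ways <= 1:
            return float("inf")  # assumption: every core keeps at least one way
        return self.miss_curve[self.ways - 1] - self.miss_curve[self.ways]


def challenge(challenger: Core, defender: Core) -> bool:
    """Pairwise challenge: only the two cores involved exchange information.

    One way moves to the challenger if its benefit exceeds the defender's loss.
    Returns True when capacity was transferred.
    """
    if challenger.gain() > defender.loss():
        defender.ways -= 1
        challenger.ways += 1
        return True
    return False
```

Because each challenge involves only two cores, no central controller has to evaluate the full allocation. That is the property the sketch is meant to show.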

According to our analysis, better performance can be achieved by coordinating multiple optimizations for different resources, e.g., off-chip bandwidth and cache, but this is challenging due to the increased number of possible allocations that need to be evaluated. Based on these observations, we present a solution (CBP) for coordinated management of three optimizations: cache partitioning, bandwidth partitioning and prefetching. Efficient allocations, considering the inter-resource interactions and trade-offs, are achieved using local resource managers to limit the solution space.

The continuously growing number of side-channel attacks leveraging microarchitectural optimizations prompts us to review attacks and defenses to understand the vulnerabilities of different microarchitectural optimizations. We identify the four root causes of timing-based side-channel attacks: determinism, sharing, access violation and information flow. Our key insight is that eliminating any of the exploited root causes, in any of the attack steps, is enough to provide protection. Based on our framework, we present a systematization of the attacks and defenses on a wide range of microarchitectural optimizations, which highlights their key similarities. 

Shared caches are an attractive attack surface for side-channel attacks, while defenses need to be efficient since the cache is crucial for performance. To address this issue, we present an adaptive and scalable cache partitioning solution (SCALE) for protection against cache side-channel attacks. The solution leverages randomness, and provides quantifiable and information-theoretic security guarantees using differential privacy. The solution closes the performance gap to a state-of-the-art non-secure allocation policy for a mix of secure and non-secure applications.
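How randomness can give a differential-privacy guarantee on an allocation decision can be sketched with the standard Laplace mechanism. This is a generic textbook construction and not necessarily the mechanism used in SCALE. The function names, the privacy parameter `epsilon`, the default `sensitivity` of one way and the clamping bounds are all assumptions made for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_allocation(demand_ways: int, epsilon: float, total_ways: int,
                     rng: random.Random, sensitivity: float = 1.0) -> int:
    """Return a randomized partition size derived from the true demand.

    Laplace noise with scale sensitivity/epsilon hides the exact demand.
    A smaller epsilon means more noise and a stronger privacy guarantee.
    Rounding and clamping are post-processing steps and do not weaken it.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    noisy = demand_ways + laplace_noise(sensitivity / epsilon, rng)
    return max(1, min(total_ways, round(noisy)))
```

A call such as `noisy_allocation(6, epsilon=0.5, total_ways=16, rng=random.Random(1))` returns a partition size between 1 and 16 that is centered on the true demand but does not reveal it exactly. The trade-off is that stronger privacy makes the allocation deviate further from the performance-optimal one.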

Multi-Core Architectures

Bandwidth Partitioning

Prefetch Throttling

Cache Partitioning

Microarchitectural Optimizations

Side-channel Attacks

EA
Opponent: Professor Moinuddin K. Qureshi, Georgia Institute of Technology

Author

Nadja Holtryd

Chalmers, Computer Science and Engineering, Computer Engineering

DELTA: Distributed Locality-Aware Cache Partitioning for Tile-based Chip Multiprocessors

Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium, IPDPS 2020 (2020), p. 578-589

Paper in proceeding

CBP: Coordinated management of cache partitioning, bandwidth partitioning and prefetch throttling

30th International Conference on Parallel Architectures and Compilation Techniques (Proceedings) (2021), p. 213-225

Paper in proceeding

Nadja Ramhöj Holtryd, Madhavan Manivannan and Per Stenström, 'SoK: Analysis of Root Causes and Defense Strategies for Attacks on Microarchitectural Optimizations'

Nadja Ramhöj Holtryd, Madhavan Manivannan and Per Stenström, 'SCALE: Secure and Scalable Cache Partitioning'

Microprocessors are an integral part of modern society. They contain billions of tiny transistors which act as the fundamental building blocks. A large chunk of the transistor budget is dedicated to holding data (caches) and another is reserved for the computational units that process it (cores). In addition, there are separate chips just dedicated to storing data (memory). Think of the data as books, the cache as a desk and the memory as a library. If you need a certain book, finding it on your desk as opposed to visiting the library saves both time and effort.

For an average user, higher performance means faster execution and lower costs. In the past, this has primarily been achieved by shrinking transistors to fit more in a given area and operating them at higher frequencies. Unfortunately, we’re reaching atomic transistor dimensions, which changes the physical properties of the transistors and makes this approach infeasible. This has paved the way for multi-core processing, which achieves higher aggregate performance by running applications concurrently.

In a multi-core processor, sharing of cache space by concurrently running applications can lead to conflicts and become a problem, in the same way as your desk would become awfully crowded if shared by too many people. With enough desk intruders you’d be running back and forth to the library. This has fuelled the need for more optimized caching strategies. One solution to this problem is partitioning, which is akin to setting up rules for what parts of the desk each person can use, how to share books that multiple people need and what happens when someone is happy with just a few books while others need troves of them. However, determining an appropriate space allocation for each person over time is challenging.

In addition to performance, security is also of paramount importance. Recent discoveries have shown that many microprocessor optimizations which aim to improve performance, such as caching, lead to new security vulnerabilities. Attacks exploiting these vulnerabilities can reveal secrets, such as cryptographic keys, which can have devastating consequences. In the desk analogy, this would be when information about different people's books and reading habits can be gathered by observing the shared desk. The need to protect this information complicates the sharing of desk space.

This thesis tackles the issues of improving the performance and security of multi-core processors, with a main focus on the cache and the memory system.

Meeting Challenges in Computer Architecture (MECCA)

European Commission (EC) (EC/FP7/340328), 2014-02-01 -- 2019-01-31.

Low-energy toolset for heterogeneous computing (LEGaTO)

European Commission (EC) (EC/H2020/780681), 2018-02-01 -- 2021-01-31.

Areas of Advance

Information and Communication Technology

Subject Categories (SSIF 2011)

Computer Systems

ISBN

978-91-7905-749-7

Doctoral theses at Chalmers University of Technology. New series: 5215

Publisher

Chalmers

EA

Online


More information

Latest update

2023-02-09