Design, Analysis and Modeling of On-chip Interconnects for Large Scale Systems
Licentiate thesis, 2026
As the semiconductor industry faces increasing challenges in technology scaling, integrating more devices onto a single monolithic die incurs prohibitive manufacturing costs due to low yield. Chiplet-based systems have emerged as a cost-effective alternative, offering higher yield and enabling heterogeneous integration across different process technologies. However, chiplet-based systems come with performance overheads. In particular, inter-chiplet communication is constrained by limited bandwidth and higher latency, both of which are further exacerbated by the use of passive silicon interposers, which contain no active components and therefore limit the network design. Studying these overheads requires an infrastructure capable of simulating such systems. To that end, this thesis presents BZSim, an open-source framework for fast, large-scale microarchitectural simulation with detailed interconnect modeling. BZSim integrates BookSim into ZSim, combining accurate network modeling with fast parallel execution. It detects low-contention network traffic and calculates its latency analytically instead of simulating it, reducing the simulation overhead. In this way, it achieves 2-4x faster simulations with a maximum normalized IPC error of 3-5% and an average packet latency error of 10-20%. As a result, it is an order of magnitude faster than gem5 while remaining just 3x slower than standalone ZSim. While the cost and yield advantages of chiplet-based designs are well established, their system performance overheads relative to monolithic designs have not been systematically studied. Using BZSim, this thesis conducts a comprehensive analysis of chiplet-based systems across a range of design parameters, measuring their performance overheads relative to monolithic designs and how these overheads scale with system and chiplet size. The study covers design alternatives in memory hierarchy organization, interconnect choices, and technology aspects.
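The idea of replacing detailed simulation with an analytical latency for low-contention traffic can be illustrated with a minimal sketch. It is not BZSim's actual mechanism: the zero-load latency formula, the load threshold, the per-hop delays, and all names (`Packet`, `zero_load_latency`, `packet_latency`, `detailed_sim`) are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Packet:
    hops: int   # routers traversed from source to destination
    flits: int  # packet length in flits


def zero_load_latency(pkt: Packet, router_delay: int = 2, link_delay: int = 1) -> int:
    """Contention-free latency in cycles: per-hop delay plus serialization.

    The delay values are hypothetical defaults, not measured parameters.
    """
    return pkt.hops * (router_delay + link_delay) + (pkt.flits - 1)


def packet_latency(pkt: Packet, load: float,
                   detailed_sim: Callable[[Packet], int],
                   threshold: float = 0.1) -> int:
    """Use the analytical estimate under low load, else the detailed model.

    `load` stands in for whatever contention metric a simulator tracks;
    `detailed_sim` stands in for a cycle-accurate network simulation.
    """
    if load < threshold:
        return zero_load_latency(pkt)
    return detailed_sim(pkt)
```

Under these assumptions, a 5-flit packet crossing 4 hops is assigned 4 * 3 + 4 = 16 cycles whenever the load is below the threshold, and is handed to the detailed model otherwise.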
The analysis reveals that chiplet-based chips reduce recurring engineering costs by almost half compared to monolithic designs, at the cost of about a third of the monolithic performance. The inter-chiplet interconnect is identified as a key source of this overhead, motivating a targeted network solution. For this reason, this thesis proposes PRISM, a new network for multi-chiplet chips on passive silicon interposers. PRISM introduces a new topology that offers a low hop count for inter-chiplet traffic, together with a deterministic routing algorithm that uses virtual-network separation to guarantee deadlock-free routing. It also improves chiplet-to-chiplet bandwidth utilization through selective flit compression, which is applied dynamically per flit only when inter-chiplet congestion is detected, minimizing compression overheads. Evaluated on a 256-core system with 16 chiplets, PRISM improves inter-chiplet packet latency by up to 5.8x and overall system performance by up to 17.4%, with selective compression outperforming always-on compression by up to 6.5%.
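The per-flit decision behind selective compression can be sketched as follows. This is only an illustration under stated assumptions, not PRISM's design: congestion is approximated by a hypothetical queue-occupancy threshold, `zlib` stands in for a hardware compressor, and the function name `send_flit` is invented for this example.

```python
import zlib


def send_flit(payload: bytes, queue_occupancy: int, threshold: int = 8):
    """Return (compressed_flag, bytes_to_send) for one flit.

    Compression is attempted only when the (hypothetical) inter-chiplet
    queue occupancy reaches the threshold, so uncongested traffic never
    pays the compression cost.
    """
    if queue_occupancy < threshold:
        return False, payload
    packed = zlib.compress(payload)
    # Keep the compressed form only if it actually saves bandwidth.
    if len(packed) < len(payload):
        return True, packed
    return False, payload
```

In this sketch an always-on scheme would correspond to `threshold=0`; the selective variant differs only in skipping the compressor when the link is not congested.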
Microarchitectural simulation
Network-on-Chip
Chiplet-based systems
In-network compression