DDRNoC: Dual Data-Rate Network-on-Chip
Networks-on-Chip (NoCs) are becoming increasingly important for the performance of modern multi-core systems-on-chip. In many on-chip networks with virtual channel (VC) flow control, the slow control logic (VC and switch allocation logic) of the NoC routers limits the NoC clock period, while their datapath (switch and link) possesses significant slack. This slack wastes the performance potential of the datapath, limits the saturation throughput of the network, and reduces its energy efficiency. The aim of this thesis is to improve NoC performance by eliminating this slack and removing control logic from the router critical path. To this end, this thesis presents a Dual Data-Rate (DDR) network architecture called the DDRNoC. It uses the NoC datapath twice within a clock cycle to forward flits at DDR. This not only exploits the slack present in the datapath but also requires a clock with a period twice the datapath delay, thus removing the shorter control logic from the critical path. This enables the DDRNoC to achieve higher throughput than single data-rate networks. Moreover, the DDRNoC employs lookahead signalling to reduce end-to-end packet latency. FreewayNoC, an extension to the DDRNoC, supplements the DDRNoC with simplified pipeline stage bypassing to reduce the zero-load latency of packets in the network.
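The timing argument above can be illustrated with a small first-order model. This is only a sketch under simplifying assumptions: the function names are hypothetical, the delay values are made-up placeholders rather than figures from the thesis, and register setup time and clock skew are ignored. It shows how a single data-rate router is bounded by its control logic, while a dual data-rate router is bounded by twice its datapath delay.

```python
def clock_period(control_delay_ns, datapath_delay_ns, transfers_per_cycle):
    """Minimum clock period of a router in this simplified model.

    The control logic must finish within one cycle, and the datapath must
    complete `transfers_per_cycle` traversals within one cycle.
    """
    return max(control_delay_ns, transfers_per_cycle * datapath_delay_ns)


def peak_port_bandwidth(control_delay_ns, datapath_delay_ns, transfers_per_cycle):
    """Upper bound on flits per nanosecond through one port."""
    period = clock_period(control_delay_ns, datapath_delay_ns, transfers_per_cycle)
    return transfers_per_cycle / period


# Hypothetical delays (assumptions for illustration, not thesis data):
# control logic is slower than one datapath traversal but faster than two.
CONTROL_NS = 1.0
DATAPATH_NS = 0.6

sdr_period = clock_period(CONTROL_NS, DATAPATH_NS, 1)   # limited by control logic
ddr_period = clock_period(CONTROL_NS, DATAPATH_NS, 2)   # limited by 2x datapath

sdr_bw = peak_port_bandwidth(CONTROL_NS, DATAPATH_NS, 1)
ddr_bw = peak_port_bandwidth(CONTROL_NS, DATAPATH_NS, 2)

print(f"SDR: period {sdr_period:.2f} ns, peak {sdr_bw:.2f} flits/ns per port")
print(f"DDR: period {ddr_period:.2f} ns, peak {ddr_bw:.2f} flits/ns per port")
```

In this model the DDR clock is slower than the SDR clock, yet each port can move more flits per unit time because the datapath slack is consumed by a second traversal. The numbers are a per-port upper bound, not the saturation throughput of a full network.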
Implementation of the DDRNoC and FreewayNoC architectures requires a redesign of the switch allocation (SA) mechanism to resolve contention among competing flits by granting up to two flits access to each switch input and output port per clock cycle. It further requires separate paths for the propagation of lookahead control signals. FreewayNoC also requires the implementation of multiple checks to guarantee conflict-free bypassing of the SA stage.
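The allocation constraint described above (at most two grants per input port and per output port in each clock cycle) can be sketched as follows. This is a hypothetical greedy allocator written purely for illustration; the function name, the request format, and the first-come priority order are assumptions, and it is not the allocator designed in the thesis, which would also need fairness and a hardware-friendly structure.

```python
from collections import Counter


def allocate_switch(requests, per_port_limit=2):
    """Greedily grant switch requests under a per-port capacity.

    `requests` is a list of (input_port, output_port) pairs in priority
    order. A request is granted only if both its input port and its output
    port have received fewer than `per_port_limit` grants this cycle.
    """
    input_grants = Counter()
    output_grants = Counter()
    granted = []
    for in_port, out_port in requests:
        if (input_grants[in_port] < per_port_limit
                and output_grants[out_port] < per_port_limit):
            input_grants[in_port] += 1
            output_grants[out_port] += 1
            granted.append((in_port, out_port))
    return granted


# Three flits compete for output port 1, and input port 0 has three requests.
requests = [(0, 1), (1, 1), (2, 1), (0, 2), (0, 3)]
grants = allocate_switch(requests)
print(grants)
```

With `per_port_limit=1` the same function models a conventional single data-rate allocator, which makes it easy to compare how many flits each scheme can forward in one cycle for the same set of requests.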
Physical implementation results using 28 nm process technology show that DDRNoC and FreewayNoC have 5% and 15% area overhead, respectively, compared to a simple 3-stage network with VCs. Performance evaluation shows that for a 16×16 mesh network, FreewayNoC supports 25% higher throughput than the current state-of-the-art NoC, ShortPath. Moreover, FreewayNoC achieves a zero-load latency that scales better than that of ShortPath and as well as that of an ideal network with no control overheads. For application-driven traffic, FreewayNoC reduces average packet latency by 18% compared to ShortPath. Alternatively, low-voltage implementations of the DDRNoC and FreewayNoC can be used to conserve power and improve energy efficiency at the cost of higher packet latency.