Using Many Small 1T1C Memory Arrays in a Large and Dense Multicore Processor
Paper in proceeding, 2022
A memory system for multicore processors with a large number of processing elements (PEs) is presented. Each PE has a local memory implemented in one-transistor one-capacitor (1T1C) DRAM technology, and each local memory consists of many small memory arrays. Energy consumption and access time are reduced compared to state-of-the-art dynamic memories, while the aggregate bandwidth is increased by orders of magnitude. The memory arrays are composed of 4F² cells, with ≤128 bit-lines and word-lines. The area overhead for peripheral circuitry is minimized. The word-line driver is a two-transistor demultiplexer, and because the word-lines are short, small transistors can be used. The sense amplifiers are multiplexed 4-to-1, so that each is shared by several bit-lines, and they are controlled with low-voltage current injection to a bus. The address is represented as a combination of eight 1-of-N encoded parts, and their cross products select memory bank, sector, array half, and word-line. The design space is explored by varying the number of banks, sectors, bit-lines and word-lines to find the best combination for a 14 nm technology with 52 nm pitch. Considerable differences in performance, measured as area, energy and access time, have been found. The optimal configuration provides high area utilization (63 %), low energy consumption (25 fJ/bit) and short access time (515 ps). Finally, a future memory cell at the end of Moore's law is predicted, and its implications for an emerging computer paradigm, the Surface Based Processor (SBP), are discussed.
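The addressing scheme in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a binary address is split into eight 2-bit fields, each encoded as a 1-of-4 group; the abstract does not give the actual field widths. The names `FIELD_BITS`, `encode_address` and `cross_product_select` are hypothetical and not taken from the paper. The sketch shows only the general idea: ANDing every line of one 1-of-N group with every line of another yields exactly one active select line.

```python
# Illustrative sketch only: field widths are assumptions, not values from the paper.
FIELD_BITS = [2] * 8  # hypothetical: eight 2-bit fields -> eight 1-of-4 groups


def encode_1_of_n(value, bits):
    """Return a one-hot list of length 2**bits with a single 1 at index `value`."""
    n = 1 << bits
    if not 0 <= value < n:
        raise ValueError("value out of range for field")
    return [1 if i == value else 0 for i in range(n)]


def split_address(address, field_bits=FIELD_BITS):
    """Split a binary address into fields, least significant field first."""
    fields = []
    for bits in field_bits:
        fields.append(address & ((1 << bits) - 1))
        address >>= bits
    if address:
        raise ValueError("address wider than the configured fields")
    return fields


def encode_address(address, field_bits=FIELD_BITS):
    """Encode every field of the address as a 1-of-N group."""
    fields = split_address(address, field_bits)
    return [encode_1_of_n(v, b) for v, b in zip(fields, field_bits)]


def cross_product_select(group_a, group_b):
    """AND every line of group_a with every line of group_b.

    For two one-hot inputs, exactly one of the
    len(group_a) * len(group_b) outputs is 1.
    """
    return [a & b for a in group_a for b in group_b]
```

For example, `encode_address(0b0110)` yields eight groups, and `cross_product_select` applied to the first two groups activates a single one of the 16 combined lines. Which groups are paired to select bank, sector, array half, and word-line is not specified in the abstract, so any pairing chosen here is an assumption.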
memory system
multicore processor
sense amplifier
dynamic memory
word-line driver