On implementability of polymorphic register files
Paper in proceedings, 2012
This paper studies the implementability of performance-efficient multi-lane Polymorphic Register Files (PRFs). Our PRF implementation uses a 2D array of p x q linearly addressable memory banks, with customized addressing functions to avoid address routing circuits. We target one single-view and a set of four non-redundant multi-view parallel memory schemes that cover all widely used access patterns in scientific and multimedia applications: 1) p x q rectangle, p·q row, p·q main and secondary diagonals; 2) p x q rectangle, p·q column, p·q main and secondary diagonals; 3) p·q row, p·q column, aligned p x q rectangle; 4) p x q and q x p rectangles (transposition). Reconfigurable hardware was chosen for the implementation because of its potential to enhance the runtime adaptability of the PRF. As a proof of concept, we prototyped a PRF with 2 read ports and 1 write port on a Virtex-7 XC7VX1140T-2 FPGA. We consider four sizes for the 16-lane PRFs (16x16, 32x32, 64x64 and 128x128) and three multi-lane configurations (8, 16 and 32 lanes) for the 128x128 PRF. Synthesis results suggest clock frequencies between 111 MHz and 326 MHz while utilizing less than 10% of the available LUTs. By using customized addressing functions, LUT usage is reduced by up to 29% and the clock frequency is up to 77% higher compared to a straightforward implementation.
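To make the idea of a parallel memory scheme concrete, the following Python sketch maps the elements of a 2D register file onto a p x q grid of banks and checks whether an access pattern is conflict-free, that is, whether every element of the pattern lands in a different bank. The abstract does not state the module assignment or addressing functions, so the two mappings below are assumptions chosen for illustration: a plain modulo mapping for the single-view (rectangle only) case, and a common row-skewed mapping that additionally supports p·q row accesses. All function names are hypothetical.

```python
from itertools import product


def bank_rect_only(i, j, p, q):
    """Assumed single-view mapping: element (i, j) goes to bank (i mod p, j mod q)."""
    return (i % p, j % q)


def bank_rect_row(i, j, p, q):
    """Assumed skewed mapping: the bank row shifts by one every q columns,
    so both p x q rectangles and p*q-element rows spread over all banks."""
    return ((i + j // q) % p, j % q)


def intra_bank_address(i, j, p, q, n_cols):
    """Linear address inside a bank, assuming n_cols is a multiple of q."""
    return (i // p) * (n_cols // q) + (j // q)


def rectangle(i0, j0, p, q):
    """Coordinates of a p x q rectangle with top-left corner (i0, j0)."""
    return [(i0 + a, j0 + b) for a, b in product(range(p), range(q))]


def row(i0, j0, p, q):
    """Coordinates of p*q consecutive elements of row i0, starting at column j0."""
    return [(i0, j0 + k) for k in range(p * q)]


def conflict_free(coords, bank_fn, p, q):
    """True if every coordinate maps to a distinct bank."""
    banks = {bank_fn(i, j, p, q) for i, j in coords}
    return len(banks) == len(coords)


if __name__ == "__main__":
    p, q = 2, 4
    print(conflict_free(rectangle(3, 5, p, q), bank_rect_only, p, q))
    print(conflict_free(row(3, 5, p, q), bank_rect_only, p, q))
    print(conflict_free(row(3, 5, p, q), bank_rect_row, p, q))
```

Under the plain modulo mapping a row access touches only q distinct banks, which is why a multi-view scheme needs a skewed mapping of this kind. Because the bank index and the intra-bank address are both computed directly from (i, j), each bank can derive its own address locally, which is the general idea behind avoiding address routing circuits.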