A Flexible Code Compression Scheme using Partitioned Look-Up Tables
Report, 2008
Wide instruction formats give finer control over microarchitectural
resources, either by enabling more parallelism (as in VLIW) or by
exposing control to the compiler so that the microarchitecture is
used more efficiently. Unfortunately, wide instructions put more
pressure on the memory system because they increase the
instruction-fetch bandwidth and enlarge the code working set and
footprint.
This paper presents a code compression scheme that lets the
compiler select, based on profiling, which subset of the wide
instruction set to use in each program phase, at the granularity of
basic blocks. The decompression engine comprises a set of look-up
tables that dynamically convert each narrow instruction into a wide
instruction. The paper also presents a method for configuring and
dimensioning the decompression engine and for generating a
compressed program with embedded instructions that dynamically
manage the tables in the decompression engine.
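
The table-based decompression described above can be illustrated with a
minimal sketch. It is not the paper's design: the class name Partition,
the function decompress, the field widths, and the bit ordering are all
assumptions chosen for illustration. Each partition owns one table; a
slice of the narrow instruction indexes that table, and the selected
entry supplies one field of the wide instruction. The load method stands
in for the embedded table-management instructions.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One look-up table covering one field of the wide instruction (hypothetical layout)."""
    index_bits: int  # bits of the narrow instruction that index this table
    field_bits: int  # bits of the wide instruction that each entry supplies
    table: list = field(default_factory=list)

    def __post_init__(self):
        if not self.table:
            self.table = [0] * (1 << self.index_bits)

    def load(self, slot, value):
        """Stand-in for an embedded instruction that rewrites one table entry."""
        if not 0 <= value < (1 << self.field_bits):
            raise ValueError("entry does not fit in the field")
        self.table[slot] = value


def decompress(narrow, partitions):
    """Expand a narrow instruction by concatenating one table entry per partition."""
    wide, in_shift, out_shift = 0, 0, 0
    for p in partitions:
        idx = (narrow >> in_shift) & ((1 << p.index_bits) - 1)
        wide |= p.table[idx] << out_shift
        in_shift += p.index_bits
        out_shift += p.field_bits
    return wide


# Two illustrative partitions: 2 index bits -> 8-bit field, 3 index bits -> 16-bit field.
parts = [Partition(2, 8), Partition(3, 16)]
parts[0].load(1, 0xAB)
parts[1].load(5, 0x1234)
narrow = (5 << 2) | 1            # index 1 for the first table, index 5 for the second
wide = decompress(narrow, parts)
```

In this toy configuration a 5-bit narrow instruction expands to a 24-bit
wide instruction. Partitioning keeps each table small, at the cost of
restricting which field combinations are available until a table entry is
reloaded.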
We find that the 77 control bits in the original FlexCore
instruction format can be reduced to 32 bits, a size reduction of
58%, with a modest performance overhead of less than 1% for
managing the decompression tables.
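
The 58% figure follows from the two instruction widths stated above. A
short check, assuming the percentage denotes the fraction of control
bits removed (variable names are illustrative):

```python
WIDE_BITS = 77    # control bits in the original FlexCore instruction format
NARROW_BITS = 32  # control bits after compression

# Fraction of bits removed per instruction.
saving = (WIDE_BITS - NARROW_BITS) / WIDE_BITS
saving_percent = round(saving * 100)
```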