Techniques to Reduce Energy Dissipation in Level-1 Data Caches
Licentiate thesis, 2013
The number of battery-powered devices is growing significantly, and these devices require energy-efficient hardware to operate for a useful period of time. Moreover, future performance scaling in processors depends on energy efficiency, as the increasing number of cores on a single chip leaves no room for inefficient microarchitectures. Thus, energy efficiency has become one of the most important goals in the design of a wide range of processor types. The energy dissipation of data caches represents a significant portion of the total energy for general-purpose processors. In this thesis, three new techniques are introduced to reduce the energy dissipation of level-one data caches (L1 DCs). In addition, some of the presented work also addresses the energy dissipation of the data translation lookaside buffer, which is closely related to the L1 DC. Since data caches affect processor performance, energy-saving techniques must be evaluated in terms of their impact on performance. Some of
the proposed techniques allow the L1 DC to be accessed early in the pipeline, improving processor performance. Two of the presented techniques, the tagless access buffer (TAB) and the data filter cache (DFC), reduce the energy dissipation of data caches and improve overall performance by diverting part of the data accesses to a very small and energy-efficient cache or buffer structure. The TAB uses hardware/software co-design to achieve this
goal, while the DFC is entirely based on hardware. Although the software control in the TAB enables an efficient hardware implementation and fewer redundant line-fetch operations from the L1 DC and the higher levels of the memory hierarchy, it requires modifications to the instruction set architecture, which can be impractical due to binary incompatibility.
The DFC can achieve significant energy savings by means of hardware control only. The third technique presented in this thesis, speculative tag access, improves the energy efficiency of the L1 DC by speculatively performing the tag match operation early in the pipeline. In this manner, only one data way is accessed on a speculation success. Compared to the other two techniques, the control complexity and the area overhead are very low, yet the energy reduction is significant.
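As a rough illustration of why a successful speculation saves energy, the sketch below counts how many data ways are read per load. It is not the implementation from the thesis. It assumes that the early tag match uses the base register value, and that the speculation succeeds when adding the offset leaves the tag and set-index bits unchanged. The line size, the associativity, the function names, and the example addresses are all hypothetical, and every load is assumed to hit in the cache.

```python
LINE_BYTES = 32   # assumed cache line size in bytes (hypothetical)
NUM_WAYS = 4      # assumed associativity (hypothetical)


def line_address(addr: int) -> int:
    """Drop the byte-offset bits, leaving the tag and set-index bits."""
    return addr // LINE_BYTES


def data_ways_read(base: int, offset: int) -> int:
    """Number of data ways read for one load that hits, under the assumed scheme."""
    speculative = line_address(base)        # available before the address addition
    actual = line_address(base + offset)    # available after the address addition
    if speculative == actual:
        return 1        # early tag match was valid: read only the matching way
    return NUM_WAYS     # speculation failed: read all ways in parallel as usual


# Hypothetical (base, offset) pairs; the third one crosses a line boundary.
loads = [(0x1000, 4), (0x1000, 28), (0x101C, 8), (0x2000, 0)]
speculative_reads = sum(data_ways_read(base, offset) for base, offset in loads)
conventional_reads = NUM_WAYS * len(loads)
print(speculative_reads, conventional_reads)
```

In this toy model, the saving depends entirely on how often the offset keeps the access within the line addressed by the base register; the tag arrays are still read for every load, so only the data-array energy is reduced.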