Techniques to Cancel Execution Early to Improve Processor Efficiency
Doctoral thesis, 2011

Computer systems have continuously improved execution efficiency through a variety of approaches across microprocessor generations. Yet contemporary processors, despite offering an unprecedented level of computing capability, still suffer from several inefficiencies. At the same time, the traditional focus on performance alone has been superseded by more critical, multi-dimensional constraints that include power consumption, scalability, security and reliability, to mention a few. This dissertation addresses these prevailing inefficiencies and contributes a number of techniques that allow processors to offer better performance and higher energy efficiency. The first contribution is a novel scheme that detects and eliminates the execution of trivial operations, such as multiplication by ‘0’ or ‘1’, early to improve energy efficiency. The second contribution identifies the tradeoffs and relative efficiencies of two techniques that target program inefficiency in the forms of trivial computation and instruction reuse. The most important finding is that these techniques detect sets of instructions that are almost disjoint and thus may provide cumulative benefits if combined. The next set of contributions increases the execution efficiency of memory instructions by cancelling memory accesses early. The third contribution is a novel scheme that leverages frequent value locality and establishes that a significant fraction of memory instructions read the value ‘0’ from memory. This dissertation then contributes another microarchitectural technique that takes advantage of small value locality. The scheme exploits the observation that a substantial fraction of memory instructions manipulate small values, that is, values that can typically be represented using only a few bits.
The proposed schemes store small values compactly to reduce architectural inefficiency and eliminate unnecessary memory accesses to reduce program inefficiency. The penultimate scheme utilizes the observation that a notable fraction of memory requests can be satisfied by the contents of the register file, which makes the associated memory accesses unnecessary. Finally, this dissertation presents a novel unified scheme that employs a single structure to simultaneously target multiple forms of program inefficiency in memory instructions. Experimental results show that the proposed schemes improve the performance and energy efficiency of processors. The proposed techniques are in general non-speculative. Moreover, the additional resource requirements and associated overhead of each scheme are moderately low. Consequently, the schemes proposed in this dissertation contribute to resource-efficient and complexity-effective processor design.
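
As a rough software illustration of the first contribution, the sketch below checks whether a multiplication has a trivial operand (‘0’ or ‘1’) and, if so, produces the result without performing the multiplication. It is only a minimal sketch of the idea stated in the abstract, not the hardware scheme proposed in the dissertation; the names `bypass_trivial_multiply` and `count_trivial` are hypothetical and chosen for illustration.

```python
from typing import Iterable, Optional, Tuple


def bypass_trivial_multiply(a: int, b: int) -> Optional[int]:
    """Return the product directly when an operand is 0 or 1.

    Returns None when neither operand is trivial, meaning the full
    multiplication would still have to be executed.
    (Hypothetical helper, not taken from the dissertation.)
    """
    if a == 0 or b == 0:
        return 0      # x * 0 == 0
    if a == 1:
        return b      # 1 * x == x
    if b == 1:
        return a      # x * 1 == x
    return None       # non-trivial: no bypass possible


def count_trivial(pairs: Iterable[Tuple[int, int]]) -> int:
    """Count the operand pairs in a trace that could skip the multiplier."""
    return sum(1 for a, b in pairs if bypass_trivial_multiply(a, b) is not None)
```

Applied to a trace of operand pairs, `count_trivial` gives a rough sense of how often such a bypass could apply; the actual fraction of trivial operations depends on the workload and is reported in the dissertation, not here.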

processor design


narrow-width cache

instruction reuse

zero-value cache


narrow-width load


small value locality

register file cache

frequent value locality

trivial instruction

silent load


zero load

Lecture room EC, ED&IT building, Hörsalsvägen 11, Chalmers University of Technology, Sweden
Opponent: Professor David J. Lilja, Fellow of the IEEE, Department of Electrical and Computer Engineering, The University of Minnesota, USA


Mafijul Islam

Chalmers, Computer Science and Engineering, Computer Engineering

Characterization and Exploitation of Narrow-Width Loads: The Narrow-Width Cache Approach

IEEE/ACM International Conference on Compilers, Architecture, and Synthesis of Embedded Systems (CASES 2010), 2010, p. 227-236

Paper in proceedings

Zero-Value Caches: Cancelling Loads that Return Zero

Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT, 2009, p. 237-245

Paper in proceedings

Early Detection and Bypassing of Trivial Operations to Improve Energy Efficiency of Processors

Microprocessors and Microsystems, Elsevier, Vol. 32 (2008), p. 183-196

Journal article

Energy and Performance Tradeoffs between Instruction Reuse and Trivial Computations for Embedded Applications

IEEE International Symposium on Embedded Computer Systems, 2007

Paper in proceedings




Information and Communication Technology


Sustainable development


Basic sciences



Technical report - Department of Computer Science and Engineering, Chalmers University of Technology and Göteborg University: 78D

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 3228
