Enhancing thread-level parallelism in asymmetric multicores using transparent instruction offloading
Paper in proceedings, 2020

Single-ISA asymmetric multicore (AMC) architectures can accelerate multi-threaded applications by running the serial region on the big core and the parallel region on multiple small cores. In such architectures, all cores implement resource-expensive, application-specific instruction extensions (e.g., SIMD and FP). We argue that, instead of implementing such extensions in the big core, those resources must be traded off to increase the number of small cores. When the big core requires such instruction extensions, execution is offloaded to the small cores. This design mainly leverages the observation that SIMD/FP operations are executed more frequently inside parallel regions. Compared to a traditional AMC of the same area, the proposed AMC provides an additional 1.76x speedup and 12.4% energy savings due to enhanced exploitation of thread-level parallelism (TLP).
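The trade-off described in the abstract can be illustrated with a toy Amdahl-style model. This is only a sketch and not the paper's evaluation methodology: the function names, the performance ratios, the offload fraction, and the offload penalty are all hypothetical values chosen for illustration, and the result is not meant to reproduce the reported 1.76x figure.

```python
def amc_time(serial_frac, big_perf, n_small, small_perf=1.0,
             offload_frac=0.0, offload_penalty=1.0):
    """Normalized execution time of a toy asymmetric multicore.

    serial_frac     -- fraction of work in the serial region (big core)
    big_perf        -- big-core performance relative to one small core
    n_small         -- number of small cores running the parallel region
    offload_frac    -- share of serial work that is SIMD/FP and must be
                       offloaded to a small core (hypothetical parameter)
    offload_penalty -- slowdown factor applied to offloaded work, standing
                       in for transfer overhead (hypothetical parameter)
    """
    parallel_frac = 1.0 - serial_frac
    # Serial work the big core can execute itself.
    serial_on_big = serial_frac * (1.0 - offload_frac) / big_perf
    # Serial SIMD/FP work executed on a small core, with overhead.
    serial_offloaded = serial_frac * offload_frac * offload_penalty / small_perf
    # Parallel work spread evenly over the small cores.
    parallel = parallel_frac / (n_small * small_perf)
    return serial_on_big + serial_offloaded + parallel


def speedup(baseline_time, proposed_time):
    """Speedup of the proposed design over the baseline."""
    return baseline_time / proposed_time


# Assumed baseline: big core with SIMD/FP plus 4 small cores.
baseline = amc_time(serial_frac=0.1, big_perf=2.0, n_small=4)
# Assumed proposal: area freed in the big core buys 4 more small cores;
# 5% of serial work is SIMD/FP and is offloaded at a 3x penalty.
proposed = amc_time(serial_frac=0.1, big_perf=2.0, n_small=8,
                    offload_frac=0.05, offload_penalty=3.0)
print(f"toy speedup: {speedup(baseline, proposed):.2f}x")
```

In this model, the extra small cores pay off whenever the time saved in the parallel region exceeds the overhead of offloading serial SIMD/FP work, which matches the abstract's observation that such operations occur mostly inside parallel regions.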

Functional unit sharing

Offloading

AMC

SIMD

Authors

Jeckson Dellagostin Souza

Universidade Federal do Rio Grande do Sul (UFRGS)

Madhavan Manivannan

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Miquel Pericas

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Antonio Carlos Schneider Beck

Universidade Federal do Rio Grande do Sul (UFRGS)

Proceedings - Design Automation Conference

0738100X (ISSN)

Vol. 2020-July 9218614
9781450367257 (ISBN)

2020 57th ACM/IEEE Design Automation Conference (DAC)
San Francisco, USA

Subject Categories

Computer Engineering

Embedded Systems

Computer Systems

DOI

10.1109/DAC18072.2020.9218614

More information

Latest update

1/3/2024