A Highly Scalable Instruction Scheduler Design based on CPU Stall Elimination
Paper in proceedings, 2021
This paper targets low-level code optimization. We design an instruction scheduler, called the compiler instruction scheduler (CIS), that operates on basic blocks and their data dependency graphs (DDGs), and we evaluate its effectiveness on a synergistic processor unit (SPU). In our methodology, a C/C++ source file is compiled to an assembly file with spu-gcc, and stalls are detected in its basic blocks. CIS then generates the DDG of the code and uses it to find optimization opportunities, eliminate stalls, and improve program performance. To do so, CIS reorders the instructions of the assembly code within a given basic block so that CPU stalls are eliminated. Random and sliding-window schedulers are implemented to generate, in parallel, a new assembly code sequence from the DDG and the basic block. Finally, the paper describes how CIS finds an optimized code sequence for a given file without introducing conflicts or hazards. Compared with the original compilation process, CIS improves code execution metrics, and the measured speedups are promising.
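To make the idea concrete, the following is a minimal sketch of DDG-guided reordering inside one basic block. It is not the paper's implementation. The `Instr` record, the register names, the latency values, and the single-issue in-order stall model are all assumptions made for illustration, and the random search only loosely mirrors the random scheduler mentioned above.

```python
import random
from collections import namedtuple

# Hypothetical instruction record: label, destination register,
# source registers, and result latency in cycles (illustrative values).
Instr = namedtuple("Instr", "name dst srcs latency")

def build_ddg(block):
    """Return deps[j]: indices of earlier instructions that j must follow."""
    deps = {j: set() for j in range(len(block))}
    for j, later in enumerate(block):
        for i in range(j):
            earlier = block[i]
            raw = earlier.dst in later.srcs   # read after write
            war = later.dst in earlier.srcs   # write after read
            waw = later.dst == earlier.dst    # write after write
            if raw or war or waw:
                deps[j].add(i)
    return deps

def count_stalls(block, order):
    """Stall cycles under a simplified single-issue, in-order model."""
    ready = {}   # register -> cycle at which its value is available
    cycle = 0
    stalls = 0
    for idx in order:
        ins = block[idx]
        start = max([cycle] + [ready.get(r, 0) for r in ins.srcs])
        stalls += start - cycle
        ready[ins.dst] = start + ins.latency
        cycle = start + 1
    return stalls

def random_schedule(block, deps, rng):
    """Draw one random instruction order that respects the DDG."""
    remaining = set(range(len(block)))
    placed, order = set(), []
    while remaining:
        candidates = sorted(i for i in remaining if deps[i] <= placed)
        pick = rng.choice(candidates)
        order.append(pick)
        placed.add(pick)
        remaining.remove(pick)
    return order

def best_random_schedule(block, tries=200, seed=0):
    """Keep the DDG-respecting order with the fewest stalls seen so far."""
    deps = build_ddg(block)
    rng = random.Random(seed)
    best = list(range(len(block)))
    best_stalls = count_stalls(block, best)
    for _ in range(tries):
        cand = random_schedule(block, deps, rng)
        cand_stalls = count_stalls(block, cand)
        if cand_stalls < best_stalls:
            best, best_stalls = cand, cand_stalls
    return best, best_stalls

# Toy basic block: two independent load/add pairs.
block = [
    Instr("load r1", "r1", ("r0",), 6),
    Instr("add r2",  "r2", ("r1",), 2),
    Instr("load r3", "r3", ("r0",), 6),
    Instr("add r4",  "r4", ("r3",), 2),
]

if __name__ == "__main__":
    print("original stalls:", count_stalls(block, [0, 1, 2, 3]))
    order, stalls = best_random_schedule(block)
    print("reordered:", [block[i].name for i in order], "stalls:", stalls)
```

In this toy block, issuing both loads before either add hides part of the load latency, so the reordered sequence stalls less than the original one. A real scheduler for the SPU would also have to model dual issue, memory dependencies, and branches, which this sketch ignores.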