Self-Tuned Software-Managed Energy Reduction in Infiniband Links
Paper in proceedings, 2016

One of the biggest challenges in high-performance computing is to reduce power and energy consumption. Research in energy efficiency has focused mainly on energy consumption at the node level. Less attention has been given to the interconnect, which is becoming a significant source of energy inefficiency. Although supercomputers undoubtedly require a high-performance interconnect, previous work has shown that network links have low average utilization. It is therefore possible to save energy using low-power modes, but link wake-up latencies must not lead to a loss in performance. This paper proposes the Self-tuned Pattern Prediction System (SPPS), a self-tuned algorithm for energy proportionality, which reduces interconnect energy consumption without needing any application-specific configuration parameters. The algorithm uses prediction to discover repetitive patterns in the application's communication, and it is implemented inside the MPI library, so that existing MPI programs do not need to be modified. We build on previous work, which showed how the application structure can be successfully exploited to predict the communication idle intervals. The previous work, however, required the manual adjustment of a critical idle interval length, whose value depends on the application and has a major effect on energy savings. The new technique automatically discovers the optimal value of this parameter, resulting in a self-tuned algorithm that obtains large interconnect energy savings at little performance cost. We study the effectiveness of our approach using ten real applications and benchmarks. Our simulations show average energy savings in the network links of up to 21%. Moreover, the link wake-up latencies and additional computation times have a negligible effect on performance, with an average penalty of less than 1%.
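The role of the critical idle interval length can be illustrated with a minimal sketch. It is not the SPPS algorithm from the paper: the abstract does not describe the predictor, the power model, or the tuning procedure, so everything below is an assumption made for illustration. The sketch uses a simple last-value predictor, a hypothetical `LinkModel` with made-up wake-up latency and low-power fraction, and a hypothetical `tune_threshold` helper that picks, from a list of candidate thresholds, the one that saves the most energy while keeping the wake-up penalty within a budget (expressed here as a fraction of total idle time, which is also an assumption).

```python
from dataclasses import dataclass


@dataclass
class LinkModel:
    # Both values are hypothetical placeholders, not figures from the paper.
    wake_latency_us: float = 10.0     # time to bring the link back to full power
    low_power_fraction: float = 0.5   # low-power draw relative to full power


def evaluate(threshold_us, idle_intervals_us, model):
    """Replay observed idle intervals with a last-value predictor.

    The link is put into low-power mode only when the predicted idle
    interval is at least threshold_us (the "critical idle interval length").
    Returns (saved, penalty_us): saved is in microseconds of full-power
    link time avoided; penalty_us is the exposed wake-up latency.
    """
    saved = 0.0
    penalty_us = 0.0
    predicted = None  # no prediction is available for the first interval
    for actual in idle_intervals_us:
        if predicted is not None and predicted >= threshold_us:
            asleep = min(actual, predicted)
            saved += asleep * (1.0 - model.low_power_fraction)
            if actual < predicted:
                # Traffic arrived before the predicted wake-up time,
                # so the wake-up latency is exposed to the application.
                penalty_us += model.wake_latency_us
        predicted = actual  # last-value prediction for the next interval
    return saved, penalty_us


def tune_threshold(idle_intervals_us, model, candidates_us,
                   max_penalty_fraction=0.01):
    """Pick the candidate threshold with the largest savings within budget.

    Returns None when no candidate saves energy within the penalty budget.
    """
    budget_us = max_penalty_fraction * sum(idle_intervals_us)
    best, best_saved = None, 0.0
    for threshold_us in candidates_us:
        saved, penalty_us = evaluate(threshold_us, idle_intervals_us, model)
        if penalty_us <= budget_us and saved > best_saved:
            best, best_saved = threshold_us, saved
    return best
```

In this toy model, a threshold that is too low puts the link to sleep before short idle intervals and exposes wake-up latency, while a threshold that is too high forgoes savings. That trade-off is the reason the abstract describes the parameter as application-dependent and worth tuning automatically.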

Author

B. Dickov

Polytechnic University of Catalonia

Centro Nacional de Supercomputacion

H. Shin

Centro Nacional de Supercomputacion

Miquel Pericas

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Sally A McKee

Polytechnic University of Catalonia

Centro Nacional de Supercomputacion

21st IEEE International Conference on Parallel and Distributed Systems, ICPADS 2015, Melbourne, Australia, 14-17 December 2015

1521-9097 (ISSN)

649-657
978-0-7695-5785-4 (ISBN)

Subject Categories

Computer Engineering

Areas of Advance

Information and Communication Technology

DOI

10.1109/ICPADS.2015.87

More information

Latest update

11/12/2019