Designing OS for HPC applications: Scheduling
Paper in proceedings, 2010

Operating systems have historically been implemented as independent layers between hardware and applications. User programs communicate with the OS through a set of well-defined system calls, and do not have direct access to the hardware. The OS, in turn, communicates with the underlying architecture via control registers. Except for these interfaces, the three layers are practically oblivious to each other. While this structure improves portability and transparency, it may not deliver optimal performance. This is especially true for High Performance Computing (HPC) systems, where modern parallel applications and multi-core architectures pose new challenges in terms of performance, power consumption, and system utilization. The hardware, the OS, and the applications can no longer remain isolated, and instead should cooperate to deliver high performance with minimal power consumption. In this paper we present our experience with the design and implementation of High Performance Linux (HPL), an operating system designed to optimize the performance of HPC applications running on a state-of-the-art compute cluster. We show how characterizing parallel applications through hardware and software performance counters drives the design of the OS and how including knowledge about the architecture improves performance and efficiency. We perform experiments on a dual-socket IBM POWER6 machine, showing performance improvements and stability (performance variation of 2.11% on average) for NAS, a widely used parallel benchmark suite.

Authors

Roberto Gioiosa

Centro Nacional de Supercomputacion

Sally A McKee

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

M. Valero

Polytechnic University of Catalonia

Proceedings - IEEE International Conference on Cluster Computing, ICCC

1552-5244 (ISSN)

78-87

Areas of Advance

Information and Communication Technology

Subject Categories

Computer and Information Science

DOI

10.1109/CLUSTER.2010.16

ISBN

978-0-7695-4220-1

More information

Latest update

3/29/2018