Auto-tuning Static Schedules for Task Data-flow Applications
Paper in proceeding, 2017

Scheduling task-based parallel applications on many-core processors is becoming more challenging and has received considerable attention recently. The main challenge is to map the tasks efficiently onto the underlying hardware topology using application characteristics, such as the dependences between tasks, in order to satisfy the application's requirements. To achieve this, each application must be studied exhaustively to determine how the different tasks use the data, which provides the knowledge needed to map tasks that share the same data close to each other. In addition, different hardware topologies require different mappings of the same application to produce the best performance.

In this work we use the synchronization graph of a task-based parallel application, which is produced during compilation, and attempt to tune the scheduling policy automatically for any underlying hardware using heuristic-based Genetic Algorithm techniques. The tool is integrated into an actual task-based parallel programming platform called SWITCHES and is evaluated using real applications from the SWITCHES benchmark suite. We compare our results with the execution time of predefined schedules within SWITCHES and observe that the tool can converge close to an optimal solution with no effort from the user and while using fewer resources.
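
The general idea of searching for a task-to-core mapping with a Genetic Algorithm can be illustrated with a minimal sketch. This is not the SWITCHES implementation: the graph in `EDGES`, the line-shaped topology in `core_distance`, the `cost` function, and all parameter values are hypothetical stand-ins chosen for illustration, and the abstract does not state which fitness function the actual tool uses.

```python
import random

# Hypothetical synchronization graph: (producer, consumer) task pairs.
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (3, 5)]
NUM_TASKS = 6
NUM_CORES = 4


def core_distance(a, b):
    """Toy topology (assumption): cores sit on a line, distance is the index gap."""
    return abs(a - b)


def cost(mapping):
    """Stand-in fitness: distance between dependent tasks plus load imbalance."""
    comm = sum(core_distance(mapping[u], mapping[v]) for u, v in EDGES)
    loads = [mapping.count(c) for c in range(NUM_CORES)]
    return comm + (max(loads) - min(loads))


def evolve(generations=50, pop_size=20, mutation_rate=0.2, seed=0):
    """Return the lowest-cost mapping (list: task index -> core id) found."""
    rng = random.Random(seed)
    pop = [[rng.randrange(NUM_CORES) for _ in range(NUM_TASKS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: the better half survives unchanged.
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, NUM_TASKS)   # single-point crossover
            child = p1[:cut] + p2[cut:]         # new list, parents untouched
            if rng.random() < mutation_rate:    # move one task to a random core
                child[rng.randrange(NUM_TASKS)] = rng.randrange(NUM_CORES)
            children.append(child)
        pop = survivors + children
    return min(pop, key=cost)


if __name__ == "__main__":
    best = evolve()
    print("mapping:", best, "cost:", cost(best))
```

In a real auto-tuner, the `cost` function would more plausibly be replaced by a measurement or model of the schedule's execution time on the target machine, so that the same search adapts to different hardware topologies without user effort.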

Genetic Algorithm

Task Parallelism

Data-Flow

Auto-tuning

Author

Andreas Diavastos

University of Cyprus

Pedro Petersen Moura Trancoso

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

ACM International Conference Proceeding Series

Vol. Part F132205 1-6 a1
978-145035363-2 (ISBN)

First Workshop on AutotuniNg and aDaptivity AppRoaches for Energy efficient HPC Systems, ANDARE’17
Portland, Oregon, USA

Subject Categories

Computer Engineering

Computer Science

Computer Systems

DOI

10.1145/3152821.3152879

More information

Latest update

3/21/2022