Models and Methods for Development of DSP Applications on Manycore Processors
Doctoral thesis, 2009

Advanced digital signal processing systems require specialized high-performance embedded computer architectures. Here, high-performance means large amounts of data and computation per unit of time, while embedded further implies requirements on physical size and power efficiency. Thus, the requirements are both functional and non-functional in nature. This thesis addresses the development of high-performance digital signal processing systems relying on manycore technology. We propose building two-level hierarchical computer architectures for this domain of applications. Further, we outline a tool flow based on methods and analysis techniques for automated, multi-objective mapping of such applications onto distributed-memory manycore processors. In particular, we focus on how to provide tunable strategies for mapping task graphs onto array-structured, distributed-memory manycores, subject to given application constraints. We argue for code mapping strategies based on predicted execution performance, which can be used in an auto-tuning feedback loop or to guide manual tuning directed by the programmer.
Automated parallelization, optimisation and mapping to a manycore processor benefit from the use of a concurrent programming model as the starting point. Such a model allows the programmer to express different types and granularities of parallelism, as well as computation characteristics of importance in the addressed class of applications. The programming model should also abstract away machine-dependent hardware details. The analytical study of WCDMA baseband processing in radio base stations, presented in this thesis, suggests that dataflow models are a good match to the characteristics of the application and are suitable as an execution model that abstracts computations on a manycore.
Construction of portable tools further requires a manycore machine model and an intermediate representation. These models are needed to decouple the algorithms used to transform and map application software from the hardware. We propose a manycore machine model that captures common hardware resources, as well as resource-dependent performance metrics for parallel computation and communication. Further, we have developed a multifunctional intermediate representation, which can be used as a source for code generation and for dynamic execution analysis. Finally, we demonstrate how execution can be analysed dynamically using abstract interpretation on the intermediate representation. It is shown that the performance predictions can be used to accurately rank different mappings by best throughput or shortest end-to-end computation latency.

parallel code mapping

parallel processing

concurrent models of computation


manycore processors

parallel machine model

dynamic performance analysis

high-performance digital signal processing

Wigforssalen, house Visionen, Halmstad University
Opponent: Professor Shuvra S. Bhattacharyya, Dept. of Electrical and Computer Engineering, University of Maryland, USA


Jerker Bengtsson

Chalmers, Computer Science and Engineering, Computer Engineering





Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 2969

Technical report D - Department of Computer Science and Engineering, Chalmers University of Technology and Göteborg University
