Performance Impact of Batching Web Application Requests using Hot-spot Processing on GPUs
Paper in proceedings, 2015

Web applications are a good fit for many-core servers because of their inherently high degree of request-level parallelism. Yet processing-intensive web-server requests can lead to low quality of service due to hot-spots, which calls for methods that improve single-thread performance. This paper explores how to use off-chip GPUs to speed up web application hot-spots written in productivity-friendly environments (e.g., C#). First, we apply a number of straightforward optimizations by refactoring commercial-strength web application code. This yields a speedup of 7.6x in a multi-threaded, multi-core CPU test. Second, we gather similar requests from different threads of the optimized code, by applying a technique called batching, to exploit the SIMD parallelism provided by GPUs. Surprisingly, there is ample parallelism to be exploited in the already optimized code, yielding a speedup of between 2x and 3x over the best optimized CPU version.
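The batching idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the record gives no code, so the `Batcher` class, its `max_batch` and `max_wait_s` parameters, and the `square_all` stand-in kernel are all hypothetical names chosen for illustration. The sketch is in Python rather than C#, and a plain list-processing function stands in for a GPU kernel. It shows only the general pattern: request threads submit work items, a single worker gathers similar items into a batch, and one data-parallel call processes the whole batch.

```python
import queue
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class Batcher:
    """Hypothetical helper: gathers items from many threads into batches."""

    def __init__(self, kernel, max_batch=8, max_wait_s=0.01):
        self._kernel = kernel          # processes a list of items in one call
        self._max_batch = max_batch    # assumed upper bound on batch size
        self._max_wait_s = max_wait_s  # assumed time budget for filling a batch
        self._queue = queue.Queue()
        self.batch_sizes = []          # recorded for inspection only
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, item):
        """Called by a request thread; returns a Future for the result."""
        fut = Future()
        self._queue.put((item, fut))
        return fut

    def close(self):
        """Ask the worker to stop after draining pending items."""
        self._queue.put(None)
        self._worker.join()

    def _run(self):
        while True:
            first = self._queue.get()
            if first is None:
                return
            batch = [first]
            deadline = time.monotonic() + self._max_wait_s
            # Keep gathering until the batch is full or the deadline passes.
            while len(batch) < self._max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    nxt = self._queue.get(timeout=remaining)
                except queue.Empty:
                    break
                if nxt is None:
                    self._queue.put(None)  # re-post the shutdown signal
                    break
                batch.append(nxt)
            items = [item for item, _ in batch]
            try:
                # One call handles the whole batch (the GPU stand-in).
                results = self._kernel(items)
                for (_, fut), result in zip(batch, results):
                    fut.set_result(result)
            except Exception as exc:
                for _, fut in batch:
                    fut.set_exception(exc)
            self.batch_sizes.append(len(batch))


def square_all(xs):
    """Stand-in for a data-parallel kernel: same operation on every item."""
    return [x * x for x in xs]


if __name__ == "__main__":
    batcher = Batcher(square_all, max_batch=8, max_wait_s=0.05)
    # Each pool thread plays the role of a web-server request thread.
    with ThreadPoolExecutor(max_workers=16) as pool:
        answers = list(pool.map(lambda i: batcher.submit(i).result(), range(20)))
    batcher.close()
    print(answers)
    print("batch sizes:", batcher.batch_sizes)
```

The waiting budget is the main design trade-off in such a scheme: a longer wait fills larger batches and exposes more parallelism, but adds latency to every request in the batch.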

code optimization

cloud computing

data parallelism

Author

Tobias Fjälling

Per Stenström

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

29th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015, Hyderabad, India, 25-29 May 2015

1530-2075 (ISSN)

989-999

Areas of Advance

Information and Communication Technology

Subject Categories

Computer and Information Science

Electrical Engineering, Electronic Engineering, Information Engineering

DOI

10.1109/IPDPS.2015.64

ISBN

978-147998648-4

More information

Created

10/8/2017