Understanding the behavior of in-memory computing workloads
Paper in proceedings, 2014

© 2014 IEEE. The increasing demands of big data applications have led researchers and practitioners to turn to in-memory computing to speed up processing. For instance, the Apache Spark framework stores intermediate results in memory to deliver good performance on iterative machine learning and interactive data analysis tasks. To the best of our knowledge, though, little work has been done to understand Spark's architectural and microarchitectural behaviors. Furthermore, although conventional commodity processors have been well optimized for traditional desktop and HPC workloads, their effectiveness for Spark workloads remains to be studied. To shed some light on the effectiveness of conventional general-purpose processors on Spark workloads, we study the behavior of these workloads in comparison to that of Hadoop, CloudSuite, SPEC CPU2006, TPC-C, and DesktopCloud. We evaluate the benchmarks on a 17-node Xeon cluster. Our performance results reveal that Spark workloads have significantly different characteristics from Hadoop and traditional HPC benchmarks. At the system level, Spark workloads have good memory bandwidth utilization (up to 50%), stable memory accesses, and a high disk I/O request frequency (200 per second). At the microarchitectural level, the cache and TLB are effective for Spark workloads, but the L2 cache miss rate is high. We hope this work yields insights for chip and datacenter system designers.

Authors

Chinese Academy of Sciences

Chinese Academy of Sciences

Chinese Academy of Sciences

Chinese Academy of Sciences

Chalmers, Computer Science and Engineering, Computer Engineering

Chinese Academy of Sciences

Chinese Academy of Sciences

IISWC 2014 - IEEE International Symposium on Workload Characterization

Pages 22-30

Subject categories

Computer and Information Science

DOI

10.1109/IISWC.2014.6983036

ISBN

9781479964536

More information

Created

2017-10-08