Implementing graph transformations in the bulk synchronous parallel model
Paper in proceedings, 2014
Big data is becoming a challenge in more and more domains. In many areas, such as social networks, the entities of interest have relational references to each other and thereby form large-scale graphs (on the order of billions of vertices). At the same time, querying and updating these data structures is a key requirement. Complex queries and updates demand expressive high-level languages that can still be executed efficiently on these large-scale graphs. In this paper, we use the well-studied concepts of graph transformation rules and units as a high-level modeling language with declarative and operational features for transforming graph structures. In order to apply them to large-scale graphs, we introduce an approach to distribute and parallelize graph transformations by mapping them to the Bulk Synchronous Parallel (BSP) model. Our tool support builds on Henshin as the modeling tool and consists of a code generator for the BSP framework Apache Giraph. We evaluated the approach with the IMDB movie database and a computation cluster with up to 48 processing nodes with 8 cores each.
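To give a rough idea of what matching a graph pattern in the BSP model can look like, the following is a minimal single-process sketch in Python. It is not the algorithm from the paper and does not use Henshin or Apache Giraph; the function name `match_path` and the tuple-based partial matches are assumptions made purely for illustration. Each vertex extends the partial matches it received in the previous superstep and forwards them along its outgoing edges, with a barrier between supersteps.

```python
from collections import defaultdict


def match_path(edges, length=2):
    """Find all directed simple paths with `length` edges (length >= 1),
    simulating BSP supersteps. Hypothetical helper for illustration only."""
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)
    vertices = sorted({v for edge in edges for v in edge})

    # Superstep 0: every vertex starts a partial match containing only
    # itself and sends it along each outgoing edge.
    inbox = defaultdict(list)
    for v in vertices:
        for w in out.get(v, []):
            inbox[w].append((v,))

    complete = []
    for step in range(1, length + 1):
        # Barrier: messages sent in the previous superstep become
        # visible only now, and new messages go to a fresh inbox.
        next_inbox = defaultdict(list)
        for v in vertices:
            for partial in inbox.get(v, []):
                if v in partial:
                    continue  # keep the match injective (no repeated vertex)
                extended = partial + (v,)
                if step == length:
                    complete.append(extended)  # pattern fully matched
                else:
                    for w in out.get(v, []):
                        next_inbox[w].append(extended)
        inbox = next_inbox
    return sorted(complete)


edges = [("a", "b"), ("b", "c"), ("c", "a"), ("b", "d")]
for match in match_path(edges):
    print(" -> ".join(match))
```

In this sketch the number of supersteps grows with the size of the pattern, and every vertex only ever reads its own inbox and its own outgoing edges, which is what makes the vertex-centric scheme amenable to distribution across worker nodes.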