Charles Explorer logo
🇬🇧

aDFS: An Almost Depth-First-Search Distributed Graph-Querying System

Publication at Faculty of Mathematics and Physics |
2021

Abstract

Graph processing is an invaluable tool for data analytics. In particular, pattern-matching queries enable flexible graph exploration and analysis, similar to what SQL provides for relational databases.

Graph queries focus on following connections in the data; they are a challenging workload because even seemingly trivial queries can easily produce billions of intermediate results and irregular data access patterns. In this paper, we introduce aDFS: A distributed graphquerying system that can process practically any query fully in memory, while maintaining bounded runtime memory consumption.

To achieve this behavior, aDFS relies on (i) almost depth-first (aDFS) graph exploration with some breadth-first characteristics for performance, and (ii) non-blocking dispatching of intermediate results to remote edges. We evaluate aDFS against state-of-the-art graph-querying (Neo4J and GraphFrames for Apache Spark), graph-mining (G-Miner, Fractal, and Peregrine), as well as dataflow joins (BiGJoin), and show that aDFS significantly outperforms prior work on a diverse selection of workloads.