Direction-Optimizing
Breadth-First Search
Breadth-first search (BFS) is an important kernel used by many graph algorithms. It is only a traversal order, and depending on what is recorded, it can return connectivity, distance, or parenthood. Due to its simplicity and importance to graph workloads, it has been used as the first and main kernel in the Graph 500 competition.
A typical BFS implementation will examine every edge in a directed graph once (twice if undirected). Most of these edge examinations will fail because the other end of the edge has already been visited. The bottom-up approach can be drastically faster by checking many fewer edges. It is only advantageous when the frontier constitutes a large fraction of the graph, and this typically happens during the middle steps of a BFS on low-diameter graphs (such as social networks).
To get the best performance, we combine the novel bottom-up approach with the classical top-down approach. Since our hybrid switches approaches on a step-level granularity, and it uses each for when they are advantageous, it is called Direction-Optimizing BFS.
The direction-optimizing approach was conceived in time for the November 2011 Graph500 rankings, and it allowed a quad-socket Intel server to beat small clusters and specialized hardware such as the Cray XMT and the Convey HC-1. The algorithm was initially disclosed in a technical report [1] and later formally presented at Supercomputing 2012 [2].
We redesigned our algorithm to run on a cluster to perform BFS on larger graphs. Our implementation [3] is built on top of the CombBLAS framework.
Graph500 Finishes
- mirasol - 17th in November 2011
- Fastest single-node & highest efficiency (search rate per core or socket)
- Quad Intel Xeon E7-8870 (Westmere-EX) w/ 256GB RAM
- Located at Georgia Tech
- Hopper - 8th in June 2012
- Fastest Cray result
- 4817 nodes (115600 cores) on Cray XE6
- Located at NERSC
Adoption of Direction-Optimizing Approach in Graph500
- June 2012 - at least 5 of top 9
- November 2012 - at least 27 of top 31
- June 2013 - at least 28 of top 33
- November 2013 - at least 30 of top 35
- June 2014 - at least 31 of top 36
- November 2014 - at least 32 of top 37
- July 2015 - at least 32 of top 37
- November 2015 - at least 32 of top 37
Implementations
References
- "Searching for a Parent Instead of Fighting Over Children: A Fast Breadth-First Search Implementation for Graph500", Scott Beamer, Krste Asanović, and David Patterson, Technical Report UCB/EECS-2011-117, EECS Department, University of California, Berkeley, November 2011. TR
- "Direction-Optimizing Breadth-First Search", Scott Beamer, Krste Asanović, and David Patterson, International Conference on High Performance Computing, Networking, Storage and Analysis (SC), Salt Lake City, Utah, November 2012. ACM errata corrected
Best Student Paper Finalist
- "Distributed Memory Breadth-First Search Revisited: Enabling Bottom-Up Search", Scott Beamer, Aydın Buluç, Krste Asanović, and David Patterson, Workshop on Multithreaded Architectures and Applications (MTAAP), at the International Parallel & Distributed Processing Symposium (IPDPS), Boston, May 2013. IEEE
- "Direction-Optimizing Breadth-First Search", Scott Beamer, Krste Asanović, and David Patterson, Journal of Scientific Programming, 21(3-4), October 2013. IOS
- "Distributed-Memory Breadth-First Search on Massive Graphs", Aydın Buluç, Scott Beamer, Kamesh Madduri, Krste Asanović, and David Patterson, In D. Bader, editor, Parallel Graph Algorithms, CRC Press, Taylor-Francis, 2016 (in press). PDF site
- "Understanding and Improving Graph Algorithm Performance", Scott Beamer, Ph.D. Thesis, University of California Berkeley, September 2016. PDF TR
SPEC Kaivalya Dixit Distinguished Dissertation Award
Last updated: 2017-03-01