#
Streaming Balanced Graph Partitioning Algorithms for Random Graphs

##
Will Devanny

Presenting a paper by Isabelle Stanton

Abstract:
There has been a recent explosion in the size of stored data,
partially due to advances in storage technology, and partially due to
the growing popularity of cloud-computing and the vast quantities of
data generated. This motivates the need for streaming algorithms that
can compute approximate solutions without full random access to all
of the data. We model the problem of loading a graph onto a
distributed cluster as computing an approximately balanced
k-partitioning of a graph in a streaming fashion with only one pass
over the data. We give lower bounds on this problem, showing that no
algorithm can obtain an o(n) approximation with a random or
adversarial stream ordering. We analyze two variants of a randomized
greedy algorithm, one that prefers the arg max and one that is
proportional, on random graphs with embedded balanced k-cuts and are
able to theoretically bound the performance of each algorithms - the
arg max algorithm is able to recover the embedded k-cut, while,
surprisingly, the proportional variant can not. This matches the
experimental results in [25].