The small-ID technique (1)
09 Jun 2021In this series of posts, I’ll describe a neat technique for the LOCAL model. I will call it the small-ID technique, but it has several names. It can be called the speed-up technique, because it is an automatic way to speed-up an algorithm whose complexity is in a given range. It can also be called the simulation technique because the nodes simulate computation in another graph. And it is also a gap technique, because it is used to prove that there is no problem whose complexity is in some range.
I chose to use yet another name to avoid confusion with the round elimination technique, which is completely different, but used to be called simulation technique (e.g. in the series of posts I wrote on the topic).
This technique appeared for the first time in the celebrated paper An Exponential Separation between Randomized and Deterministic Complexity in the LOCAL Model by Yi-Jun Chang, Tsvi Kopelowitz, and Seth Pettie.
This first post will be about the idea of the technique.
Setting
The technique works for locally checkable labelings (LCL). These are the distributed problems whose outputs can be checked locally. For example, a proper coloring can be checked locally: one only needs to check every ball of radius 1 in the graph to tell whether a coloring is proper or not. For LCL, the radius of the verification ball does not need to be 1, it can be any constant. Many other problems fall into this category, including maximal matching, maximal independent set and sinkless orientation.
Here, we will consider an arbitrary LCL, and its verification radius $r$.
We consider networks with bounded maximum degree $\Delta$, and assume that the nodes are given unique identifiers on $O(\log n)$ bits, for example in ${1,…,n^2}$.
There exist LCL problems in such graphs with complexities such as $\Theta(1)$, $\Theta(\log^*n)$ or $\Theta(\log n)$, where $n$ is the number of nodes in the network.
First attempt
The core idea of the technique is that if the nodes believe they live in a smaller graph then they will compute faster. Let’s try something along these lines.
For concreteness, consider a coloring algorithm $A$ that runs in $f(n)$ rounds, where $f$ is not constant, but is also small enough such that no node can see the whole graph in its neighborhood at distance $f(n)$. Consider a huge graph $H$ on $h=1000000$ vertices. Every node will output a color after $f(h)=f(1000000)$ rounds, and globally the coloring computed is a proper coloring. If we zoom on some given node $u$, we can see that it stops after $f(1000000)$ rounds and outputs a color, and that its neighbors also stop in that time, also output a color, and these colors are different from the one of $u$. Now, suppose I take a small subgraph $S$ of $H$ on $s=100$ vertices, containing $u$ and its neighbors. As the size of $S$ is $s$, if I run my $f(n)$-algorithm on $S$ alone, it should output after $f(s)=f(100)$ rounds. In particular, $u$ stops after $f(100)$ rounds, outputs a color, and its neighbors do the same with a different color. Note that $s < h$, thus $f(s) < f(h)$, because $f$ is not constant. Thus to do the same job, which is output a coloring that is locally correct around $u$, we took less time.
Now, we can design a new algorithm $A’$. First, it gathers its neighborhood at some distance $d$, such that the subgraph induced by this neighborhood has around $s=100$ vertices, and radius at least f(100). Then it runs $A$ on this subgraph which takes $f(100)$ rounds, and outputs the same color as $A$. This new algorithm $A’$ is much faster than $A$.
This is too good to be true, so where’s the catch? One of the problems is that we are not sure that the neighbors will chose colors that are consistent, because they have a different subgraph. But this is fixable. The main problem is the identifier range. Our original algorithm $A$ works with the assumption that the identifiers where in the range ${1,…,n^2}$ where $n$ is the size of the network. Now with a subgraph of size $s$, we need identifiers in ${1,…,s^2}$, and we have identifiers in ${1,…,h^2}$, where $h$ is the size of the real graph.
Second attempt
To make our idea work, what we will do is to compute new identifiers, that are small enough to ensure that the algorithm computing on the subgraph will indeed output a correct solution, in less time. Hence the name of small-ID technique. Again there is a problem: if we use new identifiers on $o(\log n)$ bits, then these cannot be unique. We will make sure that two identifiers that are equal are far enough in the graph, so that the algorithm that believes that it lives in a graph of size $s$, will look at distance $f(s)$, and will not see two identical identifiers. If this holds then the output of $A$ should be locally correct, and thus globally correct, because we tackle only LCLs.
In the next post, I will describe how the technique precisely works on cycles. We’ll see that there are some subtleties with the way we compute the small-IDs, but the main idea is just that.
$\rightarrow$ Next post of the series