Expected Lookup Complexity in Chord

As professional tides bring me into P2P once again and I revisit some classics of the P2P literature, it is disconcerting to realize just how poorly did I really understand some of those things in the past. And one of those things is the expected lookup complexity of Chord [1].

From the outset, the problem seems simple enough. We start with a set of nodes O={o1,,on} that is arranged in a ring (Fig. 1) and to which Chord assigns, for each node oiO, a random non-negative integer id – let’s call it id(oi) – such that id(oi)[0,2m) for some natural number m.

Fig. 1 shows a fake network with m=6 in which all ids are assigned to nodes. This would never happen in practice as the id space is made to be way bigger than the network size, but it illustrates the shape of the network well, and makes it easier to explain how routing – which is the main point here – works.

The Chord protocol guarantees that each node oi eventually builds a local routing table, named a finger table, which contains up to m entries such that the kth entry in its table points to a node ojO that has the closest id to id(oi)+2k, where 0km1. Fig. 1 depicts finger tables for nodes with ids 0 and 20.

chord_dht(m = 6, draw_fingers = c(0, 20)) |> dht_plot()
A full-network Chord DHT with m = $6$

Figure 1: A full-network Chord DHT with m = 6

When a node oa wishes to find a node ob, then, all it needs to do is to lookup the node in its own routing table that has the id that is closest to id(ob). Because of the way fingers are arranged, oa is guaranteed to have in its routing table some node oc such that id(oc) is at least within a distance of 2m1 of id(ob).

oa can then ask oc to repeat this process, and return the node od that it knows to be the closest to ob. Again, because of the way fingers are constructed, we know that od must have a node in its finger table that has an id within at least 2m2 of the target. By continuing this process iteratively, it is not difficult to see that, after m steps, we will necessarily get to ob as we will find a node that is within distance 2mm=1 of it.

And this completes the basic intuition I already had: lookups always take less than O(m)=O(log2m) steps. Chord, however, makes a more subtle guarantee: namely that lookups take O(logn) with high probability for a network with size n<2m. And that is the problem we want to get into. Let us put that as a theorem, like the original paper [1] does:

Theorem. With high probability, the number of nodes that must be contacted to resolve a successor query in an n-node network is O(logn).

Proof sketch. Suppose there are two nodes oa and ob that are part of a network containing n nodes, such that oa is looking ob up. Following the reasoning we outlined before and assuming wlog that finger tables are complete, Chord guarantees that oa should know a node oc such that (id(ob)id(oc))mod2m2m1; i.e. oc is within distance 2m1, in id space, of ob. Note that this is irrespective of the number of nodes n in the network.

As before, this means once again that, after logN forwardings, we are guaranteed to have found a node that is within a distance of:

(1)2mlogn=2m2logn=2mn of node ob.

Ignoring rounding issues – and recalling ids are integer valued – this means that there are 2mn “vacant” ids between node oc and ob. To prove the theorem, we would have to show that there are O(logn) nodes occupying these ids, with high probability. This would mean that, with high probability, there would be a logarithmic number of hops that need to be followed before we reach ob, thus satisfying the bound. We will appease ourselves, however, with just showing that the expected number of nodes between oc and ob has to be 1; namely, that on average there is only one more hop to follow before we reach ob.

To see that this is true, let us start by looking at the probability that some random id in the DHT is occupied. Since ids are assigned uniformly at random and there are 2m such ids, it follows that each id can be occupied with probability 12m2. This means that the probability p that an id is occupied by one out of n nodes is given by:

(2)p=n2m

We can then treat id occupation as a Bernoulli experiment, meaning that for a given range of s distinct ids, the probability of having ks occupied ids follows a binomial distribution B(s,p). If XB(s,p), it follows that the expected number of occupied ids between oc and ob is given by:

E[X]=s×p=2mn×p=2mn×n2m=1

which is what we wanted to demonstrate.

This is not a complete proof, and it sweeps a number of important points under the rug - most notably that ids cannot be occupied twice, and that the expectation being 1 may not trivially imply that O(logn) bound is satisfied whp. But we will defer the complete proof to another day.

[1]
I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, Chord: A scalable peer-to-peer lookup service for internet applications,” in Proceedings of the 2001 ACM SIGCOMM, 2001, pp. 149–160.

  1. For simplicity, we are also treating the successor of a node o as a finger pointing to id(o)+20=id(o)+1↩︎

  2. That is not exactly correct as two nodes cannot occupy the same id. But we will assume they can for the purposes of this analysis.↩︎