Senior/Staff System Design Interviews: Consistent Hashing

Whether a senior/staff system design interview results in an offer often comes down to how well the candidate understands scalability and distributed systems. In the world of distributed systems and large-scale applications, load balancing is a crucial concern. Traditional hashing techniques fall short when machines are added or removed, making it hard to keep the workload evenly distributed. This is where consistent hashing comes into play. In this blog post, we will explore the concept of consistent hashing and how it addresses the challenges of scalability and load balancing.

Understanding Hashing

To understand consistent hashing, let’s first revisit hashing. In traditional hashing, a hash function takes an input and maps it to a fixed range of values, spreading inputs roughly uniformly across that range. We can extend this to requests and machines by thinking of our hash function as acting on a request and spitting out a value that corresponds to a machine.


Therefore, if we have 30 machines, we take our favorite hash function F with values between 1 and 10^10 (an arbitrarily large number) and pick the machine numbered N = F(request) modulo 30, which is always between 0 and 29.
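
To make that concrete, here is a minimal sketch in Python, assuming MD5 (via hashlib) as a stand-in for our favorite hash function F; the request strings and the machine_for helper are purely illustrative.

```python
import hashlib

NUM_MACHINES = 30

def F(request: str) -> int:
    # Stand-in for "our favorite hash function": interpret the MD5
    # digest of the request as one big integer.
    return int(hashlib.md5(request.encode()).hexdigest(), 16)

def machine_for(request: str) -> int:
    # Traditional hashing: hash the request, then take the result
    # modulo the number of machines.
    return F(request) % NUM_MACHINES

print(machine_for("GET /users/42"))  # same request -> same machine, every time
print(machine_for("GET /users/42"))
```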


Great! If two requests are identical, this method always sends them to the same machine. (Can you think of why that’s a good idea?) We can think of it as a load balancer. (For an introduction to load balancers, check out this article.) Equivalently, we may also think of the requests as being partitioned among the machines, with each machine responsible for a fixed subset of the requests.


Wonderful, wonderful. But what happens when we need to add our 31st machine?

The Challenge of Distributed Systems

N = F(request) modulo 31, in general, evaluates very differently than N = F(request) modulo 30: roughly 30 out of every 31 requests now map to a different machine. Why did we think we were being clever? Who knows.

We may even have a machine go out of service and will then have to evaluate modulo 29.
It may seem hopeless, but it turns out the fix is a simple tweak!
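
Before we get to the fix, here is a quick sketch of the damage, under the same MD5-as-F assumption as before: count how many of 100,000 made-up requests land on a different machine when we go from modulo 30 to modulo 31.

```python
import hashlib

def F(request: str) -> int:
    return int(hashlib.md5(request.encode()).hexdigest(), 16)

requests = [f"request-{i}" for i in range(100_000)]

# Compare each request's machine before (30 machines) and after (31 machines).
moved = sum(1 for r in requests if F(r) % 30 != F(r) % 31)
print(f"{moved / len(requests):.1%} of requests changed machines")
# Expect roughly 97% -- almost everything gets reshuffled.
```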

Introducing Consistent Hashing

Although our first method worked well when our machines stayed the same, it did not play well with additions/removals. That is the sense in which consistent hashing is, well, “consistent.”


Consistent hashing is a technique where you overshoot the number of machines by a healthy amount (let’s say 10000 slots), leaving you evaluating N = F(request) modulo 10000.


Hold on. In our previous example, we had a different machine corresponding to each number, but now we have 10000 – 30 = 9970 numbers that don’t correspond to anything!

Well… they will!


Let’s augment our hash function to act not only on requests, but also on our very server names! So by computing F(“Server k”) modulo 10000, where k = 0, …, 29, we get 30 different values corresponding to our machines, likely spread roughly uniformly between 0 and 9999.


Now, when given a request and computing its hash (0 – 9999), we have two scenarios:
1. We land on a machine, in which case, we use it!
2. We land on an empty number k, in which case we check whether k + 1 has a machine, and repeat steps 1 & 2.

Note: if we go past 9999 at any point, we wrap back to 0. This is why consistent hashing often makes people think of a circle or a ring.


2nd Note: two machines hashing to the same spot is an edge-case! But can you think of how we could resolve that?
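
Here is one possible sketch of the ring described above, again assuming MD5 as F. The HashRing name is hypothetical, and it uses a sorted list plus a bisect search instead of literally stepping through k + 1, k + 2, …; the result is the same, it just skips over the empty numbers in one jump.

```python
import bisect
import hashlib

RING_SIZE = 10_000

def F(value: str) -> int:
    # Same hypothetical stand-in for F as before.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    """Bare-bones consistent hash ring over positions 0..RING_SIZE - 1."""

    def __init__(self, servers):
        # Place each server on the ring at F(server name) modulo RING_SIZE.
        self._positions = sorted((F(name) % RING_SIZE, name) for name in servers)

    def lookup(self, request: str) -> str:
        # Hash the request onto the ring ...
        point = F(request) % RING_SIZE
        # ... then take the first server at or after that point
        # (scenario 1 if a server sits exactly on `point`, scenario 2 otherwise).
        i = bisect.bisect_left(self._positions, (point, ""))
        if i == len(self._positions):
            i = 0  # went past 9999: wrap back around to 0
        return self._positions[i][1]

ring = HashRing([f"Server {k}" for k in range(30)])
print(ring.lookup("GET /users/42"))  # always the same server for this request
```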


We can add as many machines as we want: for each insertion, all but one of our existing machines keep serving exactly the same load as before. Similarly, removing a machine affects just a single node. When we add a new node, we take traffic away from its immediate neighbor on the ring, although that neighbor may still be storing data for all the traffic it used to serve and no longer needs to! This is why it’s important to use consistent hashing in conjunction with good cache eviction strategies.
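
Continuing the hypothetical HashRing sketch above, here is a quick check of that claim: after adding a 31st server, the only existing machine that should lose any requests is the new node's immediate neighbor on the ring.

```python
requests = [f"request-{i}" for i in range(100_000)]

before = HashRing([f"Server {k}" for k in range(30)])
after = HashRing([f"Server {k}" for k in range(31)])  # adds "Server 30"

# Which existing servers lost requests to the newcomer?
losers = {
    before.lookup(r)
    for r in requests
    if before.lookup(r) != after.lookup(r)
}
print(losers)  # expect a single server: the new node's neighbor on the ring
```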

Preparing for interviews? Just interested in learning?

Get system design articles written by Staff+ engineers delivered straight to your inbox!