Load Balancing at Scale with Vivek Panyam
Software Engineering Daily
Originally posted on Software Engineering Daily
Facebook serves interactive content to billions of users. Google serves query requests on the world’s biggest search engine. Uber handles a significant percentage of the transportation within the United States. These services are handling radically different types of traffic, but many of the techniques they use to balance loads are similar.
Vivek Panyam is an engineer with Uber, and he previously interned at Google and Facebook. In a popular blog post about load balancing at scale, he described how a large company scales up a popular service. The methods for scaling up load balancing are simple but effective, and they help to illustrate how load balancing works at different layers of the networking stack.
Let’s say you have a simple service where a user makes a request, and your service sends them a response with a cat picture. Your service starts to get popular, and begins timing out and failing to send a response to users.
When your service starts to get overwhelmed, you can scale up capacity by creating another service instance that is a copy of your cat picture service. Now you have two service instances, and you can use a layer 7 (L7) load balancer to route traffic evenly between them. As the load grows, you can keep adding service instances and have the load balancer distribute traffic among the new instances as well.
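To make the idea of routing traffic evenly concrete, here is a minimal sketch of round-robin selection, one simple way a load balancer might spread requests across instances. The class name, method names, and instance names are hypothetical and chosen only for illustration. A real L7 load balancer would also parse the HTTP request, check instance health, and proxy the response.

```python
class RoundRobinBalancer:
    """Toy balancer: hands each request to the next instance in turn."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one service instance is required")
        self._instances = list(instances)
        self._index = 0  # counts how many requests have been routed so far

    def add_instance(self, instance):
        # Scaling out: new instances simply join the rotation.
        self._instances.append(instance)

    def pick(self):
        # Choose the next instance in rotation, wrapping around at the end.
        instance = self._instances[self._index % len(self._instances)]
        self._index += 1
        return instance


# Hypothetical instance names for the cat picture service.
balancer = RoundRobinBalancer(["cat-service-1", "cat-service-2"])
first_four = [balancer.pick() for _ in range(4)]
print(first_four)
# ['cat-service-1', 'cat-service-2', 'cat-service-1', 'cat-service-2']

# Add a third instance once the first two are saturated.
balancer.add_instance("cat-service-3")
```

Round-robin is only one possible policy. Others, such as picking the instance with the fewest open connections, follow the same shape: the balancer keeps a list of instances and chooses one per request.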
Eventually, your L7 load balancer is handling so much traffic itself that you can’t put any more service instances behind it. So you have to set up another L7 load balancer, and put a layer 4 (L4) load balancer in front of those L7 load balancers. You can scale up that tier of L7 load balancers, each of which balances traffic across its own set of service instances. But eventually, even your L4 load balancer gets overwhelmed with requests for cat pictures. You have to set up another tier, this time with layer 3 (L3) load balancing…
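The two-tier arrangement described above can be sketched as follows. This is an illustration under stated assumptions, not a description of how any particular company does it: the L4 tier here picks an L7 load balancer by hashing the connection's addresses and ports (a common approach, but an assumption on our part), and each L7 load balancer then uses round-robin across its own instances. All class names, instance names, and IP addresses are hypothetical.

```python
import hashlib


class L7Balancer:
    """Toy L7 tier: round-robins requests across its own service instances."""

    def __init__(self, name, instances):
        self.name = name
        self.instances = list(instances)
        self._index = 0

    def pick_instance(self):
        instance = self.instances[self._index % len(self.instances)]
        self._index += 1
        return instance


class L4Balancer:
    """Toy L4 tier: sees only addresses and ports, never the HTTP request."""

    def __init__(self, l7_balancers):
        self._l7_balancers = list(l7_balancers)

    def pick_l7(self, src_ip, src_port, dst_ip, dst_port):
        # Hash the connection identifiers so every packet of one connection
        # lands on the same L7 balancer. hashlib is used instead of the
        # built-in hash() so the mapping is stable across runs.
        key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
        digest = hashlib.sha256(key).digest()
        slot = int.from_bytes(digest[:8], "big") % len(self._l7_balancers)
        return self._l7_balancers[slot]


# Hypothetical topology: one L4 balancer, two L7 balancers, four instances.
l7_a = L7Balancer("l7-a", ["cat-service-1", "cat-service-2"])
l7_b = L7Balancer("l7-b", ["cat-service-3", "cat-service-4"])
l4 = L4Balancer([l7_a, l7_b])


def route(src_ip, src_port, dst_ip="203.0.113.10", dst_port=443):
    """Return (L7 balancer name, service instance) for one request."""
    l7 = l4.pick_l7(src_ip, src_port, dst_ip, dst_port)
    return l7.name, l7.pick_instance()


print(route("198.51.100.7", 50000))
print(route("198.51.100.7", 50000))  # same connection, same L7 balancer
```

The split of responsibilities is the point of the sketch: the L4 tier does cheap work per connection, which lets it front many L7 balancers, while the L7 tier does the more expensive per-request work.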
In this episode, Vivek gives a clear description of how load balancing works. We also review the seven networking layers before discussing why different types of load balancers are associated with different layers.
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily.