You just need load balancers when you want to balance traffic across two or more instances, right? Not quite. There is much more to load balancing than simply sending traffic to the instances sitting behind the load balancer. In AWS in particular, Elastic Load Balancers are a key part of your architecture, and making the right design decisions can give your end users a great experience. In this blog post, I want to talk about the different things you should take into consideration when building out your cloud infrastructure.
Amazon Elastic Load Balancer (ELB) automatically distributes incoming application traffic across multiple EC2 instances. Using ELBs lets us build fault-tolerant applications while also providing high availability, automatic scaling, and robust security. Now let's look at the different design points.
Availability: The main reason to use a load balancer is to ensure that the failure of one instance does not bring your entire application down. ELBs can balance traffic across multiple instances, which can live in a single Availability Zone (AZ) or in multiple AZs. This makes your application highly available, since it can survive the failure of a single instance or even an entire AZ.
Health Checks: By using health checks, an ELB can ensure that it is only routing traffic to healthy instances. You can configure ELB health checks like the one in the example below:
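As a sketch, the health check described here can be expressed with the boto3 `configure_health_check` call for a Classic Load Balancer; the load balancer name and the timeout value are assumptions, not from the original post:

```python
# Classic Load Balancer health check matching the parameters discussed
# in this section: probe /index.html every 30 seconds, mark an instance
# unhealthy after 2 failed checks and healthy after 10 successful ones.
health_check = {
    "Target": "HTTP:80/index.html",  # protocol:port/path to probe
    "Interval": 30,                  # seconds between checks
    "Timeout": 5,                    # seconds to wait for a response (assumed value)
    "UnhealthyThreshold": 2,         # consecutive failures before marking unhealthy
    "HealthyThreshold": 10,          # consecutive successes before marking healthy
}

# With AWS credentials configured, this would be applied with boto3:
# import boto3
# elb = boto3.client("elb")
# elb.configure_health_check(
#     LoadBalancerName="my-web-elb",  # placeholder name
#     HealthCheck=health_check,
# )
```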
In this case, the application we are dealing with is a simple web application. The ELB continuously requests the index.html file, and as long as it can reach that file, the instance is considered healthy and continues to receive traffic. You can additionally set healthy and unhealthy thresholds. In the above example, if the ELB cannot reach index.html for two consecutive 30-second intervals, it marks the instance as unhealthy. Similarly, when a new instance is added behind the ELB, it must pass ten consecutive 30-second checks before the ELB starts routing traffic to it. These parameters are really helpful when an instance has to install packages and download code from a repository or an S3 bucket on startup: we can raise the healthy threshold to make sure the instance is actually ready to receive traffic before it is marked healthy.
DNS Failover for ELB: As we discussed earlier, an ELB can route traffic between instances in different AZs. But what if an entire region goes down? To avoid a complete outage in such scenarios, we can use ELBs together with Route 53 (the AWS DNS service). You can register multiple ELBs from different regions with Route 53 and attach health checks at the ELB level. If the primary ELB goes down, whether because of a region failure or a failure of the ELB itself, Route 53 starts routing all incoming traffic to the secondary ELB, making sure your end users can still reach your application. This adds an additional level of availability to your design.
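A Route 53 failover setup like the one above can be sketched as a pair of alias records, one PRIMARY and one SECONDARY. The domain, DNS names, and hosted zone IDs below are placeholders, not values from the original post:

```python
# Sketch of a Route 53 failover record pair pointing at two ELBs in
# different regions. All names and zone IDs are placeholders.
def failover_record(set_id, role, elb_dns, elb_zone_id):
    return {
        "Name": "app.example.com.",
        "Type": "A",
        "SetIdentifier": set_id,
        "Failover": role,  # "PRIMARY" or "SECONDARY"
        "AliasTarget": {
            "HostedZoneId": elb_zone_id,   # the ELB's own hosted zone ID
            "DNSName": elb_dns,            # the ELB's DNS name
            "EvaluateTargetHealth": True,  # fail over when the ELB is unhealthy
        },
    }

primary = failover_record("primary", "PRIMARY",
                          "my-elb-1.us-east-1.elb.amazonaws.com", "ZELBZONEID1")
secondary = failover_record("secondary", "SECONDARY",
                            "my-elb-2.eu-west-1.elb.amazonaws.com", "ZELBZONEID2")

# With credentials, these would be sent to Route 53 with boto3:
# route53.change_resource_record_sets(
#     HostedZoneId="ZMYZONEID",  # placeholder for your hosted zone
#     ChangeBatch={"Changes": [{"Action": "UPSERT", "ResourceRecordSet": r}
#                              for r in (primary, secondary)]})
```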
Sticky Sessions: Classic and Application Load Balancers route traffic to the underlying instances using different algorithms, but there are use cases where we want a given user's traffic to keep going to a specific instance. There are different ways of doing this: we can store user session state in ElastiCache or an RDS instance and have every EC2 instance fetch it from there, giving the end user a seamless experience. If we don't want to complicate things, though, ELB also offers sticky sessions, which achieve the same end result without the added cost and complexity. Sticky sessions can be load balancer based (duration based) or application based; the main difference between the two is who generates the user's cookie. Sticky sessions can get tricky when you use ELBs with Auto Scaling groups (ASGs). Your ASG might add new instances behind the ELB, but because existing sessions stay pinned to their original instances, the new instances won't receive any of that traffic, and you might not see the performance improvement you were expecting. Keep this in mind when building your infrastructure.
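The two Classic Load Balancer stickiness variants map to two different boto3 calls; as a sketch, with placeholder load balancer, policy, and cookie names:

```python
# Duration based: the load balancer generates the cookie itself and
# keeps a session pinned to one instance for a fixed period (one hour here).
duration_policy = {
    "LoadBalancerName": "my-web-elb",       # placeholder
    "PolicyName": "one-hour-stickiness",
    "CookieExpirationPeriod": 3600,          # seconds
}

# Application based: the load balancer follows the lifetime of a cookie
# issued by the application itself (an assumed session cookie name here).
app_policy = {
    "LoadBalancerName": "my-web-elb",       # placeholder
    "PolicyName": "follow-app-session",
    "CookieName": "JSESSIONID",              # assumed application cookie
}

# With boto3 these map to:
# elb.create_lb_cookie_stickiness_policy(**duration_policy)
# elb.create_app_cookie_stickiness_policy(**app_policy)
```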
Elastic: Since ELB is a managed AWS service, it automatically scales up to match increased traffic to your application. ELBs scale up in a linear rather than exponential manner, so in scenarios where you expect a huge bump in traffic, you can contact AWS with details such as the start and end of your high-traffic period, and they can pre-warm the load balancer to ensure you can serve all the requests coming your way.
Secure: ELBs provide integrated certificate management and SSL decryption, which lets you centrally manage the SSL settings of the load balancer and offload CPU-intensive work from the EC2 instances. ELB also integrates with AWS Certificate Manager to make it easy to enable SSL/TLS for your application.
Internal and External Load Balancer: Another important benefit of ELBs is that you can create internal or external load balancers. External load balancers are the ones that help you serve traffic over the internet; the load balancer and the instances behind it need access to an internet gateway to serve users. However, there are also use cases where you want your private application servers to load balance traffic. In such cases, you can deploy an internal load balancer. Using a combination of internal and external load balancers, you can create a multi-tiered solution, ensuring that only your web servers talk to the Internet while your application servers stay secure in a private subnet in your VPC.
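The internal variant can be sketched with the Classic Load Balancer creation parameters; setting the scheme to "internal" is the only difference from an external load balancer. The names, ports, subnet, and security group below are placeholders:

```python
# Sketch of creating an internal load balancer for private application
# servers. Omitting "Scheme" (or using "internet-facing") would create
# an external load balancer instead.
internal_lb = {
    "LoadBalancerName": "my-app-internal-elb",   # placeholder
    "Scheme": "internal",                        # resolves to private IPs in the VPC
    "Listeners": [{
        "Protocol": "HTTP", "LoadBalancerPort": 80,
        "InstanceProtocol": "HTTP", "InstancePort": 8080,  # assumed app port
    }],
    "Subnets": ["subnet-aaaa1111"],              # private subnet placeholder
    "SecurityGroups": ["sg-bbbb2222"],           # placeholder
}
# With boto3: elb.create_load_balancer(**internal_lb)
```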
Autoscaling with ELB: Elastic Load Balancers used in conjunction with Auto Scaling groups let you keep a minimum number of instances always available and add instances based on the amount of traffic your ELB is receiving. For example, you can run an ASG with a minimum of two instances and monitor the ELB metrics SurgeQueueLength and SpillOverCount, which tell you how many requests are waiting to be served and how many requests have been dropped, respectively. Ideally, SpillOverCount should be zero; you can then weigh cost against availability to decide how many requests you can afford to keep in the queue. Based on these two metrics, you can add more instances to the ASG and have the ELB route traffic to the new instances.
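One way to act on those metrics is a CloudWatch alarm; as a sketch, an alarm that fires whenever any request is dropped (SpillOverCount above zero), which could then trigger an Auto Scaling policy. The alarm name, load balancer name, and policy ARN are placeholders:

```python
# CloudWatch alarm on the ELB SpillOverCount metric: since the ideal
# value is zero, any dropped request over a one-minute period fires it.
spillover_alarm = {
    "AlarmName": "elb-spillover",
    "Namespace": "AWS/ELB",
    "MetricName": "SpillOverCount",
    "Dimensions": [{"Name": "LoadBalancerName", "Value": "my-web-elb"}],  # placeholder
    "Statistic": "Sum",
    "Period": 60,                # evaluate one-minute sums
    "EvaluationPeriods": 1,
    "Threshold": 0,
    "ComparisonOperator": "GreaterThanThreshold",
    # "AlarmActions": ["arn:aws:autoscaling:<region>:<acct>:scalingPolicy/..."],
}
# With boto3: cloudwatch.put_metric_alarm(**spillover_alarm)
```

A similar alarm on SurgeQueueLength, with a threshold chosen from your cost-versus-availability calculation, covers the queued-request side.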
All these things will help you build a highly available, secure, and reliable load balancer for your application. But there are use cases where you need something more out of your load balancers, and this is where the two different types of load balancers come into play.
Classic Load Balancer: Classic Load Balancers are the type of load balancer we have used in the past. They route traffic based on application- or network-level information and are ideal in scenarios where you just want to route traffic between multiple instances running in one or more AZs.
To distribute traffic evenly across multiple AZs, we need to enable cross-zone load balancing on the load balancer.
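As a sketch, enabling cross-zone load balancing on a Classic Load Balancer is a single attribute change; the load balancer name is a placeholder:

```python
# With cross-zone load balancing enabled, the load balancer spreads
# requests evenly across all registered instances, not just evenly
# across AZs (which matters when the AZs hold different instance counts).
attributes = {
    "LoadBalancerName": "my-web-elb",   # placeholder
    "LoadBalancerAttributes": {
        "CrossZoneLoadBalancing": {"Enabled": True},
    },
}
# With boto3: elb.modify_load_balancer_attributes(**attributes)
```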
Application Load Balancer: This type of load balancer operates at the application layer and allows us to define routing rules based on content across multiple services and containers running on one or more EC2 instances. You can configure one or more listeners for your load balancer. A listener checks for connection requests from clients and then uses the rules that you define to route traffic to a specific target group.
Once the listener forwards traffic to the target group, the target group routes the request to one of its targets. Your EC2 instances are the end targets, and the same target can be registered in multiple target groups. Each target group has its own health checks, which are performed against all the targets registered in that group. The additional benefits that the Application Load Balancer offers are as follows:
- Support for path-based routing: This enables your listener to forward requests based on the URL in the request.
- Multiple services on EC2: You can have multiple services running on your EC2 instance, register them to multiple target groups using different port combinations and then have rules in the listener to route traffic appropriately.
- Support for containerized applications: You can have containers running on your EC2 container services cluster and then have rules for routing traffic to the target groups that offer specific services.
- Monitoring each service independently: Defining health checks at the target group level helps us monitor our per-service capacity and then use CloudWatch and an ASG to increase the number of instances required for a specific service.
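The path-based routing item above can be sketched as a listener rule; the listener and target group ARNs below are truncated placeholders:

```python
# Sketch of an ALB path-based routing rule: requests whose URL matches
# /api/* are forwarded to a dedicated "api" target group, while other
# requests fall through to the listener's default action.
api_rule = {
    "ListenerArn": "arn:aws:elasticloadbalancing:<region>:<acct>:listener/app/my-alb/...",
    "Priority": 10,   # lower numbers are evaluated first
    "Conditions": [
        {"Field": "path-pattern", "Values": ["/api/*"]},
    ],
    "Actions": [
        {"Type": "forward",
         "TargetGroupArn": "arn:aws:elasticloadbalancing:<region>:<acct>:targetgroup/api/..."},
    ],
}
# With boto3: elbv2.create_rule(**api_rule)
```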
Phew! That was a lot of information to take in, but there are still topics like pricing and developer resources that I haven't covered in this blog post. If you want to learn more about them, you can check out the following links: