Workload Distribution
Distribution of workloads to resources such as services, servers or platforms. This is the core functionality provided by a load balancer and has several common variations:
Host-based | Distributing requests based on the requested hostname. |
Path-based | Using the entire URL to distribute requests as opposed to just the hostname. |
Content-based | Inspecting the message content of a request. This allows distribution based on content such as the value of a parameter. |
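The three variations above can be sketched as a single routing function. This is a minimal illustration, not how any particular load balancer is configured: the hostnames, path prefixes, pool names and the `tier` parameter are all hypothetical, and the rule ordering (content first, then path, then host) is an arbitrary choice.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical rules and pool names, for illustration only.
HOST_RULES = {"api.example.com": "api-pool", "www.example.com": "web-pool"}
PATH_RULES = [("/images/", "static-pool"), ("/checkout/", "checkout-pool")]
DEFAULT_POOL = "web-pool"


def choose_pool(url: str) -> str:
    """Pick a backend pool for a request URL."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)

    # Content-based: route on the value of a request parameter.
    if query.get("tier") == ["premium"]:
        return "premium-pool"

    # Path-based: route on the URL path, not just the hostname.
    for prefix, pool in PATH_RULES:
        if parts.path.startswith(prefix):
            return pool

    # Host-based: route on the requested hostname alone.
    return HOST_RULES.get(parts.hostname or "", DEFAULT_POOL)
```

With these assumed rules, `choose_pool("https://api.example.com/v1/users")` falls through to the host rule, while a request for `/images/logo.png` is matched by its path regardless of hostname.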
Layers
Generally speaking, load balancers operate at one of two levels:
Network-layer | Load balancing that works at the transport layer, also known as layer 4. This performs routing based on network information such as IP addresses and ports and is not able to perform content-based routing. These are often dedicated hardware devices that can operate at high speed. |
Application-layer | Load balancing that operates at the application layer, also known as layer 7. These can read requests in their entirety and perform content-based routing. This allows the management of load based on a full understanding of traffic. |
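The difference between the two levels comes down to what each can see. The sketch below is a rough illustration under stated assumptions: the backend addresses are made up, hashing the connection tuple is just one common layer 4 technique, and the `/api/` rule is a hypothetical layer 7 policy.

```python
import hashlib
from dataclasses import dataclass

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical addresses


@dataclass(frozen=True)
class Connection:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def pick_backend_l4(conn: Connection) -> str:
    """Layer 4: decide from addresses and ports; the payload is never read."""
    key = f"{conn.src_ip}:{conn.src_port}-{conn.dst_ip}:{conn.dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(BACKENDS)
    return BACKENDS[index]


def pick_backend_l7(raw_request: bytes) -> str:
    """Layer 7: parse the HTTP request line and route on its path."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    _method, path, _version = request_line.split(" ", 2)
    # Hypothetical policy: API traffic goes to a dedicated backend.
    return "10.0.0.3" if path.startswith("/api/") else "10.0.0.1"
```

The layer 4 function never touches request bytes, which is why it cannot route on content; the layer 7 function has to parse the request before it can decide.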
Scheduling Algorithms
Load balancers may support multiple algorithms for distributing load. For example:
Round Robin | Routing to a pool of servers in a rotating sequential manner. |
Weighted Round Robin | Assigning a weight to each server and routing more often to servers with higher weights and less often to servers with lower weights in a rotating sequential manner. |
Least Connection | Routing the current request to the server that currently has the fewest active sessions. |
Weighted Least Connection | Defining the capacity of each server as a weight and routing to the server that currently has the fewest active sessions relative to its capacity. |
Adaptive Load Balancing | Obtaining feedback from servers to determine their actual load and routing requests to the least busy server. |
Chained Failover | A fixed sequence of servers whereby the request is always routed to the first server in the chain if it is not busy. If the first server is busy, the request is routed to the second in the chain and so on. |
Response Time | Routing to the server with the fastest recent response times. |
Software Defined | The ability to customize the load balancing algorithm to build in intelligence. This can be implemented at either the layer 4 or layer 7 level with intelligence about your network or application respectively. |
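Four of the algorithms above are simple enough to sketch in a few lines. The `Server` class and its fields are hypothetical, the pool is assumed to be static, and the weighted round robin shown is the naive form that repeats each server `weight` times per rotation; real implementations may interleave servers more evenly.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int = 1  # relative capacity (assumed to be a positive integer)
    active: int = 0  # current number of active sessions


def round_robin(servers):
    """Return an iterator that rotates through the servers in order."""
    return itertools.cycle(servers)


def weighted_round_robin(servers):
    """Return an iterator in which each server appears `weight` times per rotation."""
    expanded = [s for s in servers for _ in range(s.weight)]
    return itertools.cycle(expanded)


def least_connection(servers):
    """Pick the server with the fewest active sessions."""
    return min(servers, key=lambda s: s.active)


def weighted_least_connection(servers):
    """Pick the server with the fewest active sessions relative to its weight."""
    return min(servers, key=lambda s: s.active / s.weight)
```

For a pool of `Server("a", weight=3, active=4)` and `Server("b", weight=1, active=2)`, least connection picks `b` (2 sessions against 4), while weighted least connection picks `a` (4/3 sessions per unit of capacity against 2).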
Autoscaling
Starting up and shutting down resources in response to demand conditions. For example, a cloud load balancer that starts new computing instances in response to peak traffic and releases the instances when traffic subsides.
Health Check
The ability to determine if a resource is down or performing poorly in order to remove the resource from the load balancing pool. This process is also known as failover. Health checks may also perform notifications and autoscaling.
Sticky Session
The ability to assign the same user or device to the same resource in order to maintain session state on the resource.
Persistent Connections
Allowing a server to hold open a persistent connection, such as a WebSocket, with a client.
Certificates
Presenting certificates to clients and authenticating client certificates.
Encryption
Handling encrypted connections such as TLS and SSL.
Server Name Indication
Dynamically returning certificates for a service based on the requested hostname.
Authentication
Authentication of clients and authorization to access resources.
Compression
Compression of responses.
Caching
An application layer load balancer may offer the ability to cache responses to reduce load.
Request Tracing
Assigning each request a unique id for the purposes of logging, monitoring and troubleshooting.
Redirects
The ability to redirect an incoming request based on factors such as the requested path.
Fixed Response
Returning a static response for a request such as an error message.
High Availability
Load balancing services may offer a high availability SLA such as an uptime guarantee of 99.99%. This requires multiple load balancers so that the load balancers themselves can fail over if there is a problem.
Rate Limiting
Imposing limits on the rate at which a single client is entitled to a response.
Firewall
A load balancing service may offer a network firewall or web application firewall as an integrated feature.
Monitoring
Load balancers are a good place to monitor your traffic as everything passes through them. As such, load balancers may provide performance, health and security monitoring tools and integrations.
Logging
Logging of request and response metadata. This can serve as an important audit trail or source for analytics data.
Notes
There are ways to achieve load balancing without a load balancer. For example, clients can be designed to pick a random choice from a list of currently running domains or IPs. Edge computing is another approach to distributing load that involves returning a different DNS result to clients in different regions to serve them from the data center closest to their location.
Overview: Load Balancing
Definition | A class of tools for distributing workloads across multiple computing resources. |