Load balancing is a class of tools for distributing workloads across multiple computing resources. It is a basic element of infrastructure that allows computing services to scale. The following are common examples of load balancer functionality.
Workload Distribution
Distribution of workloads to resources such as services, servers or platforms. This is the core functionality provided by a load balancer and has several common variations:

Host-based | Distributing requests based on the requested hostname.
Path-based | Distributing requests based on the URL path, as opposed to just the hostname.
Content-based | Inspecting the message content of a request. This allows distribution based on content such as the value of a parameter.
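The routing rules above can be sketched in a few lines. This is a minimal illustration, not a real proxy: the hostnames, path prefixes and backend addresses are all invented for the example.

```python
# Hypothetical backend pools; addresses are illustrative assumptions.
BACKENDS = {
    "api": ["10.0.0.1", "10.0.0.2"],
    "static": ["10.0.1.1"],
    "default": ["10.0.2.1"],
}

def route(host: str, path: str) -> str:
    """Pick a backend pool by hostname first, then by path prefix."""
    if host.startswith("api."):          # host-based rule
        pool = BACKENDS["api"]
    elif path.startswith("/static/"):    # path-based rule
        pool = BACKENDS["static"]
    else:
        pool = BACKENDS["default"]
    return pool[0]  # a real balancer would apply a scheduling algorithm here
```

A content-based rule would go one step further and parse the request body or query parameters before choosing a pool.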
Layers
Generally speaking, load balancers operate at one of two levels:

Network-layer | Load balancing that works at the networking transport layer, also known as layer 4. This routes based on networking information such as IP addresses and is not able to perform content-based routing. These are often dedicated hardware devices that can operate at high speed.
Application-layer | Load balancing that operates at the application layer, also known as layer 7. These can read requests in their entirety and perform content-based routing. This allows load to be managed based on a full understanding of the traffic.
Scheduling Algorithms
Load balancers may support multiple algorithms for distributing load. For example:

Round Robin | Routing to a pool of servers in a rotating sequential manner.
Weighted Round Robin | Assigning a weight to each server and routing more often to servers with higher weights and less often to those with lower weights, in a rotating sequential manner.
Least Connection | Routing the current request to the server that has the fewest active sessions.
Weighted Least Connection | Defining the capacity of each server as a weight and routing to the server with the fewest active sessions relative to its capacity.
Adaptive Load Balancing | Obtaining feedback from servers to determine their actual load and routing requests to the least busy server.
Chained Failover | A fixed sequence of servers whereby a request is always routed to the first server in the chain if it is not busy. If the first server is busy, the request is routed to the second in the chain, and so on.
Response Time | Routing to the server that has recently been responding fastest.
Software Defined | The ability to customize the load balancing algorithm to build in intelligence. This can be implemented at either layer 4 or layer 7, with intelligence about your network or application respectively.
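Three of the algorithms above can be sketched as follows. These are simplified, single-threaded illustrations; a production balancer would add locking, health awareness and connection tracking.

```python
import itertools

class RoundRobin:
    """Cycle through servers in a fixed rotating sequence."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class WeightedRoundRobin:
    """Repeat each server in the rotation proportionally to its weight."""
    def __init__(self, weighted):  # e.g. [("a", 2), ("b", 1)]
        expanded = [s for s, w in weighted for _ in range(w)]
        self._cycle = itertools.cycle(expanded)

    def pick(self):
        return next(self._cycle)

class LeastConnection:
    """Route to the server with the fewest active sessions."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a session ends so counts stay accurate."""
        self.active[server] -= 1
```

Weighted least connection would divide each server's active count by its weight inside the `min` call.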
Autoscaling
Starting up and shutting down resources in response to demand. For example, a cloud load balancer that starts new computing instances in response to peak traffic and releases the instances when traffic subsides.

Health Check
The ability to determine if a resource is down or performing poorly in order to remove it from the load balancing pool. This process is also known as failover. Health checks may also trigger notifications and autoscaling.
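The health check behavior above can be sketched as a failure counter over probe results. The failure threshold and the probe itself (commonly an HTTP request to a health endpoint) are assumptions for the example.

```python
class HealthChecker:
    """Remove a server after `threshold` consecutive failed probes."""
    def __init__(self, pool, threshold=3):
        self.pool = set(pool)            # servers eligible to receive traffic
        self.threshold = threshold
        self.failures = {s: 0 for s in pool}

    def observe(self, server, ok: bool):
        """Record one probe result and update the pool accordingly."""
        if ok:
            self.failures[server] = 0
            self.pool.add(server)        # a recovered server rejoins the pool
        else:
            self.failures[server] += 1
            if self.failures[server] >= self.threshold:
                self.pool.discard(server)  # failover: stop routing here
```

In a real deployment the same event could also fire a notification or an autoscaling action, as the text notes.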
Sticky Session
The ability to assign the same user or device to the same resource in order to maintain session state on that resource.

Persistent Connections
Allowing a server to hold open a persistent connection with a client, such as a WebSocket.

Certificates
Presenting server certificates to clients and authenticating client certificates.

Encryption
Handling encrypted connections such as TLS and SSL.
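One common way to implement sticky sessions is to hash a stable client identifier (such as an IP address or a session cookie) to a server index, so the same client always lands on the same server while the pool is unchanged. This is a minimal sketch of that idea, not the only approach; many balancers use cookies or consistent hashing instead.

```python
import hashlib

def sticky_pick(client_id: str, servers: list) -> str:
    """Map the same client identifier to the same server deterministically."""
    digest = hashlib.sha256(client_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Note the limitation: a plain modulo hash reassigns many clients when a server is added or removed, which is why consistent hashing is often preferred.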
Server Name Indication
Dynamically returning the certificate for a service based on the requested hostname.

Authentication
Authenticating clients and authorizing access to resources.

Compression
Compressing responses.

Caching
An application-layer load balancer may offer the ability to cache responses to reduce load.

Request Tracing
Assigning each request a unique id for the purposes of logging, monitoring and troubleshooting.

Redirects
The ability to redirect an incoming request based on factors such as the requested path.

Fixed Response
Returning a static response for a request, such as an error message.

High Availability
Load balancing services may offer a high availability SLA such as an uptime guarantee of 99.99%. This requires multiple load balancers, so that the load balancers themselves fail over if there is a problem.

Rate Limiting
Imposing limits on the rate at which a single client is entitled to responses.

Firewall
A load balancing service may offer a network firewall or web application firewall as an integrated feature.

Monitoring
Load balancers are a good place to monitor your traffic, as everything passes through them. As such, load balancers may provide performance, health and security monitoring tools and integrations.

Logging
Logging of request and response metadata. This can serve as an important audit trail or a source of analytics data.

Notes
There are ways to achieve load balancing without a load balancer. For example, clients can be designed to pick randomly from a list of currently running domains or IP addresses. Edge computing is another approach to distributing load that involves returning a different DNS result to clients in different regions to serve them from the data center closest to their location.
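The rate limiting described above is commonly implemented as a token bucket: each client accrues tokens at a steady rate up to a burst capacity, and a request is served only if a token is available. This is a sketch under assumed parameters, tracking one client; a balancer would keep one bucket per client key.

```python
import time

class TokenBucket:
    """Allow ~`rate` requests per second with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # tokens replenished per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Spend one token if available; otherwise reject the request."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A rejected request would typically receive a fixed response such as HTTP 429, tying together the Rate Limiting and Fixed Response features above.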
© 2010-2023 Simplicable. All Rights Reserved.