5 Examples of Autoscaling

John Spacey, updated on July 26, 2023

Autoscaling is a feature of cloud infrastructure that allows the resources allocated to a service to be automatically increased and decreased. The following are illustrative examples.

Software as a Service

Software as a service may offer customers a service level agreement (SLA) whereby the service is expected to scale to handle peak load without slowing down. The service may be configured to scale automatically when resource demand hits a preset level. This is hidden from the customer, who simply sees that the service works at a reasonable speed.
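As a rough illustration of scaling at a preset level, the Python sketch below adds an instance when utilization crosses an upper threshold and removes one when it falls below a lower threshold. The function name, the threshold values and the instance limits are assumptions chosen for the example, not settings from any particular cloud provider.

```python
def desired_instances(current: int, utilization: float,
                      scale_up_at: float = 0.75, scale_down_at: float = 0.30,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Return the instance count after one threshold check.

    All defaults are illustrative assumptions.
    """
    if utilization >= scale_up_at:
        target = current + 1      # demand hit the preset level: add capacity
    elif utilization <= scale_down_at:
        target = current - 1      # demand is low: release capacity
    else:
        target = current          # within the comfortable band: no change
    # Never go below the floor or above the ceiling.
    return max(min_instances, min(max_instances, target))
```

Keeping a gap between the two thresholds is a common way to stop the service from repeatedly adding and removing the same instance when utilization hovers near a single cutoff.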

Cloud APIs

Cloud APIs include interfaces for requesting more resources. Services designed for the cloud may request resources based on events, resource utilization, management decisions and other application-specific criteria.
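The Python sketch below shows one way a service might request resources from an application-specific signal, in this case the depth of a work queue. `InMemoryCloudClient` and its `set_capacity` method are hypothetical stand-ins for a provider's SDK, and the per-worker and maximum figures are assumed values.

```python
import math
from dataclasses import dataclass, field


@dataclass
class InMemoryCloudClient:
    """Hypothetical stand-in for a cloud SDK; it only records requests."""
    capacity: int = 1
    calls: list = field(default_factory=list)

    def set_capacity(self, count: int) -> None:
        self.calls.append(count)
        self.capacity = count


def scale_for_queue(client: InMemoryCloudClient, queue_depth: int,
                    per_worker: int = 100, max_workers: int = 20) -> int:
    """Request enough workers to cover the queue, within assumed limits."""
    target = max(1, min(max_workers, math.ceil(queue_depth / per_worker)))
    if target != client.capacity:
        client.set_capacity(target)   # only call the API when something changes
    return target
```

A real service would replace the in-memory client with calls to its provider's API and would typically run this check on a timer or in response to events.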

Infrastructure Management

The providers of cloud infrastructure may automatically power down machines that aren't being used. A pool of running unallocated servers is maintained based on demand forecasts or actual observed demand.
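The pool of running unallocated servers can be sketched in Python as a simple calculation. The function names and the 20 percent headroom figure are assumptions for illustration. How a provider actually sizes its pool is not described here.

```python
import math


def idle_servers_needed(forecast_servers: int, allocated: int,
                        headroom_percent: int = 20) -> int:
    """Running but unallocated servers to keep ready for forecast demand."""
    wanted_total = math.ceil(forecast_servers * (100 + headroom_percent) / 100)
    return max(0, wanted_total - allocated)


def power_change(idle_running: int, forecast_servers: int, allocated: int,
                 headroom_percent: int = 20) -> int:
    """Positive: servers to power on. Negative: idle servers to power down."""
    needed = idle_servers_needed(forecast_servers, allocated, headroom_percent)
    return needed - idle_running
```

For instance, with a forecast of 100 servers, 90 already allocated and 50 sitting idle, the sketch suggests powering down 20 idle machines.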

Scheduled Scaling

A bank schedules more resources during peak banking hours for backend processing such as settlement.
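A schedule of this kind can be expressed as a lookup from the current time to a server count. In the Python sketch below, the peak hours (weekdays, 9:00 to 17:00) and the server counts are assumed values, not figures from any real bank.

```python
from datetime import datetime

# Assumed schedule for illustration only.
PEAK_START_HOUR = 9
PEAK_END_HOUR = 17        # exclusive
PEAK_SERVERS = 20
OFF_PEAK_SERVERS = 4


def scheduled_capacity(now: datetime) -> int:
    """Return the number of servers the schedule calls for at this time."""
    is_weekday = now.weekday() < 5                      # Monday=0 ... Friday=4
    in_peak_hours = PEAK_START_HOUR <= now.hour < PEAK_END_HOUR
    return PEAK_SERVERS if is_weekday and in_peak_hours else OFF_PEAK_SERVERS
```

Calling `scheduled_capacity(datetime(2023, 7, 26, 10, 0))`, a Wednesday morning, returns the peak count, while the same time on a Saturday returns the off-peak count.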

Predictive Scaling

A video-on-demand company uses predictive analytics to request resources to handle anticipated demand based on historical patterns. If people watch more movies on Valentine's Day, the service will bring more servers online that evening in preparation. Such analytics could potentially consider a large number of factors such as the weather.
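A very simple form of prediction is to average the peak demand seen on the same date in previous years and provision for that figure plus a margin. The Python sketch below does this. The streams-per-server figure, the 10 percent safety margin and the sample numbers in the usage note are assumptions, and a real system would likely use a richer model.

```python
import math
from statistics import mean
from typing import Sequence


def servers_for_forecast(past_peaks: Sequence[int],
                         streams_per_server: int = 500,
                         safety_percent: int = 110) -> int:
    """Servers to provision, based on peak streams on the same date in past years."""
    expected = mean(past_peaks)                   # naive forecast: historical average
    padded = expected * safety_percent / 100      # add an assumed safety margin
    return math.ceil(padded / streams_per_server)
```

With hypothetical Valentine's Day peaks of 40,000, 44,000 and 48,000 streams, the sketch provisions 97 servers, compared with 49 for an ordinary evening averaging 22,000 streams.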

Overview: Autoscaling
Type	Cloud
Definition	A feature of cloud APIs that allows resources to be automatically scaled up and down based on a schedule or application-specific factors such as events.
Value	Conserving resources such as electricity. Reducing cloud service charges such as CPU hours. Responding to peaks in demand for a service.
Related Concepts	Utility Computing, Infrastructure as Code, Automation, Cloud Computing