High availability is the practice of designing and operating a service to minimize downtime. High availability techniques commonly achieve an availability of over 99.99%, which translates to no more than 52.56 minutes of downtime a year. The following are common high availability techniques.
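The downtime figure above follows from simple arithmetic: the unavailable fraction multiplied by the number of minutes in a year. A minimal sketch, assuming a 365-day year (the function name `max_downtime_minutes` is illustrative, not from any library):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; assumes a non-leap year


def max_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


# Print the downtime budget for a few common availability targets.
for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {max_downtime_minutes(target):.2f} minutes/year")
```

Each additional "nine" cuts the downtime budget by a factor of ten, which is why higher targets tend to depend on automated detection and recovery rather than manual response.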
Designing and building components to be reliable over time in a variety of real world conditions.

Change Management: Controlling change to an environment using a managed process that requires extensive testing before changes are launched to production.

Tracking the configuration and design of production to support activities such as troubleshooting and rollback to a stable state.

Managing the capacity of resources such as licenses, storage and computing.

Runbook Automation: Automation of support, operations and incident response processes.

Load Balancing: Distributing workloads across multiple resources using techniques such as cloud computing.

Failure Detection: Automatically detecting failures.

Automatically moving workloads from failed resources to functioning resources.

Providing a single contact for users to report incidents.

Rapidly escalating incidents to the people in a position to fix things. Restoring service in the quickest way possible.

Following up on incidents to determine root cause and implement fixes and changes to prevent future incidents.
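Load balancing, failure detection and the automatic movement of workloads away from failed resources often work together in practice. The following is a minimal sketch of one way to combine them, assuming a simple round-robin policy and a caller-supplied health probe. The names `Backend`, `RoundRobinBalancer`, `run_health_checks` and `pick` are hypothetical and do not correspond to any particular product.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Backend:
    """A resource that can receive work (hypothetical stand-in for a server)."""
    name: str
    healthy: bool = True


class RoundRobinBalancer:
    """Distributes work across backends, skipping any marked unhealthy."""

    def __init__(self, backends: Iterable[Backend], probe: Callable[[Backend], bool]):
        self.backends = list(backends)
        self.probe = probe  # assumed to return True when a backend responds
        self._next = 0

    def run_health_checks(self) -> None:
        """Failure detection: probe every backend and record the result."""
        for backend in self.backends:
            backend.healthy = self.probe(backend)

    def pick(self) -> Backend:
        """Load balancing with failover: return the next healthy backend."""
        total = len(self.backends)
        for offset in range(total):
            candidate = self.backends[(self._next + offset) % total]
            if candidate.healthy:
                # Resume the rotation just after the backend we chose.
                self._next = (self._next + offset + 1) % total
                return candidate
        raise RuntimeError("no healthy backends available")


# Usage: backend "b" is simulated as failed, so its share of work moves to the others.
failed = {"b"}
balancer = RoundRobinBalancer(
    [Backend("a"), Backend("b"), Backend("c")],
    probe=lambda backend: backend.name not in failed,
)
balancer.run_health_checks()
picks = [balancer.pick().name for _ in range(4)]
print(picks)
```

In a real deployment the probe would typically be a network check run on a schedule, and a backend would usually need several consecutive failures before being removed from rotation to avoid reacting to transient errors.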
© 2010-2023 Simplicable. All Rights Reserved. Reproduction of materials found on this site, in any form, without explicit permission is prohibited.