Decoupling: Structuring designs as independent, decoupled components, such as an aircraft with two engines that each have completely redundant systems in areas such as control and fuel supply.
Bulkhead: A bulkhead is a structure that isolates damage to one area, such as a fireproof wall designed to prevent a fire from spreading quickly through a building.
Redundancy: Duplicating capacity so that no single failure stops the whole, such as a software platform that runs on 1,200 servers in 60 data centers as opposed to two servers in one data center.
Retry: Retrying things that fail, such as an email server that keeps trying to resend a failed message for several days.
Undo: The ability to go backwards to correct failures and mistakes.
Standby: Backups that are started up when they are needed, such as a data center with two backup generators that can each generate enough power for the entire facility.
Graceful degradation: A design that alters its services when something is wrong to prevent things from getting worse. For example, a vehicle that automatically limits speed when its engine is overheating or experiencing mechanical problems. This may allow the occupants of the vehicle to get to a safe place before the engine completely fails.
Error tolerance: Designs that continue operating when errors occur. Generally speaking, older software was often designed to halt at the first sign of an error, because engineers feared that continuing after an error might produce unpredictable results. Modern software more commonly handles exceptions without halting execution.
Monitoring: Monitoring for failures in order to implement fixes, workarounds and graceful degradation. For example, an aircraft that shuts down an engine after a bird strike to prevent it from catching fire or damaging the rest of the aircraft.
Durability: Designs that are fundamentally durable such that a wide range of stresses aren't likely to cause damage. For example, a bicycle wheel rim made of metal that can withstand forces far beyond anything typically experienced by a bicycle without bending or buckling.
Simplicity: Designs that are resilient to stress by virtue of their simplicity. For example, a city with more green space is more resilient to flooding than a concrete-laden city where water can't be absorbed by the soil.
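In software, the retry technique listed above is often written as a loop that waits a little longer after each failed attempt. The following is a minimal Python sketch, not a definitive implementation: the function name `retry`, its parameters and the doubling delay are assumptions made for illustration and do not come from the list itself.

```python
import time


def retry(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last failure
            # Wait longer after each failure: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
```

A caller would pass any zero-argument function, for example `retry(send_message)`, where `send_message` is a hypothetical function that raises an exception on a temporary failure. The `sleep` parameter exists so that tests can replace the real delay.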
Overview: Design For Failure
The practice of designing things to retain their quality in the face of failures and stresses.
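For software, the idea of continuing to operate when errors occur can be illustrated with a loop that records each failure and moves on instead of halting at the first one. This is a hedged Python sketch under stated assumptions: `process_all` and its return shape are hypothetical names chosen for the example, not part of any particular library.

```python
def process_all(items, handler):
    """Apply handler to every item, collecting failures instead of halting.

    Hypothetical helper for illustration. Returns (results, failures).
    """
    results = []
    failures = []
    for item in items:
        try:
            results.append(handler(item))
        except Exception as exc:
            # Record the failure and keep going with the remaining items.
            failures.append((item, exc))
    return results, failures
```

For instance, `process_all(["1", "x", "3"], int)` converts the two valid strings and reports `"x"` as a failure rather than stopping. Whether continuing is safe depends on the task, which is the concern that led older software to halt at the first error.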