Engineering
The read-and-write head of a new hard drive model fails in accelerated life testing. The root cause is determined to be a flaw in the device's caching algorithm, which failed to cache data properly and so did not reduce the load on hardware under extreme test scenarios.
Software
A bank experiences a problem whereby about 5% of customers get error messages when paying a bill online. Developers investigate and determine that the cause is customer account data that isn't formatted as the code expects. The quality assurance team does an analysis and determines that the data is acceptable to the business and that the software should handle such data variations. The root cause is labeled a software bug.
Latent Human Error
A grocery store accidentally orders 1,000 bags of apples when it requires only 100. The order was entered incorrectly and the supplier won't take the apples back. The store needs to aggressively discount and advertise to sell the apples at a loss. The issue is initially considered human error. A root cause analysis process discovers latent human error in ordering systems. For example, there is no validation or warning for unusually large orders. Also, fonts on the system are abnormally small and difficult for some employees to read clearly.
Availability
A media company's website has availability of 97%, whereas its peers commonly achieve 99.99%. Each time the website goes down, the outage is attributed to a cause such as a failed change, human error, data issues or service crashes. The company performs a gap analysis to discover the root causes of these failures. The report finds that the website's code, platform, infrastructure and development processes all have issues that are creating an environment of instability. For example, the company has outsourced development to a firm with a high turnover rate. That firm regularly moves employees around such that each developer works on the code for only 2 weeks on average. Developers are unfamiliar with the platform, resulting in bugs and increasingly smelly code.
Information Security
A government department experiences an information security incident after an employee clicks on a link in an email. The direct cause is reported as human error, as the employee was trained not to click on links from external emails. The root causes include that the email wasn't stopped by spam filters and that the employee's machine wasn't updated with recent patches, allowing a vulnerability in its operating system to be exploited.
Safety
An aircraft lands short of a runway in shallow water. The passengers are rescued. Initially the cause appears to be human error. A safety investigation confirms pilot error as the root cause.
Overview: Root Cause
Definition | The fundamental, underlying, deep or initial causes of an event.
Value | Resolving problems such that they don't reoccur.