
Detection, Assessment & Reporting
The ongoing process of detecting incidents, recording them and communicating them to initiate incident management. The following are basic examples of this process:
Automated alerts and impact analysis | Monitoring |
Help desk accepts incident reports from users | Customer service logs customer-reported incidents |
Determine if an incident has actually occurred | Initial incident analysis and triage |
Determine incident type, classification, priority and impact | Create an incident ticket |
Notify level 1 support | Escalate high-impact incidents immediately |
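The triage steps above (classify the incident, set impact and priority, create a ticket, notify level 1 support, escalate high-impact incidents) can be sketched in code. This is a minimal illustration only: the `Impact` levels, the user-count thresholds, the priority scale and the notification targets are all assumptions made for the example, not part of any standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class IncidentTicket:
    summary: str
    incident_type: str
    impact: Impact
    priority: int  # 1 is the most urgent (assumed scale)


def triage(summary, incident_type, affected_users):
    """Classify a confirmed incident and decide who to notify."""
    # Hypothetical thresholds; a real organization defines its own.
    if affected_users >= 1000:
        impact = Impact.HIGH
    elif affected_users >= 50:
        impact = Impact.MEDIUM
    else:
        impact = Impact.LOW

    priority = 4 - impact  # HIGH -> 1, MEDIUM -> 2, LOW -> 3
    ticket = IncidentTicket(summary, incident_type, impact, priority)

    # Level 1 support is always notified; high impact escalates immediately.
    notify = ["level-1-support"]
    if impact is Impact.HIGH:
        notify.append("incident-manager")
    return ticket, notify
```

For example, `triage("Checkout down", "outage", 5000)` yields a priority 1 ticket and a notification list that includes the hypothetical `incident-manager` target in addition to level 1 support.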
Incident Communication
The process of owning the incident, monitoring responses and providing timely and targeted communications to all stakeholders.
Initial notification | Stakeholder analysis |
Impact analysis | Proactively notifying impacted stakeholders |
Contact management processes | Monitoring incident resolution and providing updates |
Tailoring updates for each stakeholder | Maintaining confidentiality where appropriate |
User and customer service | Collecting feedback from users and customers |
Communicate issue closure | Business apologies as appropriate |
Communicate steps that will be taken to prevent future incidents | Post-incident updates to stakeholders on the problem management and continuous improvement steps taken |
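Tailoring updates for each stakeholder while maintaining confidentiality can be illustrated with a short sketch. The `Stakeholder` fields, the role names and the rule that only engineers see internal details are hypothetical choices for this example.

```python
from dataclasses import dataclass


@dataclass
class Stakeholder:
    name: str
    role: str       # assumed roles: "customer", "executive", "engineer"
    impacted: bool


def build_update(stakeholder, status, internal_details):
    """Write a status update suited to the stakeholder's role."""
    message = f"Hello {stakeholder.name}, incident status: {status}."
    # Confidential technical details go only to engineers (assumed policy).
    if stakeholder.role == "engineer":
        message += f" Details: {internal_details}"
    return message


def notify_impacted(stakeholders, status, internal_details):
    """Proactively build updates for impacted stakeholders only."""
    return {
        s.name: build_update(s, status, internal_details)
        for s in stakeholders
        if s.impacted
    }
```

Here, stakeholders who are not impacted receive nothing, and customers receive the status without the internal details.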
Response & Resolution
The process of investigating, containing, mitigating and resolving an incident.
Incident containment | Troubleshooting |
Planning and approvals for changes | Escalating to other levels of support |
Engaging individuals and teams to help | Applying fixes or mitigating the incident |
Workarounds | Updating status regularly |
Incident resolution |
Problem Management
The process of investigating the root cause of incidents to implement sustainable long-term fixes. This is typically less urgent than incident response but far more extensive, and it can span months where large changes are required.
Problem identification | Problem assessment including initial priority and categorization |
Create problem ticket | Problem investigation |
Root cause analysis | Develop a plan to fix the problem |
Implement plan | Deployment, verification and testing |
Problem closure |
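One way to approach problem identification is to look for categories of incident that keep recurring and open a problem ticket for each. The sketch below assumes incidents already carry a category label; the threshold of three and the ticket fields are hypothetical.

```python
from collections import Counter

RECURRENCE_THRESHOLD = 3  # hypothetical cut-off for "recurring"


def find_problem_candidates(incident_categories, threshold=RECURRENCE_THRESHOLD):
    """Return categories that occurred at least `threshold` times, sorted by name."""
    counts = Counter(incident_categories)
    return sorted(cat for cat, n in counts.items() if n >= threshold)


def create_problem_ticket(category, incident_count):
    """Create a problem ticket; the root cause is filled in after investigation."""
    return {
        "title": f"Recurring incidents: {category}",
        "status": "open",
        "linked_incidents": incident_count,
        "root_cause": None,
    }
```

Counting by category is only a starting point. A single severe incident can also justify a problem ticket, which a frequency threshold alone would miss.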
Continuous Improvement
The process of improving the incident response process. This often means creating standard operating procedures for troubleshooting and resolving recurring incidents.
Incident review | Feedback from stakeholders |
Post-incident analysis | Improvements to documents |
Improvements to alerts and monitoring | Improvements to processes and training |
Lessons learned – what worked, what didn’t work | Documenting steps to handle similar incidents in future |
Overview: Incident Management
Type |
Definition | The time-sensitive process of resolving operational issues. |
Related Concepts |