ITIL® - Events, Incidents, Problem and Workaround
We begin defining the key terms with events, alerts and incidents.
An event is defined as an expected or unexpected change of state of an IT component that could negatively impact, or is already negatively impacting, the delivery of IT services.
Events are typically notifications created by an IT service, Configuration Item (CI) or a monitoring tool.
To establish importance, Events may be classified as:
- Events that indicate a normal operation, otherwise known as Informational event types. For example, a user logging on to use an application.
- Events that indicate an abnormal operation, otherwise known as Exception event types. For example, a user trying to log on to an application with an incorrect password, or a PC scan that reveals the installation of unauthorized software.
- Events that signal an unusual but not exceptional operation, otherwise known as Warning event types. These may indicate that the situation requires closer supervision. For example, the utilization of a server's memory comes within five per cent of its highest acceptable level.
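The memory utilization example above can be sketched in code. This is only a minimal illustration, not part of ITIL itself: the function name `classify_memory_event`, the 5-point warning margin read as percentage points, and the rule that a reading above the acceptable level counts as an Exception event are all assumptions made for this sketch.

```python
from enum import Enum


class EventType(Enum):
    INFORMATIONAL = "informational"  # normal operation
    WARNING = "warning"              # unusual, but not exceptional
    EXCEPTION = "exception"          # abnormal operation


def classify_memory_event(used_pct: float,
                          max_acceptable_pct: float,
                          warning_margin_pct: float = 5.0) -> EventType:
    """Classify a memory-utilization reading (all values in percent).

    Assumption for this sketch: readings above the highest acceptable
    level are Exception events; readings within `warning_margin_pct`
    percentage points of that level are Warning events.
    """
    if used_pct > max_acceptable_pct:
        return EventType.EXCEPTION
    if used_pct >= max_acceptable_pct - warning_margin_pct:
        return EventType.WARNING
    return EventType.INFORMATIONAL
```

With a highest acceptable level of 90 per cent, a reading of 50 would be classified as Informational, 86 as Warning and 95 as Exception. A real monitoring tool would have its own threshold configuration.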
But what are an alert and an incident, and what is the relationship between the three?
We will look at this in the next slide.
Alerts & Incidents
An alert is a warning that a threshold has been reached, something has changed, or a failure has occurred.
Alerts are often created by system management tools and are handled by the Event Management process.
The objective is to notify the relevant stakeholders so that action can be taken to correct the situation.
An Incident, on the other hand, refers to an unplanned interruption to an IT service or a reduction in the quality of an IT service.
An incident can also be a failure of an IT component that has not yet affected service but is likely to disrupt it if left unchecked. Such incidents can be raised by IT support teams. For example, the failure of one server in a cluster.
So how are events, alerts and incidents related?
All alerts and incidents are events, but not all events are alerts, and fewer still are incidents.
We will see how these are handled in their respective processes, namely Event Management and Incident Management.
This brings us to another key term in service operation, the service request.
A service request is a generic description for the many varying types of demands that users place upon the IT department. Many of these requests are actually small changes: low risk, frequently occurring and low cost.
Because of their scale, frequency and low-risk nature, they are better handled by a separate process, typically request fulfillment, rather than being allowed to congest and obstruct the normal Incident and Change Management processes.
Some common examples are:
- A request to change a password or unlock an account.
- A request to install an approved software application onto a particular PC.
- A request to relocate desktop equipment.
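As a rough sketch of keeping such requests out of the Incident and Change Management queues, a service desk tool might route them by category. The category names, the queue names and the `route_demand` function below are hypothetical and are chosen only to mirror the examples above.

```python
# Hypothetical category names mirroring the examples in the list above.
STANDARD_SERVICE_REQUESTS = frozenset({
    "password_change",
    "account_unlock",
    "approved_software_install",
    "desktop_relocation",
})


def route_demand(category: str) -> str:
    """Return the name of the queue a user demand should go to.

    Assumption for this sketch: anything that is not a recognised
    standard request is left for a person to triage.
    """
    if category in STANDARD_SERVICE_REQUESTS:
        return "request_fulfillment"
    return "manual_triage"
```

Here `route_demand("account_unlock")` returns `"request_fulfillment"`, while an unrecognised category such as `"email_outage"` falls through to `"manual_triage"`.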
So what happens when we have one or more incidents to resolve but have not found an underlying cause?
That means we have a problem! Let us understand this key term in our next slides.
Problem & Workaround
In the context of ITIL, a problem is defined as the cause of one or more incidents. This cause is usually not known at the time the problem is recorded.
The process for managing problems is called Problem Management. Problems are typically classified and prioritized in the same way as incidents.
And what's a workaround?
A workaround is a temporary way to restore a failed service to a usable level. For example, consider a server that hangs: we may not know why the server failed, but if we reboot it, the service will be up again.
Workarounds are used to reduce or eliminate the impact of an incident or problem for which a full resolution is not yet available.
Workarounds can be found while trying to resolve incidents or problems. Workarounds for incidents that do not have associated problem records are documented in the incident record, and workarounds for problems are documented in known error records.
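The documentation rule above can be illustrated with a small sketch. The record classes, their field names and the `document_workaround` function are hypothetical stand-ins for whatever the service management tool actually provides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IncidentRecord:
    incident_id: str
    problem_id: Optional[str] = None  # set when a problem record is linked
    workarounds: List[str] = field(default_factory=list)


@dataclass
class KnownErrorRecord:
    problem_id: str
    workarounds: List[str] = field(default_factory=list)


def document_workaround(incident: IncidentRecord,
                        known_errors: Dict[str, KnownErrorRecord],
                        workaround: str) -> str:
    """Store a workaround and return which kind of record received it."""
    if incident.problem_id is None:
        # No associated problem record: document it on the incident itself.
        incident.workarounds.append(workaround)
        return "incident_record"
    # A problem record exists: document it in that problem's known error record.
    record = known_errors.setdefault(
        incident.problem_id, KnownErrorRecord(incident.problem_id))
    record.workarounds.append(workaround)
    return "known_error_record"
```

For the server example, calling `document_workaround` for an incident with no linked problem stores "Reboot the server" on the incident record. Once a problem record is linked, the same call stores it in the known error record instead.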
In case you are wondering what incident and problem records are: they are the records of incidents or problems created in the service management tool.
So do all problems have to be solved? Can they all be solved? As we said earlier, problems are prioritized in the same or a similar way as incidents. For low-priority problems, a decision may be made on whether to spend effort on finding a solution when a workaround will suffice. Using our server reboot example: if the root cause is found to be a faulty motherboard, the server is of low criticality, and a reboot takes care of the issue for a month, the organization can decide to live with it.
Also, it may not be possible to resolve an issue without a hardware or software upgrade, so once again we would have to live with a workaround or, if the service is business critical, plan for alternatives.
And what is a known error? Let's understand that in the next slide.