Service Operation Processes Tutorial

3.1 Service Operation Processes

Learning Unit 03 discusses the Service Operation processes and covers each of them in detail. Let us look at the agenda for this Learning Unit in the next slide.

3.2 Service Operation Processes

This Learning Unit includes the five processes that are part of the Service Operation publication: Event Management, Incident Management, Problem Management, Request Fulfillment and Access Management. Let us proceed to the next slide and look at the topics covered for the Event Management process.

3.3 Event Management

Let us start with a detailed discussion on the Event Management process. In the next few slides we will be covering the following topics on Event Management:
• Purpose and Objectives of Event Management
• Scope of Event Management
• Value to Business from the Event Management process
• Event Management Policies
• Basic Concepts
• Event Management process activities
• Triggers, Inputs, Outputs and Interfaces
• Critical Success Factors and Key Performance Indicators
• Challenges and Risks pertaining to Event Management

3.4 Event Management - Purpose And Objectives

We will now start by explaining the purpose and objectives of the Event Management process. The purpose of Event Management is to manage events throughout their lifecycle. This lifecycle includes detecting events, making sense of them and determining the appropriate control action. Event Management is responsible for coordinating these activities. Event Management also forms the basis for operational monitoring and control, and can help in automating many routine operations management activities. The objectives of the Event Management process are:
• To detect all changes of state that have significance for the management of a Configuration Item (CI) or an IT service;
• To determine the appropriate control action for events and ensure these are communicated to the appropriate functions;
• To provide the trigger, or entry point, for the execution of many Service Operation processes and operations management activities;
• To provide the means to compare actual operating performance and behaviour against design standards and Service Level Agreements (SLAs); and
• To provide a basis for service assurance and reporting, and service improvement.
In the next slide we will discuss the scope of Event Management.

3.5 Event Management - Scope

This slide explains the scope of Event Management in Service Operation. Identifying the scope of Event Management means identifying where it can be used, on the basis of various triggers. One of these is the set of identified CIs. This raises a question: how do we identify the CIs that we will monitor? The answer is provided by Service Design and Service Transition. During transition itself you will identify all the important CIs and key elements that will need monitoring once the service is live. Event Management will continuously monitor the service on these items and generate notifications. The set of activities followed by the team is the Event Management process. Event Management can be described as the process that monitors all events that occur through the IT infrastructure to allow for normal operation and also to detect and escalate exception conditions. There are more factors that contribute to the scope of Event Management. Let us learn about them. Event Management is used to check environmental conditions, for example fire and smoke detection, which is crucial to our business contingency plans. Bigger organizations monitor software license usage to ensure optimum and legal license utilization and allocation. Event Management can also be used for security checks such as intrusion detection, access history or records of access to financial data. It can also be used to track normal activity, for example tracking the use of an application or the performance of a server.

3.6 Event Management - Value To Business

It is widely accepted that implementing the ITIL framework and processes leads to significant benefits for service providers as well as the business. We will now take a look at the value the business derives from the Event Management process.
• Event Management provides a mechanism for early detection of incidents. Present-day event management tools can proactively detect incidents and assign them to designated groups for action before any actual service outage occurs.
• With Event Management it is possible to automate infrastructure, application or service monitoring and generate warnings and exceptions. This helps in reducing or removing the need for resource-intensive real-time monitoring.
• When integrated with other processes, Event Management improves their performance through early response to status changes and exceptions. Incident Management, Capacity Management and Availability Management are some of the processes that benefit from Event Management. This in turn leads to more efficient and effective overall service management.
• Event Management provides a basis for automated operations, which increases efficiency and frees human resources for more innovative activities.
• Event Management can directly contribute to improving service delivery and customer satisfaction. Proactive notifications result in early detection of incidents, which helps in faster resolution and reduced service outages.

3.7 Event Management - Policies

A policy is a formally documented statement of management expectations and intentions. Policies are used to direct decisions, and to ensure consistent and appropriate development of processes and standards. We will now look at some examples of Event Management related policies.
• ‘Event notifications should only go to those responsible for the handling of their further actions or decisions related to them’. This policy ensures that Event Management identifies the right groups or individuals to receive the notifications, and that the event routeing information is kept up to date to reflect new events and personnel changes.
• ‘Event management and support should be centralized as much as reasonably possible’. This policy helps in avoiding conflicts in the management of events and ensures that support personnel do not receive notifications for events they are not yet prepared to handle.
• ‘All application events should utilize a common set of messaging and logging standards and protocols wherever possible’. This policy helps in maintaining consistency in event handling, which in turn leads to faster implementation of new events and their response actions.
• ‘Event handling actions should be automated wherever possible’. This policy ensures consistent handling of events and a reduction in the human effort needed to manage them.
• ‘A standard classification scheme should be in place that references common handling and escalation processes’. This policy establishes a systematic approach for responding to events in line with operational and service level objectives.
• ‘All recognized events should be captured and logged’. This policy ensures that a mechanism is in place for logging all events.

3.8 Event Management - Basic Concepts

Let us now look into the concept of Event Management. To understand the basic concepts of Event Management we first need to understand what an event is. Let us understand it better by walking through the activities. Once the service is live and operational, we need to have certain monitoring measures in place to ensure that the service is performing its tasks without any issues. These monitoring measures are taken care of by the Event Management process. An Event is defined as any detectable or discernible occurrence that has significance for the management of the IT infrastructure or the delivery of IT service, and evaluation of the impact a deviation might cause to the services. During transition itself you will identify all the important CIs and key elements that will need monitoring once the service is live. Event Management will continuously monitor the service on these items and generate notifications. The set of activities followed by the team is the Event Management process. Event Management can be described as the process that monitors all events that occur through the IT infrastructure to allow for normal operation and also to detect and escalate exception conditions. In the next slide let us learn about the categorization of events.

3.9 Event Management - Basic Concepts

Now let us look at the different types of events. Events can be classified into informational events, warning events and exception events. This classification is essential to identify the significance of, and define a suitable course of action for, the events generated in the IT environment. Events that signify regular operation fall under the category of Informational Events. An email reaching the intended recipient or a scheduled job completing normally are examples of informational events. Events that signify an unusual, but not exceptional, occurrence may be categorized as Warning Events. A database nearing its threshold limit or a transaction taking 10% more time to complete are examples in this category. Events that signify an exception or the exceeding of set limits are called Exception Events. A scheduled job failing with errors or a device’s CPU running above the expected utilization rate are examples of exception events.
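
The three categories can be sketched in a few lines of Python. This is only an illustration: the `Thresholds` type, the `classify` function and the 80/95 limits are assumptions made for the example, not values defined by ITIL; in practice the limits would come from Service Design.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    EXCEPTION = "exception"


@dataclass
class Thresholds:
    # Hypothetical limits for one monitored value (for example, % of database space used).
    warning: float     # reading at which the occurrence is unusual but not exceptional
    exception: float   # reading at which the set limit has been exceeded


def classify(value: float, limits: Thresholds) -> EventType:
    """Classify a monitored reading against its warning and exception limits."""
    if value >= limits.exception:
        return EventType.EXCEPTION
    if value >= limits.warning:
        return EventType.WARNING
    return EventType.INFORMATIONAL


# Assumed limits for a database: warn at 80% full, exception at 95% full.
db_limits = Thresholds(warning=80.0, exception=95.0)
readings = [42.0, 85.5, 97.0]
labels = [classify(r, db_limits).value for r in readings]
print(labels)
```

Checking the exception limit first keeps the categories mutually exclusive, so each reading receives exactly one label.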

3.10 Event Management - Basic Concepts

In our last two slides we discussed the scope of Event Management. Now that we have understood the scope, let us look at the desired inputs and outputs of the Event Management process. This is a generic graph presenting the outputs of Event Management. As explained earlier, on the basis of identification and checks, and after making sense of events and taking appropriate action, the event outputs are categorized into three types: informational events; warning events, which indicate that something has gone wrong but may not have a significant effect at the moment; and exception events, where the exception and its associated actions need to be checked. Let us now move on to the next slide, which talks about handling exceptions in the process.

3.11 Event Management - Basic Concepts

In our last slide we discussed the inputs and outputs of the Event Management process. This slide explains how exceptions are handled in the process. Some events represent a situation where the appropriate response needs to be handled through the Incident, Problem or Change Management process. It is important to note here that a single Incident may initiate any one or a combination of these three processes. For example, a non-critical server failure is logged as an Incident, but as there is no workaround, a Problem Record is created to determine the root cause and resolution, and a Request for Change (RFC) is logged to relocate the workload onto an alternative server while the problem is resolved.
There are two situations in the Event Management process where an RFC can be created.
The first is when an exception occurs. Let us understand this with an example: a scan of a network segment reveals that two new devices have been added without the necessary authorization. One way of dealing with this situation is to open an RFC, which can be used as a vehicle for the Change Management process to deal with the exception (as an alternative to the more conventional approach of opening an Incident that would be routed via the Service Desk to Change Management). Investigation by Change Management is appropriate here, since unauthorized changes imply that the Change Management process was not effective.
The second is when correlation identifies that a change is needed. In this case the event correlation activity determines that the appropriate response to an event is for something to be changed. For example, a performance threshold has been reached and a parameter on a major server needs to be tuned. How does the correlation activity determine this? It was programmed to do so, either during Service Design or because this has happened before and Problem Management or Operations Management updated the Correlation Engine to take this action.
Now let us understand the recording process for each occurrence.
Open an Incident Record: As with an RFC, an Incident can be generated immediately when an exception is detected, or when the Correlation Engine determines that a specific type or combination of events represents an Incident. When an Incident Record is opened, as much information as possible should be included, with links to the events concerned and, if possible, a completed diagnostic script.
Open or link to a Problem Record: It is rare for a Problem Record to be opened without related incidents (this can happen, for example, as a result of a Service Failure Analysis or maturity assessment, or because of a high number of retry network errors, even though a failure has not yet occurred). In most cases this step refers to linking an Incident to an existing Problem Record. This helps the Problem Management teams to reassess the severity and impact of the problem, and may result in a changed priority for an outstanding problem.
Let us now move on to learn about handling warnings and informational events.

3.12 Event Management - Basic Concepts

In our last slide we discussed managing exceptions. Handling events includes ignoring some events, recording and raising warnings for others, and taking action by raising an Incident or Problem record for others still. This slide explains the handling of warnings and informational events. The action taken by Event Management depends on the event type. In the case of an informational event, Event Management simply logs it and stores it for future reference; no action is taken. In the case of a warning, Event Management needs to check the warning against its significance and take appropriate action. For example, certain warnings occur frequently but do not have a significant effect on the service, such as a server bounce. In such cases an automated process generates alerts to notify the key people and make them aware of the warning. Here human intervention is required, as the key person needs to check the warning message and troubleshoot to determine whether it is a usual occurrence or something has really gone wrong. Certain warnings are already defined as false warnings, or warnings that should be ignored; in such cases there may be an option to send an automatic response informing the intended audience that a warning was generated for this issue and that it can be ignored. Other warnings are important to note because, if they are ignored, a significant impact on the service can occur. These lead to an Incident, Problem or Change record being raised, depending on what is required to fix the issue. Now that we are aware of each type of event and how the process handles them, let us move to our next topic, the Event Management process activities.

3.13 Event Management - Process Activities

We all know that a process is a set of activities designed to achieve a specific objective. Now let us gain some knowledge of the activities performed within the Event Management process.
• The first activity within the Event Management process is the occurrence of an event. Within an IT environment there are innumerable events occurring. Only those that are of significance to IT service management will be dealt with.
• The infrastructure components or configuration items generate event notifications based on certain predefined conditions.
• The next activity is event detection. Once a notification is generated, it is detected by an agent running on the same system, or transmitted directly to a management tool specifically designed to read and interpret the notification.
• The detected event is logged as an event record in the event management tool or in the system log of the device or application that generated the event.
• Once the event is logged, a first-level correlation and filtering is performed to decide whether to communicate the event to a management tool or to ignore it. If it is ignored, no further action is taken.
• The recorded events are then categorised as Informational, Warning or Exception events based on their significance.
• A correlation engine within the event management tool performs the second-level event correlation, in which the event is compared with a set of criteria and rules in a defined order. It also determines whether further action is required.
• The next activity is to select an appropriate response for handling the event. Some of the options available are: auto response, alert and human intervention, or opening an incident record.
• All significant and exception events are formally reviewed to ensure that they have been appropriately handled. The review also helps in ensuring that the handover between Event Management and other processes took place as designed.
• All recorded events should be formally closed. Events that generated an incident, problem or change should be closed with a link to the respective records in the service management tools.
Next, we will understand the triggers, inputs, outputs and interfaces of Event Management.
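
The flow of activities above can be sketched as a small pipeline: log, filter, correlate against ordered rules, select a response, close. This is a minimal illustration only. The event codes, the `IGNORED_CODES` filter, the `RULES` table and the response names are all hypothetical; a real event management tool would load such rules from the configuration defined in Service Design.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    source: str      # configuration item that raised the notification
    code: str        # hypothetical event code
    category: str    # "informational", "warning" or "exception"
    log: List[str] = field(default_factory=list)


# Hypothetical first-level filter: event codes known to be noise.
IGNORED_CODES = {"HEARTBEAT"}

# Hypothetical correlation rules, compared in a defined order.
# Each rule is (category, event code or None for "any code", response).
RULES = [
    ("exception", "DISK_FAIL", "open_incident"),
    ("exception", None, "alert_human"),
    ("warning", None, "auto_response"),
]


def handle(event: Event) -> Optional[str]:
    """Log, filter, correlate and close one event; return the selected response."""
    event.log.append("logged")
    if event.code in IGNORED_CODES:
        event.log.append("filtered")      # ignored: no further action is taken
        return None
    response = "log_only"                 # default for informational events
    for category, code, action in RULES:
        if event.category == category and code in (None, event.code):
            response = action             # first matching rule wins
            break
    event.log.append(f"response:{response}")
    event.log.append("closed")
    return response


print(handle(Event("srv01", "DISK_FAIL", "exception")))
print(handle(Event("db01", "SPACE_80", "warning")))
print(handle(Event("srv01", "HEARTBEAT", "informational")))
```

Because the rules are evaluated in order, the specific rule for a disk failure is matched before the generic exception rule, which mirrors the idea of comparing an event against criteria in a defined order.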

3.14 Event Management - Triggers, Inputs, Outputs And Interfaces

A change in state of an infrastructure component or configuration item is the key trigger for initiating the Event Management process. Let us look at some examples of triggers for this process.
• Exceptions to CI performance as compared to design specifications or standard operating procedures;
• Exceptions to an automated procedure or process;
• An exception within a business process that is being monitored by Event Management;
• The completion of an automated task or job;
• A status change in a server or database configuration item;
• Access of an application or database by a user or automated procedure or job; and
• A device, database or application reaching or exceeding a predefined threshold of performance.
Next, let us understand the key inputs of Event Management, followed by the outputs.

3.15 Event Management - Triggers, Inputs, Outputs And Interfaces

Service Design and Service Transition provide the key inputs for the Event Management process. We will now detail some of the inputs from these two stages of the service lifecycle.
• The operational and service level requirements associated with events and their actions;
• The alarms, alerts and thresholds that have been implemented or configured for recognizing events;
• The event correlation tables, rules, event codes and automated response solutions which are designed and implemented to support Event Management activities;
• The defined roles and responsibilities for recognizing events and communicating them to those who need to handle them; and
• The operational procedures for recognizing, logging, escalating and communicating events.

3.16 Event Management - Triggers, Inputs, Outputs And Interfaces

The Event Management process outputs are:
• Events that have been communicated and escalated to those responsible for further action;
• Updated event logs containing details of events, escalations and communications made, to support diagnosis and improvement activities;
• Events that indicate an incident has occurred;
• Events indicating the potential breach of an SLA or OLA objective;
• Events and alerts that indicate the completion status of deployment, operational or other support activities; and
• An SKMS populated with event information and history.
Now let us discuss the interfaces of Event Management.

3.17 Event Management - Triggers, Inputs, Outputs And Interfaces

Event Management interacts and interfaces with a number of other ITIL processes. Let us examine some of the important ones.
• With respect to Service Level Management, Event Management can help in minimizing the impact on service level targets. This is achieved by detecting incidents and failures as early as possible and rectifying them through automated responses or routeing to relevant groups.
• Event Management plays a significant role in alerting Information Security Management to any breach or potential breach, so that it can be detected and acted upon as soon as possible.
• The Capacity and Availability Management processes determine and design the Event Management requirements for the services provided. They define the significance of events, the thresholds and the response actions required. Event Management suggests improvements to the Capacity and Availability Management processes through monitoring data and reports.
• Service Asset and Configuration Management uses Event Management to determine the status of configuration items. Alerts can be configured to notify if any unauthorised change, deletion or addition has been performed on configuration items.
• Event records and history can be a rich source of information that can be processed for inclusion in knowledge management. This helps in providing valuable inputs and feedback to the Service Design and Service Transition phases.
• Certain exception events may require changes to be implemented. These changes are routed through Change Management.
• Event management tools may be integrated with service management tools to automatically log incidents for certain categories of events. This in turn triggers the activities necessary for resolving related incidents and problems. Thus there is a tight interface between Event Management and the Incident and Problem Management processes.
• Events can be used to detect unauthorized access attempts and security breaches. Thus Event Management helps improve the efficiency and effectiveness of the Access Management process.
In the next slide we will discuss the CSFs and KPIs of Event Management.

3.18 Event Management - CSFs And KPIs

Critical success factors and key performance indicators help service providers and the business to measure the efficiency of the process and identify opportunities for improvement. We will now discuss a few examples of Event Management related critical success factors and their key performance indicators. “Detecting all changes of state that have significance for the management of CIs and IT services” can be one critical success factor. The related key performance indicator will be “the number and ratio of events compared with the number of incidents”. The second critical success factor is “Ensuring all events are communicated to the appropriate functions that need to be informed or take further control actions”. The key performance indicator will be “the number and percentage of events that required human intervention and whether this was performed”. A third example of a critical success factor for Event Management is “Providing the means to compare actual operating performance and behaviour against design standards and SLAs”. The related key performance indicators are “the number and percentage of incidents that were resolved without impact to the business”, “the number and percentage of events that resulted in incidents or changes” and “the number and percentage of events indicating performance issues”. Another example of an Event Management critical success factor is “Providing a basis for service assurance, reporting and service improvement”. The related key performance indicators are “the number and percentage of repeated or duplicated events” and “the number of events/alerts generated without actual degradation of service or functionality”. Let us now proceed to understand the challenges and risks of Event Management.
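
Several of these KPIs are simple ratios and percentages, so a short calculation may help make them concrete. The sketch below uses invented monthly counts purely for illustration; the dictionary keys and the `percentage` helper are assumptions, not names taken from any event management tool.

```python
def percentage(part: int, total: int) -> float:
    """Return part as a percentage of total (one decimal place), or 0.0 if total is zero."""
    return 0.0 if total == 0 else round(100.0 * part / total, 1)


# Hypothetical monthly counts, assumed to be exported from an event management tool.
counts = {
    "events": 2000,
    "incidents": 50,
    "needed_human_intervention": 120,
    "human_intervention_performed": 114,
    "resulted_in_incident_or_change": 64,
    "duplicates": 300,
}

kpis = {
    # Ratio of events compared with the number of incidents.
    "events_per_incident": counts["events"] / counts["incidents"],
    # Percentage of events that required human intervention.
    "pct_needing_human_intervention": percentage(
        counts["needed_human_intervention"], counts["events"]),
    # Of those, the percentage where the intervention was actually performed.
    "pct_interventions_performed": percentage(
        counts["human_intervention_performed"], counts["needed_human_intervention"]),
    # Percentage of events that resulted in incidents or changes.
    "pct_resulting_in_incident_or_change": percentage(
        counts["resulted_in_incident_or_change"], counts["events"]),
    # Percentage of repeated or duplicated events.
    "pct_duplicates": percentage(counts["duplicates"], counts["events"]),
}

for name, value in kpis.items():
    print(f"{name}: {value}")
```

Tracked over several reporting periods, a falling duplicate percentage or a rising share of performed interventions would suggest that filtering and routeing are improving.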

3.19 Event Management - Challenges And Risks

IT organizations and businesses might encounter some challenges while implementing and adhering to processes. Some of the challenges related to the Event Management process are:
• Obtaining funding for the necessary tools and for the effort needed to install them and exploit their benefits. A proper business case detailing the benefits will have to be produced to obtain the required funding.
• Setting the correct level of filtering, which is one of the biggest challenges of Event Management. Sufficient care must be taken, and all aspects considered, while determining the event filtering criteria.
• Deploying the necessary monitoring agents across the entire IT infrastructure. This could be time-consuming and might require commitment over a long period of time.
• Automated monitoring activities can generate additional network traffic that might have a negative impact on the planned capacity levels of the network.
• Acquiring the necessary skills, and deploying event management tools without setting up processes to operate them, are other challenges that may come up.

3.20 Event Management - Challenges And Risks

Risk mitigation and management is essential for the success of any process implementation. The three important risks pertaining to the Event Management process are:
• The risk of failure to obtain adequate funding;
• The risk of not setting the correct level of filtering; and
• The risk of failure to maintain momentum in deploying the necessary monitoring agents across the IT infrastructure.
These risks need to be addressed by the service management teams to minimize the adverse impact on the success of the Event Management process. Lastly, we will discuss the roles involved in Event Management in the next slide.

3.21 Event Management - Roles

This slide talks about the roles involved in Event Management. Event Management does not need dedicated people. The role can be mapped as follows:
• In many organizations, the Service Desk performs the Event Management process as part of its role.
• In some organizations, Event Management could be part of the role of the Technical Management, Application Management or IT Operations Management functions.
This depends entirely on the organization's discretion and its requirement for the Event Management process in a specific area.

3.25 Incident Management

We will now move on to the Incident Management process. In the next few slides we shall be discussing the following topics:
• The Purpose and Objectives of Incident Management
• Scope of Incident Management
• Value to Business from Incident Management
• Incident Management Policies
• Basic Concepts
• Incident Management Process Activities
• Triggers, Inputs, Outputs and Interfaces
• Critical Success Factors and Key Performance Indicators; and finally
• Challenges and Risks related to Incident Management.
Let us begin with the purpose and objectives of Incident Management.

3.26 Incident Management - Purpose And Objectives

Users and customers might face a number of issues such as disruption in service, degradation in the quality of service or failure of certain components. These may have a direct impact on business activities and outcomes. The Incident Management process is concerned with handling these types of issues and resolving them as soon as possible. As a first step, let us try to understand the purpose and objectives of Incident Management. The key purpose of the Incident Management process is to restore normal service operation as quickly as possible and minimize the adverse impact on business operations. In doing so, Incident Management ensures that the agreed levels of service quality are maintained. Here ‘agreed levels’ refers to the levels documented and accepted in the Service Level Agreement. The objectives of the Incident Management process are:
• To ensure that standardized methods and procedures are used for efficient and prompt response, analysis, documentation, ongoing management and reporting of incidents;
• To increase visibility and communication of incidents to business and IT support staff;
• To enhance the business perception of IT by adopting a professional approach in resolving incidents and communicating status updates;
• To align Incident Management activities and priorities with those of the business; and
• To maintain user satisfaction with the quality of IT services.
In the next slide let us understand the scope of Incident Management.

3.27 Incident Management - Scope

This slide explains the scope of the Incident Management process. Incident Management includes any event which disrupts, or which could disrupt, a service. This includes events which are communicated directly by users, either through the Service Desk or through an interface from Event Management to Incident Management tools. Incidents can also be reported and/or logged by technical staff: for example, if they notice something untoward with a hardware or network component, they may report or log an Incident and refer it to the Service Desk. This does not mean, however, that all events are incidents. Many classes of events are not related to disruptions at all, but are indicators of normal operation or are simply informational. Although both incidents and service requests are reported to the Service Desk, this does not mean that they are the same. Service requests do not represent a disruption to agreed service, but are a way of meeting the customer’s needs and may be addressing an agreed target in an SLA. Service requests are dealt with by the Request Fulfillment process.

3.28 Incident Management - Value To Business

This slide explains the value of Incident Management to the business. The value of Incident Management includes the ability to detect and resolve incidents, which results in lower downtime for the business and, in turn, higher availability of the service. This means that the business is able to exploit the functionality of the service as designed. It also includes the ability to align IT activity with real-time business priorities, because Incident Management includes the capability to identify business priorities and dynamically allocate resources as necessary. Incident Management includes the ability to identify potential improvements to services. This happens as a result of understanding what constitutes an Incident and also from being in contact with the activities of business operations staff. While handling incidents, the Service Desk can also identify additional service or training requirements in IT or the business. Incident Management is highly visible to the business, and it is therefore easier to demonstrate its value than that of most areas in Service Operation. For this reason, Incident Management is often one of the first processes to be implemented in Service Management projects. The added benefit of doing this is that Incident Management can be used to highlight other areas that need attention, thereby providing a justification for expenditure on implementing other processes.

3.29 Incident Management - Policies

ITIL encourages the establishment of policies and procedures for all processes required to manage IT services. We will now discuss some of the Incident Management related policies.
• One key policy of Incident Management is that ‘incidents and their status must be communicated in a timely and effective manner’. A prerequisite for this is a well-established service desk function. Communication should be in business language without any technical jargon and should be sent to all those who are impacted by the incident.
• ‘Incidents must be resolved within timeframes acceptable to the business’. This policy helps in ensuring that all relevant service level agreements, operational level agreements and underpinning contracts are in place.
• Another important policy is to ‘maintain customer satisfaction at all times’. This implies that the support staff should have adequate technical skills and a customer-oriented approach to ensure that customer expectations are met.
• ‘Incident processing and handling should be aligned with overall service levels and objectives’. This policy helps in prioritizing activities based on business requirements.
• A policy that ‘all incidents should be stored and managed in a single management system’ ensures that a definitive, recognized source of incident information is available for analysis, investigation and reporting purposes.
• ‘All incidents should subscribe to a standard classification schema that is consistent across the business enterprise’. This policy is essential to ensure that consistent classification methods are adopted and that IT support teams are aware of these classifications.
• A very important policy that needs to be enforced is that ‘incident records should be audited on a regular basis to ensure they have been entered and categorized correctly’. This ensures that incident information is accurate, correctly categorized and can be trusted by other support teams.
• ‘A common and agreed set of criteria for prioritizing and escalating incidents should be in place wherever possible’. This policy implies that criteria for prioritization and escalation are established, communicated and agreed by both IT and the business.
Let us now proceed to understand the basic concepts and terms used in Incident Management.

3.30 Incident Management - Basic Concepts

Let us start with understanding the basic concepts of incident management. In ITIL terminology, an ‘Incident’ is defined as an unplanned interruption to an IT service or a reduction in the quality of an IT service. Failure of a configuration item that has not yet impacted service is also an Incident, for example failure of one disk from a mirror set. In other words, something unusual has occurred that either affects the service already or could affect it if ignored. A reduction in quality of service can be illustrated by a server slowdown that leads to a complete shutdown if ignored. Incident Management is the process for dealing with all incidents; this can include failures, questions or queries reported by the users (usually via a telephone call to the Service Desk), by technical staff, or automatically detected and reported by Event Monitoring tools. Let us continue with the basic concepts of incident management in the next slide.

3.31 Incident Management - Basic Concepts

In the last slide we learnt about incidents and the Incident Management process. There are some basic things that need to be taken into account when making decisions about Incident Management. Let us see what they are. Timescales must be agreed for all Incident-handling stages. They depend on the priority level of the incident, are based upon the overall Incident response and resolution targets within SLAs, and are captured as targets within OLAs and Underpinning Contracts (UCs). All support groups should be made fully aware of these timescales. Many incidents are not new; they involve dealing with something that has happened before and may well happen again. For this reason, many organizations will find it helpful to pre-define ‘standard’ Incident Models and apply them to appropriate incidents when they occur. An Incident Model is a way of pre-defining the steps that should be taken to handle a process (in this case a process for dealing with a particular type of Incident) in an agreed way. Incidents which require specialized handling can be treated in this way (for example, security-related incidents can be routed to Information Security Management, and capacity- or performance-related incidents can be routed to Capacity Management). The Incident Model should include: • The steps that should be taken to handle the Incident • The chronological order these steps should be taken in, with any dependencies or co-processing defined • Responsibilities; who should do what • Timescales and thresholds for completion of the actions • Escalation procedures; who should be contacted and when • Any necessary evidence-preservation activities, which are particularly relevant for security- and capacity-related incidents. The models should be input to the Incident-handling support tools in use and the tools should then automate the handling, management and escalation of the process.
Where impact and urgency are highest, a separate procedure with shorter timescales must be used for ‘major’ incidents. A definition of what constitutes a major Incident must be agreed and ideally mapped on to the overall Incident prioritization system, so that such incidents are dealt with through the major Incident process. People sometimes confuse a major Incident with a problem. In reality, an Incident remains an Incident forever. It may grow in impact or priority to become a major Incident, but an Incident never ‘becomes’ a problem. A problem is the underlying cause of one or more incidents and always remains a separate entity. Where necessary, the major Incident procedure should include the dynamic establishment of a separate major Incident team under the direct leadership of the Incident Manager, formed to concentrate on this Incident alone so that adequate resources and focus are provided for finding a swift resolution. If the Service Desk Manager is also fulfilling the role of Incident Manager (say in a small organization), then a separate person may need to be designated to lead the major Incident investigation team, so as to avoid conflict of time or priorities, but that person should ultimately report back to the Incident Manager. Throughout, the Service Desk would ensure that all activities are recorded and users are kept fully informed of progress.
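As an illustration only, the contents of an Incident Model described in this slide (steps, their order and dependencies, responsibilities, timescales, escalation contacts and evidence preservation) could be captured in a small data structure. This is a minimal sketch; the class names, field names and the sample security model are hypothetical and do not come from ITIL or from any particular support tool.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class Step:
    order: int                  # chronological position of the step
    action: str                 # what should be done
    responsible: str            # who should do it
    time_limit: timedelta       # timescale/threshold for completion
    depends_on: list = field(default_factory=list)  # orders of prerequisite steps


@dataclass
class IncidentModel:
    name: str
    steps: list
    escalation_contacts: list        # who should be contacted, in order
    preserve_evidence: bool = False  # e.g. for security-related incidents

    def overdue_steps(self, elapsed):
        """Return the steps whose elapsed time exceeds their threshold."""
        return [s for s in self.steps
                if elapsed.get(s.order, timedelta(0)) > s.time_limit]


# Hypothetical model for a security-related incident.
security_model = IncidentModel(
    name="security-incident",
    steps=[
        Step(1, "Isolate affected account", "Service Desk",
             timedelta(minutes=15)),
        Step(2, "Route to Information Security Management", "Service Desk",
             timedelta(minutes=30), depends_on=[1]),
    ],
    escalation_contacts=["Incident Manager"],
    preserve_evidence=True,
)

late = security_model.overdue_steps(
    {1: timedelta(minutes=20), 2: timedelta(minutes=10)}
)
print([s.order for s in late])
```

A support tool holding models in a form like this could compare elapsed time against each threshold and trigger the escalation contacts automatically, which is the kind of automation the slide refers to.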

3.32 Incident Management - Process Flow

This slide explains how the Incident Management process flows. The process starts with Incident identification: work cannot begin on dealing with an Incident until it is known that an Incident has occurred. As far as possible, all key components should be monitored so that failures or potential failures are detected early and the Incident Management process can be started quickly. Once an Incident has been identified, it has to be logged. This is Incident logging. All incidents must be fully logged and date/time stamped, regardless of whether they are raised through a Service Desk telephone call or automatically detected via an event alert. The information needed for each Incident is likely to include: • Unique reference number • Incident categorization (often broken down into between two and four levels of sub-categories) • Incident urgency • Incident impact • Incident prioritization • Date/time recorded • Name/ID of the person and/or group recording the Incident • Description of symptoms • Activities undertaken to resolve the Incident • Resolution date and time • Closure category, date and time. Once logged, the Incident needs to be categorized. Part of the initial logging must be to allocate suitable Incident categorization coding so that the exact type of the call is recorded. This will be important later when looking at Incident types/frequencies to establish trends for use in Problem Management, Supplier Management and other ITSM activities. All organizations are unique and it is therefore difficult to give generic guidance on the categories an organization should use, particularly at the lower levels. Sometimes the details available at the time an Incident is logged may be incomplete, misleading or incorrect.
It is therefore important that the categorization of the Incident is checked, and updated if necessary, at call closure time (in a separate closure categorization field, so as not to corrupt the original categorization). Another important aspect of logging every Incident is to agree and allocate an appropriate priority code, as this will determine how the Incident is handled both by support tools and support staff. Prioritization can normally be determined by taking into account both the urgency of the Incident (how quickly the business needs a resolution) and the level of impact it is causing. An indication of impact is often (but not always) the number of users being affected. If the Incident has been routed via the Service Desk, the Service Desk Analyst must carry out initial diagnosis, typically while the user is still on the telephone, to try to discover the full symptoms of the Incident and to determine exactly what has gone wrong and how to correct it. It is at this stage that diagnostic scripts and Known Error information can be most valuable in allowing early and accurate diagnosis. As soon as it becomes clear that the Service Desk is unable to resolve the Incident itself (or when target times for first-point resolution have been exceeded, whichever comes first), the Incident must be immediately escalated for further support. If the organization has a second-level support group and the Service Desk believes that the Incident can be resolved by that group, it should refer the Incident to them. A key point to note here is that Incident ownership remains with the Service Desk. Regardless of where an Incident is referred to during its life, ownership of the Incident remains with the Service Desk at all times. The Service Desk remains responsible for tracking progress, keeping users informed and ultimately for Incident closure.
If incidents are of a serious nature (for example Priority 1 incidents), hierarchic escalation is used so that the appropriate IT managers are notified, for informational purposes at least. Hierarchic escalation can also be initiated by the affected users or by customer management, which is why it is important that IT managers are made aware so that they can anticipate and prepare for any such escalation. Each of the support groups involved with the Incident handling will investigate and diagnose what has gone wrong, and all such activities (including details of any actions taken to try to resolve or re-create the Incident) should be fully documented in the incident record so that a complete historical record of all activities is maintained at all times. This investigation is likely to include such actions as: • Establishing exactly what has gone wrong or what is being sought by the user • Understanding the chronological order of events • Confirming the full impact of the Incident, including the number and range of users affected • Identifying any events that could have triggered the Incident (e.g. a recent change or some user action) • Knowledge searches looking for previous occurrences by searching previous Incident/Problem Records and/or Known Error Databases or manufacturers’/suppliers’ Error Logs or Knowledge Databases. When a potential resolution has been identified, this should be applied and tested. The specific actions to be undertaken and the people who will be involved in taking the recovery actions may vary, depending upon the nature of the fault, but could involve: • Asking the user to undertake directed activities on their own desktop or remote equipment • Specialist support groups being asked to implement specific recovery actions (e.g. Network Support reconfiguring a router) • A third-party supplier or maintainer being asked to resolve the fault.
Even when a resolution has been found, sufficient testing must be performed to ensure that the recovery action is complete and that the service has been fully restored to the user(s). Once the solution has been applied and recovery is confirmed, the Service Desk should check that the Incident is fully resolved and that the users are satisfied and willing to agree that the Incident can be closed. The Service Desk should also: • Check and confirm that the initial Incident categorization was correct or, where it was incorrect, update the record so that a correct closure categorization is recorded for the Incident, seeking advice or guidance where necessary • Carry out a user satisfaction call-back or e-mail survey for the agreed percentage of incidents • Chase any outstanding details and ensure that the Incident Record is fully documented so that a full historic record at a sufficient level of detail is complete • In case of an ongoing or recurring problem, determine with the teams whether it is likely that the Incident could recur and decide whether any preventive action is necessary to avoid this, raising a Problem Record in all such cases to initiate preventive action • Lastly, formally close the Incident Record.
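The process flow above says that priority is normally determined from both impact and urgency. One simple way to sketch that idea is a lookup that combines the two into a priority code. The three-level scale and the resulting codes 1 to 5 are an illustrative assumption; each organization agrees its own levels and codes.

```python
# Hypothetical three-level scale; lower number means more severe.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact: str, urgency: str) -> int:
    """Combine impact and urgency into a priority code from 1 (highest) to 5."""
    return LEVELS[impact] + LEVELS[urgency] - 1


for impact, urgency in [("high", "high"), ("high", "low"), ("low", "low")]:
    print(impact, urgency, priority(impact, urgency))
```

With this scheme an incident with high impact and high urgency receives priority 1, while one with low impact and low urgency receives priority 5. A definition of a major Incident could then be mapped on to the top of such a scale.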

3.33 Incident Management - Triggers, Inputs, Outputs And Interfaces

Incidents may be reported by different people and logged in different ways. Let us see the key triggers for incidents now. • Incidents may be reported by a user through a call to service desk or might log the incident in a web-based application or tool. • Incidents may be raised automatically via event management tools whenever there is event of the type ‘warning’ or ‘exception’. • Technical staff may notice potential failures and report them to service desk or log an incident in the incident management tool directly. • Incidents may also be initiated by suppliers when they identify some potential issues or actual difficulty with respect to IT infrastructure, application or service. Let’s understand the key inputs and outputs of incident management in the next two slides.

3.34 Incident Management - Triggers, Inputs, Outputs And Interfaces

Incident Management depends on various inputs received from other processes and service management activities. We will now discuss some examples of the Incident Management inputs. • Information about CIs and their status is used by service desk and support teams for effective impact assessment and speedier resolution. • Information about known errors and their workarounds is utilised for linking incidents to problem records and for providing workarounds to users where incident resolution would take more time or is dependent on resolution by Problem Management. • Communication and feedback about incidents and their symptoms from user groups and support teams helps in initial diagnosis and analysis. • Communication and feedback about RFCs and releases enables Incident Management to relate new type of incidents to the changes and releases implemented. • Communication of events triggered from event management is a key source of incidents. These are sometimes automatically logged by an interface from event management to incident management tool. • Operational and service level objectives provide direction for deciding the service level agreements relating to Incident Management. • Customer feedback on incident resolution activities help in fine tuning and improving Incident Management process. • Agreed criteria for prioritizing and escalating incidents helps service desk and support teams to understand business requirements and have a common understanding of the criteria. Next, let us understand the outputs.

3.35 Incident Management - Triggers, Inputs, Outputs And Interfaces

Some of the important outputs from Incident Management process are: • Resolved incidents and actions taken to achieve the resolution; • Updated incident management records with complete and accurate details; • Updated classification of incidents to be used to support proactive problem management activities; • Problem records logged for incidents where an underlying cause has not been identified; • Validation statements or reports confirming that incidents have not recurred for problems that have been resolved; • Feedback to Change Management and Release and Deployment Management on incidents related to changes and releases implemented; • Identified CIs that are associated with or impacted by incidents; • Feedback from customers on resolutions provided and communications or interactions made; • Feedback on level and quality of monitoring technologies and event management activities having a bearing on Incident Management; and • Communications about incident and resolution details provided for assessing the overall service quality. In the next slide we will understand the interfaces of incident management.

3.36 Incident Management - Triggers, Inputs, Outputs And Interfaces

Incident Management process interfaces and interacts with a number of other service management processes. We will discuss some of the important ones here. • Service Level Management defines the acceptable levels of service within which incident management will operate. The response and resolution timelines, impact definitions and service descriptions are some important information provided by Service Level Management to Incident Management. Similarly, Incident Management provides reports and metrics to Service Level Management. These act as important inputs to service review meetings and service improvement plans. • Incident Management also deals with security related incidents. Information Security Management provides details of criteria for identification, prioritization, classification and escalation of security incidents. Through an analysis of security incidents, Information Security Management can contribute to improvements in service designs. • Capacity related issues will normally lead to service outages and degraded service performance. These result into incidents. Capacity Management is involved in resolving or providing assistance in fixing these types of incidents. Capacity Management might also provide workarounds for capacity related incidents. • Availability Management measurements are derived from data provided by Incident Management. This data is also used to look for improvement opportunities within the extended incident lifecycle. • Service Asset and Configuration Management maintains information about configuration items. When incidents are linked to configuration items, it becomes easier to identify CIs that are frequently failing or causing issues. Configuration Management helps in proper impact assessment of incidents thereby enabling Incident Management to provide appropriate resolutions. • Incident Management interfaces with Change Management where a change is required to implement a workaround or resolution. 
On the other hand, incidents occurring due to changes implemented are handled by Incident Management. • Problem records are logged for recurring incidents. Problem Management tries to identify and eliminate the root cause of these recurring incidents. Problem Management also provides workarounds for incidents where resolution would take longer time. • Unauthorised access as well as access attempts should be logged as incidents relating to security breaches. Incident Management provides valuable inputs along with Event Management for investigation of these access breaches.

3.37 Incident Management - CSFs And KPIs

Customer and user satisfaction is sometimes directly dependant on the efficiency of Incident Management Process. Let us analyse the various critical success factors and related key performance indicators of the Incident Management process. ‘Resolving incidents as quickly as possible and minimizing impacts to the business’ is the most important critical success factor for Incident Management. The related key performance indicators are ‘Mean elapsed time to achieve incident resolution or circumvention, broken down by impact code’; ‘Percentage of incidents closed by the service desk without reference to other levels of support’ and ‘Number of incidents resolved without impact to the business’. ‘Maintain quality of IT services’ is the next critical success factor. The related key performance indicators are ‘the total number of incidents (as a control measure)’; ‘Size of current incident backlog for each IT service’ and ‘Number and percentage of major incidents for each IT service’. We don’t have to emphasize much on customer satisfaction as this is key for organisational growth and success. The critical success factor in this direction is to ‘Maintain user satisfaction with IT services’. The related key performance indicators are ‘Average user/customer survey scores’; and ‘the percentage of satisfaction surveys answered versus total number of satisfaction surveys sent’. The relevance of Incident Management process is indicated by one critical success factor, which is ‘Aligning incident management activities and priorities with those of the business’. The key performance indicators in this direction are ‘the percentage of incidents handled within agreed response time’ and ‘the average cost per incident’.
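Two of the key performance indicators named in this slide lend themselves to a short worked example: the mean elapsed time to resolution broken down by impact code, and the percentage of incidents closed by the service desk without reference to other levels of support. The sketch below uses made-up records; the field names (`impact`, `hours_to_resolve`, `escalated`) are hypothetical and not taken from any specific tool.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample records for illustration only.
incidents = [
    {"impact": "high", "hours_to_resolve": 2.0, "escalated": True},
    {"impact": "high", "hours_to_resolve": 4.0, "escalated": False},
    {"impact": "low", "hours_to_resolve": 10.0, "escalated": False},
    {"impact": "low", "hours_to_resolve": 6.0, "escalated": False},
]


def mean_resolution_by_impact(records):
    """Mean elapsed time to resolution, broken down by impact code."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["impact"]].append(r["hours_to_resolve"])
    return {impact: mean(hours) for impact, hours in grouped.items()}


def first_line_resolution_rate(records):
    """Percentage of incidents closed without reference to other support levels."""
    closed_at_desk = sum(1 for r in records if not r["escalated"])
    return 100.0 * closed_at_desk / len(records)


print(mean_resolution_by_impact(incidents))
print(first_line_resolution_rate(incidents))
```

For the sample data the mean resolution time is 3 hours for high-impact and 8 hours for low-impact incidents, and 75 percent of incidents are closed at the service desk.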

3.38 Incident Management - Challenges And Risks

We shall now discuss the challenges relating to the Incident Management Process. • The ability to detect incidents as early as possible is the main challenge that exists for Incident Management. The success of Incident Management is dependent on the setting up and configuring of event management tools, awareness amongst users to report incidents as soon as they are detected and the existence of a good service desk. • The next challenge likely to be faced by Incident Management is to convince users and staff to log all incidents, and to encourage the use of self-help web-based capabilities. If detected incidents are not all logged and recorded, they may grow into major incidents or point to unaddressed problems and adversely impact the business. • Availability of information about problems and known errors is another challenge that is faced. This information is important to know the status of resolutions and availability of workarounds if any, to enable users to continue their business operations. • Other challenges are integration with the Configuration Management System and Service Level Management Processes. Integration with Configuration Management System is required to determine the relationships between configuration items and refer to their history during analysis and investigation. • Integration with the Service Level Management is essential for determining and assigning the correct priority as well as adherence to response and resolution timelines. This integration also helps in tracking achievements to service level targets and taking appropriate corrective steps wherever required. Let us now look at the risks of incident management in the next slide.

3.39 Incident Management - Challenges And Risks

Risks do exist in every area of service management. It is essential to proactively identify these risks and manage them to minimise the impact on service management and business objectives. The risks associated with Incident Management process are : • ‘The service desk and support teams being inundated with incidents that cannot be handled within acceptable timescales’. The levels of support and sizing of service desk and support teams must be designed and implemented in a pragmatic manner to mitigate this risk. • ‘The unintended backlog of incidents created by inadequate support tools’. Event management and service management tools meeting the service requirements and objectives must be implemented to avoid this risk. • ‘Lack of adequate and/or timely information sources’ is a major risk to achieving the key objective of ‘restoring services as quickly as possible’. Maintaining knowledge bases, configuration management system and known error database are measures to be adopted for this risk. • ‘Mismatches in objectives or actions because of poorly aligned or non-existent Operational Level Agreements and/or Underpinning Contracts’. Ensuring that the operational agreements and underpinning contracts are in place and are aligned with the service level agreements is the remedy for this risk. In the next slide we will discuss about the roles involved in incident management.

3.40 Incident Management - Roles

This slide introduces you to the roles involved in incident management. So let us go through the Incident Management roles in detail. An Incident Manager has the responsibility for: • Driving the efficiency and effectiveness of the Incident Management process • Producing management information • Managing the work of Incident support staff (first- and second-line) • Monitoring the effectiveness of Incident Management and making recommendations for improvement • Developing and maintaining the Incident Management systems • Managing Major Incidents • Developing and maintaining the Incident Management process and procedures. The responsibilities of a first-line analyst include: • Recording incidents • Routing incidents to support specialist groups when needed • Analysing for correct prioritization, classification and providing initial support • Providing ownership, monitoring, tracking and communication of incidents • Providing resolution and recovery of incidents not assigned to support specialist groups • Closing incidents • Monitoring the status and progress towards resolution of assigned incidents • Keeping users and the service desk informed about incident progress • Escalating incidents as necessary per established escalation policies. In many organizations the role of Incident Manager is assigned to the Service Desk Supervisor, though in larger organizations a separate role may be necessary. In either case it is important that the Incident Manager is given the authority to manage incidents effectively through first, second and third line. We will be discussing this role in detail in the Service Desk topic. Many organizations will choose to have a second-line support group, made up of staff with greater technical skills than the Service Desk, and with additional time to devote to Incident diagnosis and resolution without interference from telephone interruptions.
Where a second-line group is used, there are often advantages in locating this group close to the Service Desk to aid good communications and to ease movement of staff between the groups. A second-line support manager (or supervisor, in case of a small group) will normally head this group. Third-line support will be provided by a number of internal technical groups and/or third-party suppliers/maintainers. The list will vary from organization to organization but is likely to include: Network Support, Voice Support (if it is separate), Server Support, Desktop Support. This summarizes the Incident Management process. Now let us move to our next process, which is Problem Management.

3.41 Problem Management

The next process within Service Operation is Problem Management. We shall be covering • The Purpose and Objectives of Problem Management • Scope of Problem Management • Value to Business from Problem Management • Problem Management Policies • Basic Concepts • Problem Management Process Activities • Triggers, Inputs, Outputs and Interfaces • Critical Success Factors and Key Performance Indicators; and finally • Challenges and Risks related to Problem Management. Like any other process, let’s begin with the purpose and objective of problem management.

3.42 Problem Management - Purpose And Objectives

Problem Management process focuses on building stability and enhancing the quality of IT infrastructure, applications and services. The purpose of Problem Management is to manage the lifecycle of all problems from initial identification through further investigation, documentation and eventual removal of these problems. It strives to minimize the adverse impact of incidents and problems on the business that are caused by underlying errors within the IT Infrastructure and to proactively prevent recurrence of incidents related to these errors. Problem Management adopts both proactive and reactive approaches for identifying and eliminating the root cause of infrastructure and application errors and faults. The main objectives of Problem Management process are • To prevent problems and resulting incidents from happening through proactive analysis of incidents, events and other monitoring and reporting information; • To eliminate recurring incidents by identifying the root cause and providing a resolution; and • To minimize the impact of incidents that cannot be prevented by identifying and developing suitable workarounds. Next, let us understand the scope of problem management.

3.43 Problem Management - Scope

The scope of Problem Management includes the activities required to diagnose the root cause of incidents and to determine the resolution to those problems and to ensure that the resolution is implemented through the appropriate control procedures, especially change management and release and deployment management. Problem Management is also responsible for maintaining information about problems and the appropriate workarounds and resolutions developed. These are to be stored in the known error database. It should also be ensured that these are available to service desk and support teams to assist in proper analysis and timely resolution of incidents. This will help in minimizing the impact to business. As stated earlier, Problem Management adopts both proactive and reactive approaches. The scope with respect to reactive problem management is concerned with resolving recurring incidents. In case of proactive problem management, the scope includes conducting periodic scheduled reviews of operational logs, conducting major incident reviews and organizing brainstorming sessions to identify trends. In the next slide we will understand value to business of problem management.

3.44 Problem Management - Value To Business

This slide explains how the Problem Management process adds value to the business. Problem Management works together with Incident Management and Change Management to ensure that IT service availability and quality are increased. When incidents are resolved, information about the resolution is recorded. Over time, this information is used to speed up resolution and to identify permanent solutions, reducing both the number of incidents and the time taken to resolve them. This results in less downtime and less disruption to business critical systems. Additional value is derived from higher availability of IT services, higher productivity of business and IT staff, reduced expenditure on workarounds or fixes that do not work, and a reduction in the cost of effort spent on firefighting or resolving repeat incidents. Let us look at the Problem Management policies in the next slide.

3.45 Problem Management - Policies

Let us now look at some examples of Problem Management policies. • One important policy is that ‘Problems should be tracked separately from incidents’. It is essential to understand that the purpose of Incident Management and Problem Management are entirely different. Incident Management is reactive in nature and problem management is both reactive as well as proactive. • ‘All problems should be stored and managed in a single management system’ is a policy that takes care of efficient management of problem logs, investigations, findings and resolutions. This is essential from knowledge management perspective as well. • A policy that ‘all problems should subscribe to a standard classification schema that is consistent across the business enterprise’ implies that a defined and agreed set of problem classification categories are in place. This helps in appropriate focus and faster involvement of support teams meeting business requirements.

3.46 Problem Management - Basic Concepts

Let us start with understanding the basic concepts of the Problem Management process. ITIL defines a ‘problem’ as the unknown, underlying cause of one or more incidents. When Incident Management is not able to find a solution for an issue, or when an issue keeps recurring, Problem Management comes into focus. The primary objective of Problem Management is to identify the root cause and to reduce or eliminate recurrence by applying a fix. Now let us understand another aspect of this process. A Known Error is a problem with a known root cause and a workaround. Once the root cause is identified, Problem Management has to find a temporary fix or an alternative solution to keep the service running; this is termed a workaround. In practice, a Known Error can also be recorded without a workaround for genuine reasons, for example when the problem was fixed before a workaround was found. Known Errors are managed and stored in the Known Error Database (KEDB), a collection of Known Errors created by Problem Management and referred to by Incident Management. Incident Management can consult the KEDB to resolve, at its own end, issues that Problem Management has already analysed. This reduces the number of problems that are raised or routed to problem teams unnecessarily, that is, problems for which a solution has already been found. Finally, a Problem Model is a predefined template for handling problems of a particular type, kept for future reference.
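To make the role of the KEDB more concrete, here is a minimal sketch of how an incident's symptoms might be matched against Known Errors to retrieve workarounds. The record layout, the symptom-overlap matching rule and the sample entries are all assumptions made for illustration; real KEDB tools typically offer richer search.

```python
from dataclasses import dataclass


@dataclass
class KnownError:
    error_id: str
    symptoms: set      # symptoms associated with the known error
    workaround: str    # temporary fix that keeps the service running


def find_known_errors(kedb, incident_symptoms):
    """Return known errors sharing at least one symptom, best match first."""
    scored = [(len(ke.symptoms & incident_symptoms), ke) for ke in kedb]
    scored = [(n, ke) for n, ke in scored if n > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ke for _, ke in scored]


# Hypothetical KEDB entries.
kedb = [
    KnownError("KE-001", {"mail client crashes", "slow login"},
               "Restart the mail service"),
    KnownError("KE-002", {"slow login"}, "Clear the cached profile"),
    KnownError("KE-003", {"printer offline"}, "Re-add the print queue"),
]

matches = find_known_errors(kedb, {"slow login", "mail client crashes"})
for ke in matches:
    print(ke.error_id, "->", ke.workaround)
```

A Service Desk analyst using a lookup like this during initial diagnosis could apply the workaround directly instead of routing the incident to a problem team.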

3.47 Problem Management - Process Flow

This slide explains the flow of the Problem Management process. It is likely that multiple ways of detecting problems will exist in all organizations. These will include: • Suspicion or detection of a cause of one or more incidents by the Service Desk, resulting in a Problem Record being raised • Analysis of an Incident by a technical support group which reveals that an underlying problem exists, or is likely to exist • Automated detection of an infrastructure or application fault, using event/alert tools to automatically raise an Incident which may reveal the need for a Problem Record • Notification from a vendor that a problem exists • Analysis of incidents as part of proactive Problem Management, resulting in the need to raise a Problem Record so that the underlying fault can be investigated further. All the relevant details of the problem must be logged so that a full historic record exists. The record must be date and time stamped to allow suitable control and escalation. Typically this will include details such as: User details, Service details, Equipment details, Date/time initially logged, Priority and categorization details, Incident description, Details of all diagnostic or attempted recovery actions taken. Problems must be categorized in the same way as incidents and it is advisable to use the same coding system so that the true nature of the problem can be easily traced in the future and meaningful management information can be obtained. Problems must be prioritized in the same way and for the same reasons as incidents, but the frequency and impact of related incidents must also be taken into account. Problem prioritization should also take into account the severity of the problems. Severity can be identified with questions such as: How much will it cost? How many people, with what skills, will be needed to fix the problem? How long will it take to fix the problem?
An investigation should be conducted to try to diagnose the root cause of the problem. The speed and nature of this investigation will vary depending upon the impact, severity and urgency of the problem, but the appropriate level of resources and expertise should be applied to finding a resolution. Many techniques for problem analysis, diagnosis and solving are available, and much research has been done in this area. Some of the most useful and frequently used techniques include: Chronological analysis: When dealing with a difficult problem, there are often conflicting reports about exactly what has happened and when. It is therefore very helpful to briefly document all events in chronological order, to provide a timeline of events. This often makes it possible to see which events may have been triggered by others, or to discount any claims that are not supported by the sequence of events. Pain Value Analysis: This technique takes a broader view of the impact of an incident or problem, or of an incident/problem type. In-depth analysis is done to determine exactly what level of pain has been caused to the organization or business by these incidents/problems. A formula can be devised to calculate this pain level. Typically this might take into account the number of people affected, the duration of the downtime caused, and the cost to the business (if this can be readily calculated or estimated). By taking all of these factors into account, a much more detailed picture can be built of the incidents/problems, or incident/problem types, that are causing the most pain. Kepner and Tregoe: Charles Kepner and Benjamin Tregoe developed a useful method of problem analysis, which can be used formally to investigate deeper-rooted problems.
They defined the following stages: defining the problem; describing the problem in terms of identity, location, time and size; establishing possible causes; testing the most probable cause; and verifying the true cause. Brainstorming: It can often be valuable to gather together the relevant people, either physically or by electronic means, and to ‘brainstorm’ the problem, with people throwing in ideas on what the potential cause may be and on potential actions to resolve the problem. Brainstorming sessions can be very constructive and innovative, but it is equally important that someone, perhaps the Problem Manager, documents the outcome and any agreed actions and keeps a degree of control in the session(s). Ishikawa Diagrams: Kaoru Ishikawa, a leader in Japanese quality control, developed a method of documenting causes and effects which can be useful in helping to identify where something may be going wrong, or could be improved. Such a diagram is typically the outcome of a brainstorming session where problem solvers can offer suggestions. The main goal is represented by the trunk of the diagram, and primary factors are represented as branches. Secondary factors are then added as stems, and so on. Creating the diagram stimulates discussion and often leads to increased understanding of a complex problem. Pareto Analysis: This is a technique for separating important potential causes from more trivial issues. The following steps should be followed: Form a table listing the causes and their frequency as a percentage. Arrange the rows in decreasing order of importance of the causes, i.e. the most important cause first. Add a cumulative percentage column to the table. Create a bar chart with the causes, in order of their percentage of the total. Superimpose a line chart of the cumulative percentages. Draw a line at 80% on the y-axis, parallel to the x-axis. Then, from the point where this line intersects the cumulative curve, drop a line down to the x-axis.
This point on the x-axis separates the important causes from the trivial ones. Now that we are aware of the process flow and the various techniques of the Problem Management process, let us move on to its triggers, inputs, outputs and interfaces.
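
The Pain Value and Pareto analyses described above are essentially small calculations, so a worked sketch may help. The Python code below is illustrative only. The pain formula (people affected multiplied by downtime hours and by a cost per person-hour) is just one formula an organization might devise, not one prescribed by ITIL, and all function names and sample figures are hypothetical.

```python
def pain_value(people_affected: int, downtime_hours: float,
               cost_per_person_hour: float = 1.0) -> float:
    """One possible pain formula (an assumption, not prescribed by ITIL)."""
    return people_affected * downtime_hours * cost_per_person_hour


def pareto_table(cause_counts: dict[str, float]) -> list[tuple[str, float, float]]:
    """Return (cause, percent, cumulative percent) rows, largest cause first."""
    total = sum(cause_counts.values())
    if total == 0:
        return []
    rows = []
    cumulative = 0.0
    ordered = sorted(cause_counts.items(), key=lambda kv: kv[1], reverse=True)
    for cause, count in ordered:
        percent = 100.0 * count / total
        cumulative += percent
        rows.append((cause, percent, cumulative))
    return rows


def important_causes(cause_counts: dict[str, float],
                     threshold: float = 80.0) -> list[str]:
    """Causes up to the point where the cumulative curve reaches the threshold."""
    selected = []
    for cause, _percent, cumulative in pareto_table(cause_counts):
        selected.append(cause)
        if cumulative >= threshold:
            break
    return selected
```

For example, with hypothetical incident counts of 50 for "network", 30 for "disk", 15 for "config" and 5 for "other", the cumulative percentages are 50, 80, 95 and 100, so `important_causes` returns "network" and "disk" as the causes to the left of the 80% line. The counts could equally be pain values rather than incident counts, which combines the two techniques.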

3.48 Problem Management - Triggers, Inputs, Outputs And Interfaces

We will now discuss the triggers for Problem Management process and it would be appropriate to discuss the reactive and proactive problem management triggers separately. The triggers for reactive Problem Management are: • Problems logged by service desk or support teams with respect to recurring incidents or major incidents; • Problems and known error records logged by release and deployment teams with respect to testing defects identified and a decision taken to go-live with these known errors; and • Supplier notifications regarding potential faults or known deficiencies in their products or services. With respect to proactive Problem Management, the triggers are: • Patterns and trends in incidents identified as part of reviewing historical incident records; and • Review of other sources such as operation logs, operation communications or event logs. Let us now proceed to discuss the inputs of problem management in the next slide.

3.49 Problem Management - Triggers, Inputs, Outputs And Interfaces

Every process takes certain inputs and delivers specific outputs. Some examples of inputs relating to Problem Management process are: • Incident records for incidents that have triggered problem management activities; • Incident reports and logs along with history; • Information about CIs and their status; • Communication and feedback about incidents and their symptoms; • Communication and feedback about RFCs and releases that have been implemented or planned for implementation; • Communication of events that were triggered from event management; • Operational and service level objectives; • Customer feedback on problem resolution activities; • Agreed criteria for prioritizing and escalating problems; and • Output from risk management and risk assessment activities. Moving on to the next slide we will look at the problem management outputs.

3.50 Problem Management - Triggers, Inputs, Outputs And Interfaces

Now let us look at some examples of problem management outputs: • Resolved problems and actions taken to achieve their resolution; • Updated problem management records with accurate problem details and history; • Requests for changes to remove infrastructure errors; • Workarounds for incidents; • Known error records; • Problem management reports; and • Output and improvement recommendations from major problem review activity. Let us now understand the interfaces of problem management.

3.51 Problem Management - Triggers, Inputs, Outputs And Interfaces

Problem Management interfaces with a number of other service management processes. We will now try to explain these interfaces in detail. • Problem management has a very close relationship with Incident Management. It is basically the recurring incidents and some major incidents that result in a need for triggering problem management. The resolutions and workarounds developed by problem management are implemented to eliminate recurrence of such incidents. Also, historical incident records and logs are used by problem management to proactively identify infrastructure errors and faults. • Problem management provides Financial Management information on cost of investigation, diagnosis, and resolution of problems. This information is used as inputs to budgeting and accounting systems. On the other hand, Financial Management assists Problem Management in assessing the impact of proposed resolutions and workarounds. • Service Level Management provides certain parameters within which Problem Management operates. Incidents and problems impact service levels and Problem Management helps in improvement of service levels by eliminating recurring incidents, errors and faults relating to infrastructure and applications. • A significant problem that could not be resolved before having a major impact on business sometimes leads to invocation of IT Service Continuity and recovery plans. • Most of the performance related problems are related to capacity and might require involvement of Capacity Management teams. Problem management teams might also use capacity management techniques for diagnosing and resolving problems. Problem Management also provides useful information for capacity planning and decision making. • The proactive activities of problem management sometimes lead to improving availability of services. Also, Problem Management provides information and inputs to various aspects of Availability Management. 
• Service Asset and Configuration Management provides information on configuration items and their relationships. This information is used by Problem Management to identify faulty configuration items and to determine the impact of problems and resolutions. • Resolutions and workarounds identified by Problem Management are implemented through the Change Management process. Change Management keeps Problem Management informed about the progress of these changes. • Release and Deployment Management is responsible for deploying problem fixes into the live environment. Known errors related to a release or deployment are included in the known error database if they are not fixed before go-live. • Knowledge Management owns the Service Knowledge Management System, and the known error database is an important component of this system. Also, the problem records are maintained in the SKMS or linked to this system. • Recurring incidents and identified problems are a key source for identifying service improvements. Proactive problem management activities are tightly integrated with the seven-step improvement process and help in improving the performance and quality of services provided.

3.52 Problem Management - CSFs And KPIs

We shall now proceed to look at the critical success factors and key performance indicators relating to the Problem Management process. These are only representative CSFs and KPIs; each organization will have to develop its own based on its objectives and requirements. One generally accepted critical success factor for Problem Management is ‘Minimizing the impact to the business of incidents that cannot be prevented’. The related key performance indicators are ‘The number of known errors added to the KEDB’ and ‘The percentage accuracy of the KEDB’. Another important critical success factor is ‘Maintaining quality of IT services through elimination of recurring incidents’. Relevant to this CSF, the KPIs are ‘The total number of problems identified and recorded’, ‘The size of current problem backlog for each IT service’ and ‘The number of repeat incidents for each IT service’. We can also consider ‘Providing overall quality and professionalism of problem handling activities to maintain business confidence in IT capabilities’ as another important critical success factor. A number of key performance indicators can be set for this CSF. Some examples of relevant KPIs are ‘The number of major problems identified and logged’, ‘The percentage of major problem reviews successfully performed’, ‘The number and percentage of problems incorrectly assigned’, ‘The number and percentage of problems incorrectly categorized’, ‘The percentage of problems resolved within SLA targets’ and ‘The average cost per problem handled’. Let us now proceed to understand the challenges and risks of problem management.

3.53 Problem Management - Challenges And Risks

Challenges of different kinds exist in every organization. It is essential to be aware of these challenges, educate the teams and take measures to tackle them in the most appropriate way. The ITIL Service Operation publication lists some common challenges relating to the Problem Management process, which we will discuss now. • ‘Establishment of an effective incident management process and relevant tools’ is a key challenge faced by service providers. Either these are not implemented, or they are not very effective in their current form. Problem management’s success depends on the availability of information on recurring incidents. • ‘The skills and capabilities for problem resolution staff to identify the true root cause of incidents’ may sometimes be a major challenge. The staff should be adequately trained and equipped to use the tools and techniques of root cause identification and analysis to meet the objectives of problem management. • ‘The ability to relate incidents to problems’ and ‘the ability to integrate problem management activities with the Configuration Management System’ are other challenges that may be experienced by problem management teams. • ‘The ability to use knowledge and service asset and configuration management resources for investigation and resolution of problems’ could be a specific challenge where staff are not aware of the information and contents within these two important service management information systems. • Another challenge could be ‘the ability to have a good working relationship between the second- and third-line staff working on problem support activities and first-line staff’. The importance of coordination amongst the different levels of support staff should not be underestimated in service-oriented and customer-focused organizations. 
• ‘Making sure that business impact is well understood by all staff working on problem resolution’ though a basic requirement, might sometimes be a challenge during initial stages of service management implementation. It is essential to make staff understand the business impact of the problem as well as the resolutions identified.

3.54 Problem Management - Challenges And Risks

The success of service management to some extent also depends on the ability to identify risks and build a suitable response and mitigation plan for those risks. The problem management process related risks are: • Support staff being inundated with problems that cannot be handled within acceptable timescales; • Problems being bogged down and not progressed as intended because of inadequate support tools for investigation; • Lack of adequate and/or timely information sources such as incident logs, and configuration management and knowledge management systems; • Problem support staff that may not be properly trained on tools and techniques to investigate problems; and • Mismatches in objectives or actions because of poorly aligned or non-existent Operational Level Agreements and/or Underpinning Contracts. Next, we will look at the roles involved in problem management.

3.55 Problem Management - Roles

The first role is that of the Problem Manager, who should be a designated person or, in larger organizations, a team responsible for Problem Management. Smaller organizations may not be able to justify a full-time resource for this role, and it can be combined with other roles in such cases, but it is essential that it is not just left to technical resources to perform. There needs to be a single point of coordination and an owner of the Problem Management process. This role will coordinate all Problem Management activities and will have specific responsibility for: liaison with all problem resolution groups to ensure swift resolution of problems within SLA targets; ownership and protection of the KEDB, acting as gatekeeper for the inclusion of all Known Errors and for the management of search algorithms; formal closure of all Problem Records; ensuring that third parties fulfill their contractual obligations, especially with regard to resolving problems and providing problem-related information and data; and driving Major Problem Reviews. The second role is that of the problem-solving group. The actual solving of problems is likely to be undertaken by one or more technical support groups and/or suppliers or support contractors, under the coordination of the Problem Manager. Where a particular problem is serious enough to warrant it, a dedicated Problem Management team should be formed to work together in overcoming that problem. The Problem Manager has a role to play in making sure that adequate resources are available in the team, and in escalation and communication up the management chain of all organizations concerned. 
A problem analyst’s responsibilities include: reviewing incident data to analyse assigned problems; analysing problems for correct prioritization and classification; investigating assigned problems through to resolution or root cause; coordinating the actions of others as necessary to assist with analysis and resolution actions for problems and known errors; raising RFCs to resolve problems; monitoring progress on the resolution of known errors and advising incident management staff on the best available workarounds for incidents; updating the KEDB with new or updated known errors and workarounds; and assisting with the handling of major incidents and identifying their root causes. This summarizes the Problem Management process. Let us move to our next topic, Request Fulfillment.

3.56 Request Fulfillment

We will now discuss the Request Fulfilment process within Service Operation. The topics that will be covered are: • The purpose and objectives of Request Fulfilment • Scope of the Request Fulfilment process • Value to business of adopting the Request Fulfilment process • Request Fulfilment policies • Basic concepts • Request Fulfilment process activities • Triggers, inputs, outputs and interfaces • Critical Success Factors and Key Performance Indicators relating to Request Fulfilment; and • Challenges and risks. Let’s begin with the purpose and objectives of Request Fulfilment.

3.57 Request Fulfillment - Purpose And Objectives

We need to understand that IT service users normally have two types of needs. One is to get resolutions and fixes for issues faced while using the services; the other is the fulfilment of requests for information, access, standard components or ad hoc reports. These two categories are taken care of in Service Operation by two important processes, namely Incident Management and Request Fulfilment. We have already discussed the Incident Management process; we shall now discuss the Request Fulfilment process in detail. The purpose of the Request Fulfilment process is to manage the lifecycle of all service requests from users. The activities required to fulfil a request will vary depending upon what is being requested. The lifecycle spans from the initial request raised by the user to completion of the request using the appropriate request model. The objectives of the Request Fulfilment process are: To maintain user and customer satisfaction through efficient and professional handling of all service requests; To provide a channel for users to request and receive standard services for which a predefined authorization and qualification process exists; To provide information to users and customers about the availability of services and the procedure for obtaining them; To source and deliver the components of requested standard services; and To assist with general information, complaints or comments.

3.58 Request Fulfillment - Scope

Let us now understand the scope of the Request Fulfillment process. The scope covers: • Defining and documenting the types of requests that will be handled through the request fulfillment process and the ones that will be handled through other processes. • Determining whether requests should follow the Incident Management route or be handled through a separate request fulfillment process. • The activities required to fulfill requests will vary based on what is being requested, whether it is a request for information, access, ad hoc reports or procurement of standard components. Hence the scope includes ‘documenting service request fulfillment activities into request models and maintaining them in the Service Knowledge Management System’. The next slide talks about the value of request fulfillment to the business.

3.59 Request Fulfillment - Value To Business

So far we have learnt about the request fulfillment concept, its purpose and its scope. This slide explains the value to business of the Request Fulfillment process. The value of Request Fulfillment is that it provides quick and effective access to standard services, which business staff can use to improve their productivity or the quality of business services and products. Request fulfillment effectively reduces the bureaucracy involved in requesting and receiving access to existing or new services, thus also reducing the cost of providing these services. Centralizing fulfillment also increases the level of control over these services. This in turn can help reduce costs through centralized negotiation with suppliers, and can also help to reduce the cost of support. Moving on, let us look at the policies of Request Fulfillment.

3.60 Request Fulfillment - Policies

Policies help in building control and efficiency in service management. Some examples of Request Fulfillment policies are: • ‘The activities used to fulfil a request should follow a predefined process flow or model devised to include the stages needed to fulfil the request, the support groups involved, target timescales and escalation paths’. This policy ensures fulfilment of requests in a consistent and efficient manner. • ‘The ownership of service requests should reside with a centralized function such as the service desk, to monitor, escalate, despatch and fulfil the user requests’. This policy’s aim is to provide a single point of contact for all service request related information. • ‘Service requests that impact CIs should usually be satisfied by implementing a standard change’. This policy ensures change management control on all types of changes to CIs. • ‘All requests should be logged, controlled, coordinated, promoted and managed throughout their lifecycle via a single system’. This is an important policy which helps in a holistic approach to managing service requests. • ‘All requests should be authorized before their fulfilment activities are undertaken’. This policy is essential for adhering to access management and information security management requirements. • ‘Fulfilment of requests should take place under an agreed set of criteria for determining their priority that is aligned with overall service levels and objectives’. This policy ensures alignment to service level agreements and objectives. • ‘Clear communication for making requests and determining their status must be in place’. This policy ensures that users are aware of the procedure for raising requests and that there is a single point of contact for information on requests raised. Let us now proceed to understand the basic concepts of request fulfilment.

3.61 Request Fulfillment - Basic Concepts

We shall now try to understand the definitions of two important concepts within the Request Fulfillment process. The first one: What is a service request? A Service Request is a request from a user for information, advice, a Standard Change or access to an IT service. Some examples of service requests are ‘a request to reset a user password, which is defined as a standard change in the Change Model’; ‘a request for running an ad hoc report’; or ‘a request to permit access to a network folder’. The next one: What is a Request Model? A Request Model is a set of predefined steps required to handle a specific type of service request. A Request Model will include the individuals or support groups involved, target timescales and escalation paths. For example, the steps involved in providing access to a network folder, along with the details of the role that can provide the authorization and the timescales within which such access requests are to be fulfilled, constitute a request model. Moving on, let us understand the process activities in the next slide.
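
A Request Model, as defined above, is essentially structured data, so it can be pictured as a small record. The following Python sketch is an illustration under stated assumptions: the class, the field names and the sample values for the network folder example (steps, groups, the eight-hour target, the escalation path) are all hypothetical and chosen only to mirror the elements named in the text.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RequestModel:
    """Predefined handling for one type of service request."""
    request_type: str
    steps: tuple[str, ...]            # ordered, predefined fulfilment steps
    support_groups: tuple[str, ...]   # individuals or support groups involved
    authorizer_role: str              # role that can authorize the request
    target_timescale: timedelta       # time within which to fulfil the request
    escalation_path: tuple[str, ...]  # who to escalate to, in order


# Hypothetical model for the network folder example; all values are made up.
FOLDER_ACCESS = RequestModel(
    request_type="network_folder_access",
    steps=("verify requester", "obtain authorization", "grant access", "confirm with user"),
    support_groups=("service desk", "file server team"),
    authorizer_role="folder owner",
    target_timescale=timedelta(hours=8),
    escalation_path=("service desk lead", "request fulfilment manager"),
)

MODELS = {model.request_type: model for model in (FOLDER_ACCESS,)}


def select_model(request_type: str) -> RequestModel:
    """Pick the request model for a request type, or fail loudly if none exists."""
    try:
        return MODELS[request_type]
    except KeyError:
        raise LookupError(f"no request model defined for {request_type!r}") from None
```

Failing loudly for an unknown request type reflects the idea that requests without a documented model need a decision on how they should be handled.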

3.62 Request Fulfillment - Process Activities

We shall now detail the Request Fulfillment Process Activities. • The first activity is receiving a service request. The Service Desk is generally the first and single point of contact for raising a request. Requests can come by way of a user call, email, RFC or automated web ordering interface. • The next step is to log the request and validate it. The service request must be logged in the relevant service management tool with all information. It must be time-stamped and a unique reference number should be assigned. • Generally while logging the request, a request category is also assigned to the service request. The typical categories are by service, by activity, by type, by function or by CI type. • Assigning a priority to the service request is also normally performed by the service desk while logging the service request. Priority is determined by ascertaining the urgency and impact of the request. The priority definitions stated in service level agreements should be considered while assigning the priority. • The next activity is to request authorization. Based on request type and model, a service request may be pre-authorized, authorized by the service desk or authorized by an appropriate authority. Any service request that is not authorized, or is rejected, should be returned to the user with the reason for the rejection, and the request record should be updated accordingly. • Request review is an activity performed to determine the support group that will fulfill the request and whether any further information is required from the user. It may also be determined whether any escalation is required. • When the support team or function receives the request for execution, it will choose the appropriate request model based on the type of request being fulfilled. The steps as per the request model are performed to fulfill the request. • The final step is to close the request. The service desk is notified about the fulfillment of the request by the support group or function. 
The service desk checks with the user if the request has been fulfilled and if the request can be closed. The service desk will also check and update the financial requirements, closure categorization and request documentation before formally closing the request in the tool.
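
The sequence of activities above can be sketched as a small state machine. The Python code below is a hedged illustration rather than a definitive workflow: the status names, the allowed transitions and the `ServiceRequest` class are hypothetical, and the way priority is derived from impact and urgency (a simple three-level scale where 1 is the highest priority) is an assumption, since real priority definitions come from the service level agreements.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    RECEIVED = "received"
    LOGGED = "logged"
    CATEGORIZED = "categorized"
    PRIORITIZED = "prioritized"
    AUTHORIZED = "authorized"
    REJECTED = "rejected"
    REVIEWED = "reviewed"
    FULFILLED = "fulfilled"
    CLOSED = "closed"


# Allowed transitions, following the order of activities in the text.
TRANSITIONS = {
    Status.RECEIVED: {Status.LOGGED},
    Status.LOGGED: {Status.CATEGORIZED},
    Status.CATEGORIZED: {Status.PRIORITIZED},
    Status.PRIORITIZED: {Status.AUTHORIZED, Status.REJECTED},
    Status.AUTHORIZED: {Status.REVIEWED},
    Status.REVIEWED: {Status.FULFILLED},
    Status.FULFILLED: {Status.CLOSED},
    Status.REJECTED: set(),
    Status.CLOSED: set(),
}

LEVELS = ("high", "medium", "low")


def priority(impact: str, urgency: str) -> int:
    """Assumed scale: high/high gives 1 (highest), low/low gives 5 (lowest)."""
    return LEVELS.index(impact) + LEVELS.index(urgency) + 1


class ServiceRequest:
    def __init__(self, reference: str) -> None:
        self.reference = reference        # unique reference number
        self.status = Status.RECEIVED
        self.history = [Status.RECEIVED]
        self.rejection_reason: Optional[str] = None

    def advance(self, new_status: Status, reason: Optional[str] = None) -> None:
        """Move to the next status, refusing steps that skip an activity."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(f"cannot go from {self.status.value} to {new_status.value}")
        if new_status is Status.REJECTED:
            if not reason:
                raise ValueError("a rejected request must record the reason")
            self.rejection_reason = reason
        self.status = new_status
        self.history.append(new_status)
```

Requiring a reason on rejection mirrors the point that a rejected request is returned to the user with the reason and that the request record is updated accordingly.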

3.63 Request Fulfillment - Triggers, Inputs, Outputs And Interfaces

We shall now look at the triggers, inputs, outputs and interfaces related to the request fulfillment process. The trigger to request fulfillment is basically a request made by the user by making a call, sending an email, or logging the request through a web-based application. The inputs to request fulfillment process include: • Work requests made by users and managers; • Authorizations issued by designated authorities; • Service requests logged directly by business and support users; • Requests For Change; • Requests from various sources such as phone calls, web interfaces or emails; and • Request for information or updates on service requests raised by users.

3.64 Request Fulfillment - Triggers, Inputs, Outputs And Interfaces

The outputs from Request Fulfillment process include: • Authorized/rejected service requests; • Request fulfilment status reports generated and issued by service desk or support groups; • Fulfilled service requests; • Incidents wrongly categorized as service requests and rerouted to Incident Management; • Requests For Changes or standard changes raised as part of the fulfillment process; • Updates to service asset or Configuration Items; • Updates to request records logged in service management tool; • Closed service requests; and • Cancelled service requests.

3.65 Request Fulfillment - Triggers, Inputs, Outputs And Interfaces

The request fulfillment process interfaces with many other service management processes. Some of the important ones will be discussed now. • We shall start with the interface with Financial Management for IT Services. Where request fulfilment is related to procurement of standard components and spares, or where the cost of fulfilling the request needs to be charged to the concerned department or business unit, the costs will be reported by request fulfilment to Financial Management. • There exists a tight relationship between Request Fulfilment and Service Catalogue Management. The Service Catalogue, which details all the currently available services, normally includes all types of service requests that can be ordered or utilised by the users and customers. • Service Asset and Configuration Management is responsible for establishing and managing the Configuration Management System. Hence, it is responsible for reflecting all changes to assets and configuration items due to fulfilment of service requests. • Standard changes are normally categorised under service requests. Request Fulfilment should ensure that all identified standard changes are pre-approved by Change Management before they can be processed as service requests. Also, where a change is required to process a service request, a request for change should be logged and processed through Change Management. • Some requests may relate to deployment of new or changed components. Once built and tested, these components need to be handed over to Release and Deployment Management for implementing in live environments. • Service requests may be raised for completing certain tasks in the process of resolving Incidents and Problems. It is important to link these service requests to the related incidents or problems. • Normally, request for access is a category within service requests. 
Request Fulfilment also needs to ensure that only eligible users are logging service requests and authorizations are granted by designated people in the organization.

3.66 Request Fulfillment - CSFs And KPIs

We shall now discuss the critical success factors and the related key performance indicators pertaining to the Request Fulfilment process. One critical success factor that stands foremost is that ‘requests must be fulfilled in an efficient and timely manner that is aligned to agreed service level targets for each type of request’. The relevant key performance indicators are: ‘The mean elapsed time for handling each type of service request’; ‘The number and percentage of service requests completed within agreed target times’; ‘The percentage of service requests closed by the service desk without reference to other levels of support’; ‘The total number of requests logged’; and ‘The average cost per type of service request’. Another critical success factor is to ensure that ‘only authorized requests should be fulfilled’. The key performance indicators related to this factor are: ‘The percentage of service requests fulfilled that were appropriately authorized’ and ‘The number of incidents related to security threats from request fulfilment activities’. The most important critical success factor is ensuring that ‘user satisfaction is maintained’. The key performance indicators are: ‘Level of user satisfaction with the handling of service requests’; ‘The total number of incidents related to request fulfilment activities’; and ‘The size of current backlog of outstanding service requests’. Let’s now discuss the challenges and risks of the request fulfilment process.
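
Several of the KPIs above are simple calculations over closed request records. The Python sketch below shows how three of them (total requests logged, mean elapsed time per request type, and number and percentage completed within target) might be computed. It is illustrative only: the input format of (request type, elapsed hours, target hours) tuples and the function name are assumptions, and a real service management tool would supply its own reporting.

```python
from collections import defaultdict
from statistics import mean


def request_kpis(requests):
    """Compute sample KPIs.

    `requests` is an iterable of (request_type, elapsed_hours, target_hours)
    tuples for closed service requests (a hypothetical input format).
    """
    elapsed_by_type = defaultdict(list)
    within_target = 0
    total = 0
    for request_type, elapsed_hours, target_hours in requests:
        elapsed_by_type[request_type].append(elapsed_hours)
        total += 1
        if elapsed_hours <= target_hours:
            within_target += 1
    return {
        "total_requests": total,
        "mean_elapsed_hours_by_type": {
            request_type: mean(values)
            for request_type, values in elapsed_by_type.items()
        },
        "completed_within_target": within_target,
        # Guard against division by zero when there are no requests.
        "percent_within_target": 100.0 * within_target / total if total else 0.0,
    }
```

Reporting the mean per request type, rather than one overall mean, matches the KPI wording and avoids quick requests such as password resets hiding slow fulfilment of other request types.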

3.67 Request Fulfillment - Challenges And Risks

Some challenges do exist for the successful implementation and management of the Request Fulfilment process. A few examples of these challenges are discussed now. • ‘Clearly defining and documenting the type of requests that will be handled within the request fulfilment process’ is the basic challenge. What needs to fall under this process should be decided after a careful examination of various IT services, organizational security requirements and the size and structure of service desk and support teams. • ‘Establishing self-help front-end capabilities that allow the users to interface successfully with the request fulfilment process’ is another challenge. Implementing a self-help tool or application and educating the users on its use is essential to make the process more effective. • Users and IT support teams must be aware of the timelines and acceptable limits within which the service requests should be fulfilled to avoid any adverse impact to business. Thus, ‘agreeing service level targets for each type of request’ becomes a challenge here. • ‘Agreement on the cost of fulfilling requests’ is another key challenge where charging for IT services is in place. • ‘Establishing agreements with respect to which services will be standardized and who is authorized to request them’ requires an elaborate discussion with business teams. Documenting and maintaining them is again a challenge. • ‘Accessibility of information on requests available through request fulfilment process’ is a challenge because the prerequisite for this is a well designed and established Service Catalogue. • As different types of requests exist, it is essential to define steps and activities required for fulfilling each type of service request. The challenge here is ‘ensuring availability of a documented request model’. • The ultimate challenge is ‘maintaining focus on user satisfaction’. 
Requests that are poorly managed, ignored or not fulfilled in a timely manner will lead to user dissatisfaction and adversely impact the image of the IT organization. Defining applicable service levels and adhering to them will, to some extent, enhance user satisfaction.

3.68 Request Fulfillment - Challenges And Risks

Some risks related to the Request Fulfilment process are: • Poorly defined scope, leading to ambiguity about what is included in and what is excluded from the definition of ‘Service Requests’. • Poorly designed or implemented user interfaces, which might lead to difficulties in logging service requests, selecting the right category or knowing the status of the requests raised. • Badly designed or operated back-end fulfilment processes, which become a risk if proper request models are not implemented or the support teams are incapable of dealing with the volume or nature of the requests raised. • Inadequate monitoring capabilities, which will impact process efficiency because the required measurements and metrics will not be available. In the next slide we will look at the roles of request fulfilment.

3.69 Request Fulfillment - Roles

In the last slide we learnt about the risks of the Request Fulfillment process. This slide explains the roles involved in the Request Fulfillment process. Initial handling of Service Requests will be undertaken by the Service Desk and Incident Management staff. The roles of request fulfillment include the Request fulfilment process owner, the Request fulfilment process manager and the Request fulfilment analyst. The Request fulfilment process owner carries out the generic process owner role for the request fulfillment process. In addition, the process owner designs request fulfillment models and workflows, and works with other process owners to ensure there is an integrated approach to the design and implementation of request fulfillment, incident management, event management, access management and problem management. The Request fulfilment process manager carries out the generic process manager role for the request fulfillment process. The manager's responsibilities also include planning and managing support for request fulfillment tools and processes; coordinating interfaces between request fulfillment and other service management processes; handling staff, customer and management concerns, requests and enquiries; ensuring request fulfillment activities operate in line with service level targets; reviewing and analyzing all reports; overseeing actions to obtain feedback from customers on the quality of request fulfillment activities; and assisting with activities to identify the staffing resource levels needed to handle demand for request fulfillment activities and services. The process manager also ensures that all authorized service requests are fulfilled on a timely basis, represents request fulfillment activities at CAB meetings, and reviews the initial prioritization and authorization of service requests to determine accuracy and consistency.
The Request fulfilment analyst provides a single point of contact and end-to-end responsibility to ensure that submitted service requests have been processed. The analyst's responsibilities include providing initial triage of service requests to determine which IT resources should be engaged to fulfill them, communicating service requests to other IT resources that will be involved in fulfilling them, escalating service requests in line with established service level targets and ensuring service requests are appropriately logged. Eventual fulfillment of the request will be undertaken by the appropriate Service Operation teams or departments and/or by external suppliers, as appropriate. Often, Facilities Management, Procurement and other business areas aid in the fulfillment of the Service Request. In most cases there will be no need for additional roles or posts to be created. In exceptional cases where a very high number of Service Requests are handled, or where the requests are of critical importance to the organization, it may be appropriate to have one or more members of the Incident Management team dedicated to handling and managing Service Requests. This summarizes the Request Fulfillment process. The next process is Access Management. Let us learn about this process in the coming slides.

3.70 Access Management

Access Management is the process of granting authorized users the right to use a service, while preventing access to non-authorized users. In the next few slides we shall be discussing: • The Purpose and Objectives of Access Management; • The Scope of Access Management; • The Value to Business from the Access Management process; • Access Management Policies; • Basic Concepts; • Access Management Process Activities; • Triggers, Inputs, Outputs and Interfaces; • Critical Success Factors and Key Performance Indicators; and • Challenges and Risks relating to Access Management. Let’s begin with the purpose and objectives in the next slide.

3.71 Access Management - Purpose And Objectives

Let us start with the purpose and objectives of Access Management. The key purpose of Access Management is to provide the right for users to be able to use a service or group of services. It is also concerned with executing policies and actions defined in information security management. The objectives of this process are: • To manage access to services based on policies and actions defined in Information Security Management; • To efficiently respond to requests for granting access to services, changing access rights or restricting access, ensuring that the rights being provided or changed are properly granted; and • To oversee access to services and ensure rights being provided are not improperly used. In the next slide we will understand the scope of access management.

3.72 Access Management - Scope

In this slide let us learn about the scope. Access Management is effectively the execution of both Availability and Information Security Management, in that it enables the organization to manage the confidentiality, availability and integrity of the organization’s data and intellectual property. Access Management ensures that users are given the right to use a service, but it does not ensure that this access is available at all agreed times; that is provided by Availability Management. Access Management is executed by all Technical and Application Management functions and is usually not a separate function. However, there is likely to be a single control point of coordination, usually in IT Operations Management or on the Service Desk. In practice, Access Management can be initiated by a Service Request through the Service Desk. Moving on, let us look at the value of access management to the business.

3.73 Access Management - Value To Business

In our last slide we discussed the scope of Access Management. This slide explains Access Management’s value to the business. Access Management provides value to the business in many ways. Controlled access to services ensures that the organization is able to maintain the confidentiality of its information more effectively. Employees have the right level of access to execute their jobs effectively, and there is less likelihood of errors being made in data entry or in the use of a critical service by an unskilled user (for example, production control systems). Access Management provides the ability to audit the use of services and to trace the abuse of services. It also provides the ability to revoke access rights more easily when needed, which is an important security consideration. Another major value addition is that Access Management may be needed for regulatory compliance, for example with SOX, HIPAA or COBIT.

3.74 Access Management - Policies

Access Management is basically concerned with the execution of the organizational information security policy. Hence, it is essential that a well-defined set of policies is in place for efficient management of the Access Management process. We will now discuss some examples of these policies. • ‘Access management administration and associated activities should be guided and directed by the policies and controls as defined in the information security policy’. This ensures alignment with the organizational information security objectives and requirements. • ‘Access management should log and track accesses to use of services and ensure rights being provided are appropriately used’. This policy mandates putting in place proper controls and auditing mechanisms for tracking unauthorised access to or use of services. • A policy that ‘Access management should maintain access to services in alignment with changes in personnel events such as transfers and terminations’ ensures that communications are in place with human resource functions to notify IT about personnel events and changes on a timely basis. • ‘Access management should maintain an accurate history of who has accessed, or tried to access, services’. This policy helps in ensuring that the information required for auditing, compliance checks and incident or problem investigations is available. • An important Access Management policy is that ‘Procedures for handling, escalating and communicating security events should be clearly defined and documented in accordance with the information security policy’. This ensures that support teams are aware of the information security policy and procedures for handling security events. In the next slide we will understand the basic concepts of access management.

3.75 Access Management - Basic Concepts

Let us start with understanding the basic concept of the Access Management process. Access Management is the process of granting authorized users the right to use a service, while preventing access to non-authorized users. It has also been referred to as Rights Management or Identity Management in different organizations. In the next slide let us look at the key terms used in the access management process.

3.76 Access Management - Basic Concepts

There are a few basic terms we should be aware of to understand the Access Management process better. We shall now discuss these terms. • The term ‘Access’ refers to the level and extent of a service’s functionality or data that a user is entitled to use. • ‘Identity’ refers to the information that distinguishes an individual and verifies his or her status within the organization. The identity of a user is unique to that user and is normally referred to as the ‘user ID’. • ‘Rights’, also known as privileges, refers to the levels of access provided to a user for a service or group of services. The general rights provided are read only, write, delete, modify, and execute. • The terms ‘Services’ or ‘Service Groups’ refer to a set of services users are entitled to use at the same time. Normally, users performing a similar set of activities use a similar set of services, and hence it is appropriate to provide access to the whole set of services rather than to individual services. • ‘Directory Services’ are specific types of tools used to manage access and rights. These tools assist in efficient management of access provision and control. Let us now proceed to understand the process activities of access management.
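To make these terms concrete, here is a minimal sketch of how identity, rights, service groups and a directory could relate to each other. All class and field names are hypothetical and chosen only for illustration; the `Directory` class is a toy stand-in and does not represent any real directory services product.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Right(Flag):
    # The general rights named in the text; the member names are illustrative.
    READ = auto()
    WRITE = auto()
    DELETE = auto()
    MODIFY = auto()
    EXECUTE = auto()


@dataclass(frozen=True)
class Identity:
    user_id: str      # unique to the user
    name: str
    department: str


@dataclass
class ServiceGroup:
    name: str
    services: set = field(default_factory=set)


@dataclass
class Directory:
    """Toy directory: maps (user ID, service group) to the rights held."""
    grants: dict = field(default_factory=dict)

    def grant(self, identity, group, rights):
        self.grants[(identity.user_id, group.name)] = rights

    def can(self, identity, group, service, right):
        # Access is the combination of a service in the group and a held right.
        if service not in group.services:
            return False
        held = self.grants.get((identity.user_id, group.name), Right(0))
        return right in held


alice = Identity("u1001", "Alice", "Finance")
finance = ServiceGroup("finance_apps", {"ledger", "payroll"})
directory = Directory()
directory.grant(alice, finance, Right.READ | Right.WRITE)
```

Granting rights on the whole service group, rather than on each service separately, mirrors the point above that users with similar activities tend to need the same set of services.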

3.77 Access Management - Process Activities

A process consists of a set of activities that need to be executed in a specific sequence to achieve the desired outcome or objectives. We will now discuss the activities relating to the Access Management process. • The process is initiated by a user or functional manager requesting access to one or more IT components or services. Where there is a Request Fulfilment process in place, these requests are normally routed through that process and the rules for requesting access are documented in the request model. • The next step is verification. The service desk or support team responsible for handling access requests has to verify every request to establish the identity of the person and the legitimacy of the requirement for access to the service or component. • After successful verification and securing the authorization, where required, the designated team will provide the access rights to the user. Care should be taken to identify any rights that might result in a conflict of interest. This can be avoided by careful creation of roles and groups and by appropriate policies and decisions by the business. • Many changes can take place with respect to the people working in an organization: job changes, promotions, transfers, resignations, retirements or dismissals. Access Management should regularly check and monitor identity status, role changes and transfers, and then change the assigned access rights appropriately. • An ongoing activity that needs to be performed by Access Management is to log and track the usage of access rights. Access monitoring and control should be included in the monitoring activities of the relevant service operation functions. Exceptions should be properly handled, and access logs and records should be made available for security-related incidents and breaches. • Access Management is responsible for the removal or restriction of access rights. These are triggered by role changes, transfers, dismissals, retirements or resignations of users.
In the next few slides we will discuss the triggers, inputs, outputs and interfaces of access management.
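The process activities described in this slide (request, verification, providing rights, monitoring identity status, logging and removal of rights) can be sketched as a small in-memory workflow. This is an illustrative sketch under stated assumptions, not a reference implementation: the `AccessRequest` and `AccessManager` names, their fields and the event labels are all hypothetical, and a real organization would rely on directory services and its service management tool instead of Python dictionaries.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AccessRequest:
    user_id: str
    service: str
    approved_by: Optional[str] = None  # None: no authorization was obtained


@dataclass
class AccessManager:
    # Hypothetical in-memory stores, for illustration only.
    active_users: set = field(default_factory=set)
    rights: dict = field(default_factory=dict)  # user_id -> set of services
    log: list = field(default_factory=list)     # (event, user_id, service)

    def handle(self, request):
        """Request -> verification -> providing rights, with logging."""
        # Verification: is the user known, and was the request authorized?
        if request.user_id not in self.active_users:
            return self._record("denied_unknown_user", request)
        if request.approved_by is None:
            return self._record("denied_not_authorized", request)
        # Providing rights.
        self.rights.setdefault(request.user_id, set()).add(request.service)
        return self._record("granted", request)

    def on_user_left(self, user_id):
        """Identity status change: remove every right the user still holds."""
        self.active_users.discard(user_id)
        for service in sorted(self.rights.pop(user_id, set())):
            self.log.append(("revoked", user_id, service))

    def _record(self, event, request):
        # Logging and tracking: every decision is kept for later audits.
        self.log.append((event, request.user_id, request.service))
        return event


am = AccessManager(active_users={"u1001"})
r1 = am.handle(AccessRequest("u1001", "payroll", approved_by="mgr7"))
r2 = am.handle(AccessRequest("u1001", "ledger"))
r3 = am.handle(AccessRequest("u9999", "payroll", approved_by="mgr7"))
am.on_user_left("u1001")
```

Denied requests are logged together with the reason, so that records of both granted and refused access are available for audits and investigations.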

3.78 Access Management - Triggers, Inputs, Outputs And Interfaces

We shall now discuss the triggers for access management. • Where a new or upgraded service is introduced and a large number of users have to be provided access to the service, a request for change is initiated to assign the access rights. • The general form of trigger for access is a service request. Where a Request Fulfilment process is in place, access requests are included under service requests. • Access requests from the human resources department are another general trigger for access management. HR managers are normally responsible for initiating access requests when new people join the organization or when people are promoted, transferred or retire. • Requests from managers for staff working in their departments could be another source. This generally happens when the staff start using an existing or new service. In the next slide we will look at the inputs of access management.

3.79 Access Management - Triggers, Inputs, Outputs And Interfaces

The inputs to the access management process are: • The information security policies developed during service design are a key input to Access Management. These policies are used to design, implement, manage and control the Access Management process and activities. They are also used in the development of access models. • Operational and service level requirements for granting access to services, performing access management administrative activities and responding to access management related events are also important inputs to Access Management. • Authorized RFCs for granting access rights and authorized service requests to grant or terminate access rights are the other inputs.

3.80 Access Management - Triggers, Inputs, Outputs And Interfaces

The outputs from Access Management process are : • Completed access requests by providing access to IT services in accordance with information security policies; • Access management records and history of access granted to services, generally maintained in the service management tools; • Access request records and history where access has been denied along with the reasons for the denial. These are also generally retained in the service management tools; and • Communications made to users and management regarding inappropriate access or abuse of services. Now let us understand the interfaces of access management in the next slide.

3.81 Access Management - Triggers, Inputs, Outputs And Interfaces

Access Management interacts mainly with the Request Fulfilment and Information Security Management processes. There are also other processes with which information exchange as well as dependencies exist. We shall discuss the interfaces with all the relevant processes. • Demand Management helps in identifying the resource levels required for handling the anticipated volume of access requests. • Strategy Management for IT Services provides inputs on the appropriate structure for access management. For example, in large organizations it may be more apt to have access management within individual business units rather than a centralised access management function. • Information Security Management is the process that drives and controls the Access Management process. The security and data protection policies are considered while executing the Access Management process. • Service Catalogue Management provides information on the available services and the means by which they can be accessed. It also provides information on service requests that are processed through Request Fulfilment; access requests form part of these service requests. • There could be a threat of access and security breaches during disasters and major incidents. Access Management interfaces with IT Service Continuity Management to manage access to services during these situations. It is also possible that additional or special access rights will have to be granted to identified resources to restore services after a major disruption. • Service Level Management owns the service level agreements, which include the agreements relating to access to services. These agreements detail the timelines for processing access requests, the costs involved and the criteria for granting access to services. This information is critical for managing and completing access requests. • Access requests may also pertain to configuration items. Service Asset and Configuration Management holds information about configuration items which can be used to determine current access details. This process also provides valuable inputs during audits and investigations. • Some access requests will have to be routed through Change Management, while routine types of access requests may be pre-approved as standard changes. • The Request Fulfilment process provides the methods and means by which users can request the standard services that are available. A majority of access requests fall under the ‘standard services’ category.

3.82 Access Management - CSFs And KPIs

Let us now look at some sample critical success factors and key performance indicators of the Access Management process. • ‘Ensuring that the confidentiality, integrity and availability of services are protected in accordance with the information security policy’ is an essential critical success factor which is in line with the objectives of both Information Security Management and Access Management. The key performance indicators related to this CSF are: ‘the percentage of incidents that involved inappropriate security access or attempts at access to services’; ‘the number of audit findings that discovered incorrect access settings for users that have changed roles or left the company’; and ‘the number of incidents caused by incorrect access settings’. • Another important critical success factor is to ‘provide appropriate access to services on a timely basis that meets business needs’. The related key performance indicator is ‘the percentage of requests for access that were provided within established SLAs and OLAs’. • ‘Provide timely communications about improper access or abuse of services’: this critical success factor indicates the efficiency of the Access Management process. The relevant key performance indicator is ‘the average duration of access-related incidents’. In the next slide we will look at the challenges faced by access management.
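Three of the key performance indicators above are percentages or averages that could be derived from incident and access request records. The sketch below assumes hypothetical dictionary keys (`access_related`, `hours`, `sla_hours`) and made-up sample values; it only shows the arithmetic and is not tied to any particular tool.

```python
from statistics import mean


def access_kpis(incidents, requests):
    """Compute three Access Management KPIs.

    incidents: list of dicts with 'access_related' (bool) and 'hours' (float).
    requests:  list of dicts with 'hours' and 'sla_hours' (floats).
    Both lists are assumed to be non-empty, with at least one
    access-related incident.
    """
    access_incidents = [i for i in incidents if i["access_related"]]
    on_time = [r for r in requests if r["hours"] <= r["sla_hours"]]
    return {
        "pct_access_incidents": 100.0 * len(access_incidents) / len(incidents),
        "pct_requests_within_sla": 100.0 * len(on_time) / len(requests),
        "avg_access_incident_hours": mean(i["hours"] for i in access_incidents),
    }


# Made-up sample data, not real measurements.
incidents = [
    {"access_related": True, "hours": 2.0},
    {"access_related": False, "hours": 5.0},
    {"access_related": True, "hours": 4.0},
    {"access_related": False, "hours": 1.0},
]
requests = [
    {"hours": 3.0, "sla_hours": 8.0},
    {"hours": 10.0, "sla_hours": 8.0},
]
result = access_kpis(incidents, requests)
```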

3.83 Access Management - Challenges And Risks

Access Management is a sensitive and critical area of service management. There are a number of challenges for this process. Some of the important ones are: • Monitoring and reporting on access activity, incidents and problems related to access; • Verifying the identity of a user requesting access to a service or component; • Verifying the identity of the approving person or body; • Verifying that a user qualifies for access to a specific service; • Linking multiple access rights to an individual user; • Determining the status of users at any time; • Managing changes to a user’s access requirements; • Restricting access rights to unauthorized users; and • Building and maintaining a database of all users and the rights that they have been granted. Let’s look at the risks in the next slide.

3.84 Access Management - Challenges And Risks

The risks associated with the Access Management process are: • Lack of appropriate supporting technologies to manage and control access to services will result in an inefficient process that does not meet its objectives. Service management, monitoring and directory services tools are essential for deriving the most out of this process. • Controlling access from ‘back door’ sources is another major risk. Users and managers may try to bypass the process to gain access. This could be due to urgency, too many restrictions or authority conflicts. An efficient logging, monitoring, tracking and auditing mechanism should be put in place to mitigate this risk. • Where external third-party suppliers are involved in managing and controlling access to services, there will always be an element of risk involved. It is essential to have a well-defined organizational security policy in place along with suitable confidentiality clauses in agreements and contracts. • Lack of management support for access management activities and controls is a general risk across all processes. A process can be successful only when there is adequate support from senior management. • Where there are too many levels of management controls and process activities, there is a possibility of hindering the ability of users to conduct business. Appropriate request models should be developed to ensure timely provision of access rights to users. The next slide talks about the roles of access management.

3.85 Access Management - Roles

This slide explains the roles involved in Access Management. Since Access Management is an execution of Information Security and Availability Management, these two areas will be responsible for defining the appropriate roles. It is unusual for an organization to appoint an ‘Access Manager’, although it is important that there is a single Access Management process and a single set of policies related to managing rights and access. This process and the related policies are likely to be defined and maintained by Information Security Management and executed by the various Service Operation functions. Their activities can be summarized as follows. The Service Desk is typically used as a means to request access to a service. This is normally done using a Service Request. The Service Desk will validate the request by checking that the request has been approved at the appropriate level of authority, that the user is a legitimate employee, contractor or customer and that they qualify for access. The Service Desk will also be responsible for communicating with the user to ensure that they know when access has been granted and to ensure that they receive any other required support. Where IT operations are separated from technical or application management, it is common for operational management teams to look after operational access management tasks. Operational management team members will be tasked with providing and modifying access, monitoring identity status and revoking access to key systems or resources. The operations bridge, if one exists, can be used to monitor events related to access management and can even provide first-line support and coordination in the resolution of those events where appropriate. Technical and Application Management play several important roles. During Service Design, they will ensure that mechanisms are created to simplify and control Access Management on each service that is designed.
They will also specify ways in which abuse of rights can be detected and stopped. During Service Transition, Technical and Application Management will test the service to ensure that access can be granted, controlled and prevented as designed. During Service Operation these teams will typically perform Access Management for the systems under their control. It is unusual for teams to have a person dedicated to Access Management, but each manager or team leader will ensure that the appropriate procedures are defined and executed according to the process and policy requirements. This summarizes the process module. Now let us move to our next module, on Operational Activities of processes.

3.86 Summary

Let us now quickly summarize the topics covered under Service Operation processes. In this unit we discussed: • Event Management • Incident Management • Problem Management • Request Fulfillment • Access Management. With this we have come to the end of Learning Unit 3. Before proceeding to the next unit, do not forget the quiz section!
