An incident is an unplanned interruption to a service or a reduction in the performance or reliability of a service


1 Incident Management

2 An incident is an unplanned interruption to a service or a reduction in the performance or reliability of a service.

3 Purpose of Incident Management
Restore normal service operation as quickly as possible and minimize the adverse impact on university operations, ensuring that agreed levels of service quality are maintained.

4 Goals of Incident Management
- Timely resolution
- Maximize service availability
- Manage customer communications
- Improve coordination and communication between groups
- Integrate with existing ITIL processes

5 Standard vs Major Incidents
- Standard incidents are incidents prioritized as priority 2, 3, or 4.
- Major incidents are incidents prioritized as priority 1.
- Each follows a slightly different process.

6 Priority Matrix

7 Priority Matrix: What to take into consideration
- Err on the side of assigning a higher priority.
- Impact is determined from user reports or from informed inference by IS staff, based on the number of users/departments using a service.
- Department-based criteria take precedence over the number-of-users criteria.
- VIP individuals automatically get moved up in priority by 1.
- For incidents in which the unplanned outage of an IS service is secondary to, and caused by, some other outage (e.g. a power outage), the Incident Commander (IC) has the option of moving the priority down by 1.
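
The two adjustment rules above (VIP moves up by 1, IC may move down by 1 for a secondary outage) can be sketched as a small function. This is only an illustration: the function and parameter names are hypothetical, "up" is read as a numerically lower priority (1 is the highest), and clamping the result to the 1 to 4 range is an assumption the slide does not state. The base priority is taken as given, since the Priority Matrix itself is not reproduced in the text.

```python
def adjust_priority(base_priority: int,
                    is_vip: bool = False,
                    ic_lowers_for_secondary_outage: bool = False) -> int:
    """Return the adjusted priority, where 1 is highest and 4 is lowest."""
    if base_priority not in (1, 2, 3, 4):
        raise ValueError("priority must be 1, 2, 3, or 4")

    adjusted = base_priority
    if is_vip:
        # "Moved up in priority by 1" is read as a numerically lower value.
        adjusted -= 1
    if ic_lowers_for_secondary_outage:
        # Optional: the IC may move priority down by 1 when the outage is
        # secondary to, and caused by, some other outage.
        adjusted += 1

    # Assumption (not in the slides): results stay within priorities 1..4.
    return max(1, min(4, adjusted))
```

For example, `adjust_priority(3, is_vip=True)` returns 2, and a priority 1 incident reported by a VIP stays at priority 1.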

8 Standard Incident Process
All incidents start as Standard Incidents.

9 Standard Incident Process Cont.
Incident Identification & Logging
- Confirm the reported outage is unplanned.
- Create an Incident ticket in JIRA or notify the Tech Desk of the reported incident.
Prioritize
- If priority 1, initiate the Major Incident Management process.
- If identified as a security incident and the impact is medium or high, regardless of the urgency, contact security@ithelp.uoregon.edu.
Investigation & Diagnosis
- If you lack the expertise or access to resolve the incident, follow the Functional Escalation.
- If management is needed to resolve the incident, follow the Management Escalation.
Resolution & Closure
- If any changes were made, submit an Emergency Change request.
- Verify resolution with the customer.
- Ensure all incident details are documented.
- Transition the JIRA ticket to Closed.
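
The prioritization step above amounts to two independent checks, which the following sketch makes explicit. The `Incident` class, its field names, and the returned action strings are hypothetical. The slide gives the security address without a verb, so "contact" is an assumption.

```python
from dataclasses import dataclass
from typing import List

SECURITY_CONTACT = "security@ithelp.uoregon.edu"


@dataclass
class Incident:
    priority: int               # 1 (highest) to 4 (lowest)
    is_security: bool = False
    impact: str = "low"         # assumed values: "low", "medium", "high"


def triage(incident: Incident) -> List[str]:
    """Return the follow-up actions implied by the prioritization step."""
    actions = []
    if incident.priority == 1:
        actions.append("initiate Major Incident Management process")
    else:
        actions.append("continue Standard Incident process")
    # The security check applies regardless of urgency.
    if incident.is_security and incident.impact in ("medium", "high"):
        actions.append("contact " + SECURITY_CONTACT)
    return actions
```

The two checks are kept separate because a security incident can be either Standard or Major.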

10 Major Incident Process
Incidents prioritized as priority 1.

11 Major Incident Process Cont.
Engage an Incident Commander (IC) via the IS Chatroom or by calling the new IC Hotline.
IC responsibilities:
1. Mobilize resources
2. Coordinate Major Incident communications
3. Send/post messages (PIO responsibility)
4. Lead mobilized resources
5. Perform post-restoration activities and close the incident
6. Send/post messages (PIO responsibility)
7. Prepare After Action Report

12 Major Incident Process Cont.
Mobilized Resources: individuals responsible for conducting activities determined by the Incident Commander.
Mobilized Resources responsibilities:
- Analyze symptoms of the reported incident
- Restore a failed IT service as quickly as possible
- Provide workarounds for the impacted service if appropriate
- Resolve incidents and implement changes for a given service
- Document the incident resolution or workaround in a KB document (e.g. Confluence)
- Contribute information to the Incident Management process to inform on status, workarounds, incident cause, and resolution details
- Open Change Management tickets to help with incident resolution if appropriate
- If necessary, request Functional or Management escalations to help resolve an incident

13 Response Targets
In the absence of a Service Level Agreement (SLA) for a service, the Response Target times are guidelines and a best-effort approach to restoring normal service operations as quickly as possible and minimizing the adverse impact on business operations.
- Priority 2, 3, and 4 are limited to business hours only.
- After hours, priority 1 incident Response Target times are doubled.
[Table: Response Targets by priority. Columns: Priority 1, Priority 2, Priority 3, Priority 4. Rows: Time-to-act, Time-to-escalation, Time-to-communicate (15min).]
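
The business-hours and after-hours rules can be expressed as a small helper. This is a sketch under stated assumptions: the function name is hypothetical, the base target is passed in by the caller because the table values are not available here, and returning `None` for priorities 2 to 4 after hours is one way to model "limited to business hours only".

```python
from datetime import timedelta
from typing import Optional


def effective_target(priority: int,
                     base_target: timedelta,
                     after_hours: bool) -> Optional[timedelta]:
    """Return the Response Target that applies, or None if none applies."""
    if priority not in (1, 2, 3, 4):
        raise ValueError("priority must be 1, 2, 3, or 4")
    if not after_hours:
        return base_target
    if priority == 1:
        # After hours, priority 1 Response Target times are doubled.
        return base_target * 2
    # Priorities 2, 3, and 4 are limited to business hours only.
    return None
```

With a purely illustrative base target of 30 minutes, a priority 1 incident after hours gets `timedelta(hours=1)`.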

14 How to create a JIRA Incident ticket
Report and record all incidents using JIRA, which replaces Confluence for this purpose.
1. Click on Create issue.
2. Ensure the Project Type is Incident Management.
3. Complete the following required fields:
   - Summary
   - Incident Reported By
   - Incident Symptoms
   - Services Impacted
   - Number of Users Impacted
   - Date & Time Incident Reported
   - Incident Priority (see Incident Prioritization)
4. Click Create.
Based on the Priority selected, a Standard or Major Incident sub-task will be created, and the incident type (Standard or Major) will be noted under Details for the incident ticket.
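
As a pre-submission check, the required fields can be validated before the ticket is created. The sketch below does not call JIRA or use any JIRA API. It treats a draft ticket as a plain dictionary keyed by the field labels from the slide, and both function names are hypothetical. The Standard/Major rule follows the earlier definition that priority 1 incidents are Major.

```python
from typing import Dict, List

REQUIRED_FIELDS = [
    "Summary",
    "Incident Reported By",
    "Incident Symptoms",
    "Services Impacted",
    "Number of Users Impacted",
    "Date & Time Incident Reported",
    "Incident Priority",
]


def missing_fields(ticket: Dict[str, object]) -> List[str]:
    """List required fields that are absent or empty in a draft ticket."""
    return [name for name in REQUIRED_FIELDS if ticket.get(name) in (None, "")]


def incident_type(priority: int) -> str:
    """Priority 1 is a Major incident; priorities 2 to 4 are Standard."""
    if priority not in (1, 2, 3, 4):
        raise ValueError("priority must be 1, 2, 3, or 4")
    return "Major" if priority == 1 else "Standard"
```

A draft with only a summary and a priority would report the other five fields as missing.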

15 JIRA Incident Statuses
Status: Description
New: This is the state of your incident ticket while you are filling out the JIRA Incident ticket.
In-progress: Based upon the priority chosen, a Standard or Major Incident sub-task will be created. Complete the Investigation tab with information relating to the incident.
Resolved: Complete the Resolution tab with information relating to the incident. Confirm resolution with customers.
Closed: Once resolution has been confirmed successful, close the incident.
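
The four statuses can be modeled as a simple ordered workflow. This sketch assumes tickets move strictly forward one status at a time, which the slide implies but does not state. The actual JIRA workflow may allow other transitions, and the names below are hypothetical.

```python
STATUSES = ["New", "In-progress", "Resolved", "Closed"]


def can_transition(current: str, target: str) -> bool:
    """Allow only a move to the next status in the list (assumed rule)."""
    # list.index raises ValueError for a status that is not in STATUSES.
    return STATUSES.index(target) - STATUSES.index(current) == 1
```

Under this assumption, `can_transition("Resolved", "Closed")` is `True`, while skipping from "New" straight to "Closed" is rejected.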

16 Incident Management Benefits
Interim IM process used August 2014 to March 2016:
- Improved awareness/visibility
- Improved collaboration and coordination between the teams
- Consolidated communications channels
- Reduced duplication of effort
- Faster time to restoration

17 How you can help!
- Report unplanned service outages or degradations by creating an Incident JIRA ticket or by reporting the incident to the Technology Service Desk (6-HELP).
- Be patient as we work out the bugs in the next few months.
- In order to continually improve this process, we will need your feedback. Send your feedback to Patrick or me.

18 Thank you!