OPMANTEK NETWORK MANAGEMENT AND IT AUDIT SOFTWARE
Configuring Event Escalation & Notifications in NMIS8 & opEvents
v1.0 May 2018

1 Housekeeping
- Attendees will be on mute during the presentation to prevent interruptions from feedback and background noise.
- If you wish to ask a question, please ask via GoToWebinar's chat.
- We will have a Q&A session at the end and have allowed lots of time.
- This session will be recorded and made available to all attendees.

2 Topics for Today
- NMIS8 Thresholding System
- How to Build and Maintain Escalation Policies and Rules in NMIS8
- Expanding NMIS8's Prebuilt Notification System
- How opEvents Complements and Expands NMIS

IT Service Management Maturity Model
[Figure: five maturity levels; performance and value to the organization increase from Level 0 to Level 4]
- Level 0 CHAOTIC: Ad hoc; Undocumented; Unpredictable; Multiple help desks; Minimal IT operations; User call notification
- Level 1 REACTIVE: Fight fires; Inventory; Desktop software distribution; Initiate problem management process; Alert and event management; Measure component availability (up/down)
- Level 2 PROACTIVE: Analyze trends; Set thresholds; Predict problems; Measure application availability; Automate; Mature problem, configuration, change, asset and performance mgmt. processes
- Level 3 SERVICES: IT as a service provider; Define services, classes, pricing; Understand costs; Guarantee SLAs; Measure and report service availability; Integrate processes; Capacity mgmt.
- Level 4 VALUE: IT as a strategic business partner; IT and business metric linkage; IT/business collaboration improves business process; Real-time infrastructure; Business planning
- Transition labels in the figure: Tool Leverage; Operational Process Engineering; Service Delivery Process Engineering; Service & Account Management; Manage IT as a Business

3 Architecting a Solution
- Open-Source: NMIS: Core Performance and Fault Monitoring
- Commercial Solutions: opEvents: Intelligent event response

NMIS8 THRESHOLDING

4 Thresholding
How to Intelligently Adjust Alarm Points

NMIS Static Thresholding
- 6 threshold levels defined: Normal/Warning/Minor/Major/Critical/Fatal
- Can be rising or falling
- Events are normalized across similar OIDs and manufacturers
- Setup -> Thresholding Alert Tuning
- Thresholds stored in Common-threshold.nmis

NMIS Thresholding Alert Tuning
WHY: Predefined thresholds provide reliable, out-of-the-box fault management, while multiple threshold levels provide further flexibility for escalation and notification.
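As a rough illustration of how six levels with rising or falling direction can work, the sketch below classifies one measured value. It is written in Python for readability and is not NMIS code; the function name `classify` and the boundary numbers are assumptions chosen for the example, not values taken from Common-threshold.nmis.

```python
# Illustrative only: NMIS itself is Perl and keeps thresholds in Common-threshold.nmis.
LEVELS = ["Normal", "Warning", "Minor", "Major", "Critical", "Fatal"]


def classify(value, boundaries, rising=True):
    """Map a value to one of the six levels.

    boundaries holds five numbers, one each for Warning..Fatal, ordered from
    least to most severe. For a rising threshold a boundary is crossed when
    the value is at or above it; for a falling threshold, at or below it.
    """
    if len(boundaries) != 5:
        raise ValueError("expected five boundaries (Warning..Fatal)")
    if rising:
        crossed = sum(1 for b in boundaries if value >= b)
    else:
        crossed = sum(1 for b in boundaries if value <= b)
    return LEVELS[crossed]


# Hypothetical boundaries: CPU load rises towards trouble, availability falls.
CPU_BOUNDARIES = [60, 70, 80, 90, 95]
AVAILABILITY_BOUNDARIES = [95, 90, 80, 70, 60]

print(classify(85, CPU_BOUNDARIES))                         # Major
print(classify(85, AVAILABILITY_BOUNDARIES, rising=False))  # Minor
```

Keeping the five boundaries in severity order lets the same counting logic serve both directions; only the comparison flips.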

5 NMIS8 ESCALATION POLICIES AND RULES

NMIS8 Escalation Policies and Rules
Basic steps for creating event escalation and notifications
1. Add Contacts
2. Create escalation policies
3. Test Notification
4. Adjust Configuration (if desired)

6 NMIS8 Escalation Periods
Determining How Often Threshold Exceptions are Escalated
- System -> System Configuration -> NMIS Configuration, select escalation
- Eleven defined escalation points
- Time periods are in seconds
- Default settings start at Time Zero (event detection) and escalate up to 24 hours
- Escalation for human action should only start once a second polling cycle has completed (whatever that period is), e.g. 5 minutes

NMIS8 Contacts
Determining How Often Threshold Exceptions are Escalated
- System -> System Configuration -> Contacts
- Also: Setup -> Contact Setup
- Define:
  - DutyTime start and stop in hours (24hr clock)
  - DutyTime days of the week
  - NMIS8 Event Level
  - Location (optional, used to control time)
  - Mobile, Pager, and Phone #
  - TimeZone; leave blank if same as server
- Stored in /usr/local/nmis8/conf/contacts.nmis
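The relationship between event age, escalation points and a contact's DutyTime can be sketched as follows. This is a Python illustration, not NMIS code: the eleven period values are placeholders that merely start at zero and end at 24 hours, and the `contact` dictionary layout and both function names are assumptions made for the example.

```python
from datetime import datetime

# Eleven illustrative escalation points in seconds (0 = event detection, 86400 = 24 hours).
# These are placeholder values, not a statement of the shipped NMIS defaults.
ESCALATION_PERIODS = [0, 300, 900, 1800, 2400, 3600, 7200, 10800, 21600, 43200, 86400]


def escalation_point(event_age_seconds, periods=ESCALATION_PERIODS):
    """Return the index of the highest escalation point the event has reached."""
    reached = [i for i, period in enumerate(periods) if event_age_seconds >= period]
    return reached[-1] if reached else None


def on_duty(contact, when):
    """True if `when` falls inside the contact's duty days and duty hours.

    Simplification: a duty window that runs past midnight is not handled.
    """
    return when.weekday() in contact["days"] and contact["start"] <= when.hour < contact["stop"]


# Hypothetical contact: on duty Monday to Friday (0-4), 07:00 to 19:00.
contact = {"start": 7, "stop": 19, "days": {0, 1, 2, 3, 4}}

print(escalation_point(299))                          # 0
print(escalation_point(300))                          # 1
print(on_duty(contact, datetime(2018, 5, 7, 9, 0)))   # True (a Monday morning)
```

With a 5-minute polling cycle, the first notification aimed at a person would sit at index 1 or later, so that a second poll has confirmed the condition.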

7 NMIS8 Escalation Policy
What Actions to Take at Each Escalation Period
- System -> System Configuration -> Escalation Policy
- Also: Setup -> Emails, Notifications and Escalations
- Six Built-In Notification Methods: syslog, json, email, ccopy, pager and netsend
- These can be expanded on

EXPANDING NMIS8'S NOTIFICATION SYSTEM

8 Expanding NMIS8's Notification System
Creating Custom Notification Methods
- Notification Methods are written in Perl
- Methods have access to the Contact's details entries: Contact Name, DutyTime, Email, Location, Mobile, Pager, Phone, TimeZone
- Methods have access to the NMIS Event Details: Ack, details, element, escalate, level, node, time, notify
- Notifications are stored in /usr/local/nmis8/lib/Notify/
- Start by copying an existing Notification method (e.g. sms.pm or critical.pm)
- Adjust the event matching criteria as well as what needs to be done
- Test by assigning the method to T0 (Time Zero) for every event, and watch the output in event.log

OPEVENTS
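For the NMIS8 custom notification methods described on the previous slide, the "match the event, then act" shape can be sketched as below. Real methods are Perl modules, so this Python version only illustrates the logic. The field names follow the contact and event fields listed on the slide; the function name `notify`, the matching rule (unacknowledged Critical or Fatal events only), the sample values and the `send` callback are all assumptions made for the example.

```python
def notify(event, contact, send=print):
    """Send one message for a matching event; return True if something was sent.

    `send` stands in for whatever delivery the method performs (SMS gateway,
    chat webhook, and so on) and defaults to printing.
    """
    # Matching criteria (hypothetical): skip acknowledged or low-severity events.
    if event.get("ack"):
        return False
    if event["level"] not in {"Critical", "Fatal"}:
        return False

    # Use a contact detail; without a Mobile entry there is nowhere to send.
    target = contact.get("Mobile")
    if not target:
        return False

    message = f'{event["level"]}: {event["node"]} {event["element"]} - {event["details"]}'
    send(f"to {target}: {message}")
    return True


# Hypothetical event and contact records.
event = {"ack": False, "level": "Critical", "node": "router1",
         "element": "Gi0/1", "details": "Interface Down"}
contact = {"Contact": "oncall", "Mobile": "555-0100"}

notify(event, contact)
```

Passing the delivery step in as a callback keeps the matching logic testable without sending anything, which mirrors the advice to test at T0 and watch the log before relying on a new method.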

9 Opmantek Application Flow
[Diagram: Master and Poller/Collector servers. Components shown: NMIS, opEvents, opCharts, opReports, opHA, opFlow, opConfig, service monitor. Flows labelled: metadata, reports, detail-link, meta-events, summary, api, events. Data sources per subnet: SNMP / WMI, SNMP trap, syslog, CLI data, NetFlow data.]

opEvents
Advanced Fault Management and Operational Automation
WHY: Expands on efforts already done through NMIS, and scientifically improves automated response, thereby decreasing workload and improving operational efficiency.
- Enhances and builds on NMIS Thresholding, Escalation and Notification systems
- Supports whitelisting and blacklisting of events
- Handles event correlation, deduplication, event storms, and event flap
- Allows application of event Actions, or responses to events
- Supports flexible escalation and notification
- Supports custom templates per contact

10 Event Processing Flow
These are all background processes:
- Apply Archive List
- Apply Blacklist
- Apply Whitelist
- Correlate Events into Outages
- Deduplicate Events
- Determine Priority
- Conduct Actions
- Start Escalations

opEvents Configuration Files
- Whitelist/Blacklist/Archive: EventListRules.nmis
- Correlation/Suppression: EventRules.nmis
- Policies/Actions/Escalations: EventActions.nmis

11 Event Correlation
- Rules are defined in: EventRules.nmis
- All correlations are time based, then grouped as defined
- Group events by location, group, customer, etc. and roll them up into a synthetic event
- Define a window of time, and a minimum number of events for synthesis
- Inhibit correlation firing for a defined time window
- Delay action processing following synthetic event firing
- Synthetic events can be rolled up into other synthetic events

Event Actions
- All Event Actions are defined in: EventActions.nmis
- Actions are stored in the Script section of EventActions.nmis
- Actions can be called from any section (e.g. Policy, Escalate) as script.scriptname()
- Actions can do anything, from troubleshooting to remediation
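The window-and-minimum-count idea behind event correlation can be sketched as follows. This is a Python illustration of the concept, not opEvents code or EventRules.nmis syntax: the function name `correlate`, the event dictionary layout, the synthetic event name and the default numbers are assumptions. Firing at most once per group is a crude stand-in for the inhibit window.

```python
from collections import defaultdict


def correlate(events, window=60, minimum=3, key="location"):
    """Roll grouped events up into synthetic events.

    events: dicts with a 'time' in seconds and a grouping field named by `key`.
    A synthetic event is produced for a group as soon as `minimum` events fall
    within `window` seconds of each other.
    """
    groups = defaultdict(list)
    for ev in sorted(events, key=lambda e: e["time"]):
        groups[ev[key]].append(ev)

    synthetic = []
    for group, evs in groups.items():
        start = 0
        for end in range(len(evs)):
            # Slide the window start forward until it spans at most `window` seconds.
            while evs[end]["time"] - evs[start]["time"] > window:
                start += 1
            count = end - start + 1
            if count >= minimum:
                synthetic.append({"event": "Synthetic Outage", key: group,
                                  "time": evs[end]["time"], "count": count})
                break  # fire once per group in this sketch
    return synthetic


# Hypothetical input: site A has three events in 20 seconds, site B is spread out.
events = [
    {"time": 0, "location": "A"}, {"time": 10, "location": "A"}, {"time": 20, "location": "A"},
    {"time": 0, "location": "B"}, {"time": 100, "location": "B"}, {"time": 200, "location": "B"},
]
print(correlate(events))
```

Because a synthetic event is itself just an event with a time and a group, the same function could be run again over its output with a different `key`, which is one way to picture synthetic events rolling up into other synthetic events.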

12 Event Escalation
- Event Escalation Rules are defined in: EventActions.nmis
- Escalations are stored in the Escalate section of EventActions.nmis
- Escalations can be called from any section (e.g. Policy, Script) as escalate.policyname()
- Escalations can call any Policy Action (e.g. Script, Log, Action), with one exception: an Escalation policy cannot call another escalate.policyname()
- Escalation timing is in seconds from when the Event that called the Escalation started
- Escalations run while the event driving the policy rule is in effect
- All rules in an IF statement must evaluate as true to match an event

Notification
- Event Notification Rules are defined in: EventActions.nmis
- conf/opCommon.nmis sets global parameters
- conf/contacts.nmis is symlinked to nmis8/conf/contacts.nmis, but can be separate
- conf/EventEmails.nmis defines which email templates to use for a particular Contact
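Two of the opEvents escalation rules above lend themselves to a small sketch: every condition in an IF must hold for the rule to match, and escalation steps are timed in seconds from the start of the triggering event and only run while that event is in effect. The Python below illustrates those two ideas only; it does not reproduce EventActions.nmis syntax, and the function names, dictionary layouts and step labels are assumptions made for the example.

```python
def rule_matches(event, conditions):
    """True only if every condition holds for the event (logical AND)."""
    return all(event.get(field) == wanted for field, wanted in conditions.items())


def due_steps(escalation, event_start, now, event_active=True):
    """Return the escalation steps whose offset has been reached.

    escalation maps an offset in seconds (from the event start) to a step label.
    Nothing is due once the event is no longer in effect.
    """
    if not event_active:
        return []
    age = now - event_start
    return [step for offset, step in sorted(escalation.items()) if age >= offset]


# Hypothetical rule and escalation policy.
conditions = {"level": "Critical", "group": "Core"}
escalation = {0: "log", 300: "notify on-call", 900: "notify manager"}

event = {"level": "Critical", "group": "Core", "node": "router1"}
if rule_matches(event, conditions):
    print(due_steps(escalation, event_start=1000, now=1400))  # ['log', 'notify on-call']
```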

13 CONTACT FOR FOLLOW UP
Commercial enquiries: Tom Wiri, Account Executive, +1 (512)
Technical enquiries: Mark Henry, Senior Engineer, +1 (207)