Optanix Platform The Technical Value: How it Works POSITION PAPER

Table of Contents

The Optanix Clean Signal
Active IT Managed Services
Data Acquisition and Monitoring
  The Ingestion Engine
  The Snapshot Processor
  Synopsis
Root Cause Analysis and Remediation Information
  The Correlation Engine
  The Decision Engine
  Synopsis
Case Management, Visualization and Notification Modules
  Case Management
  Visualization
  Notification
  Business Intelligence
  Synopsis
Automated Workflows
Glossary
  Optanix Product Definitions
About Optanix
About the Optanix Platform

The Optanix Clean Signal

The problem for today's IT organizations is that operational infrastructures generate tidal waves of events, often millions of unqualified raw events and associated alerts per day, inundating IT teams beyond their ability to manage, even with an army of engineers. That flood of event data puts those IT teams in reactive mode and slows time to resolution. As a result, key questions remain unanswered:

- Which business services and users are impacted, and to what degree?
- What is really causing this event?
- What impact will this have on my SLAs?
- Could this have been avoided?
- Can we fix this only once?

With Optanix's patented technology, millions of raw IT events are captured and processed to pinpoint root cause and distill true business impact into a single event, generating a Clean Signal. The Clean Signal accelerates remediation, with a proven 91% first-time fix rate for more predictable service performance and availability.

With Optanix, IT teams can fundamentally shift from a passive IT service delivery model, where IT constantly reacts to the tidal wave of events with multiple tools that just manage symptoms, to an active service delivery model that delivers predictable infrastructure and known state.

Optanix's unique Clean Signal is the result of filtering out irrelevant events to reveal the true root cause and its true business impact in a single, actionable event. We automatically pinpoint, verify and validate the cause of the event, and actively remediate by preparing the right information and getting it to the right person. The Clean Signal also informs meaningful role-based visualizations, both desktop and mobile. With the Clean Signal, IT teams see the whole infrastructure picture, focus on what matters most to the business, actively remediate only real issues and drive more predictable outcomes.

Active IT Managed Services

The future of business success requires transforming to a new IT service delivery model that is fast, predictable, always on and secure. Instead of a passive model, rife with service disruptions and uncertainty, IT must adopt an active IT service delivery model that delivers predictable outcomes and known infrastructure state, moving:

- From managing a tidal wave of events to a Clean Signal, pinpointing root cause/business impact
- From slow reaction time, flying blind, to accelerated, first-time remediation
- From pointing fingers and staff frustration to full accountability and empowered people
- From multiple, overlapping tools to a single, unified platform

The power of the Clean Signal is at the heart of the Optanix approach to an active IT infrastructure management model. The Optanix Platform uses patented snapshot processing and dependency analysis to identify root cause events.

While many other systems boast similar capabilities, the Optanix Platform goes further by verifying and validating the root cause. The Decision Engine then determines remedial actions based on the built-in knowledge base, which houses intelligent rules and workflows. The Optanix Platform can automatically remediate the issue or open a ticket with the appropriate support person. The outcome is a Clean Signal that:

- Identifies the root cause of the issue
- Verifies and validates the source of the problem to confirm it still exists
- Determines the remedial action that needs to be performed

The Optanix Platform can use the Clean Signal to respond proactively and remediate issues. The problem is often remediated before it escalates and can then be reported as resolved. Therein lies the true nature of the platform's active model for managing the IT infrastructure, and the results are impressive:

- 95% proactive response rate (26% industry average) [i]
- 91% first-time fix rate
- 300% more proactive response than the industry average [ii]
- 87% false positives identified and discarded
- 50% reduction in incident remediation time

The focus of the platform is not to automate everything, but to automate routine tasks (e.g., event triage, reboot cycling). It then routes the information to the right support professional for remediation and reporting.

Data Acquisition and Monitoring

For the Optanix Platform to provide full visibility, it must consume large amounts of time-sensitive data from applications, systems, devices, other domain-specific management solutions and user-generated sources. Gathering data from every possible source in the ecosystem is the first step to successfully managing the IT infrastructure. These data points can only be evaluated for meaningful insights once this key step is executed efficiently.

The Ingestion Engine

The Ingestion Engine performs the crucial task of data acquisition. Discovery begins by establishing which devices are in the network. Customized algorithms that focus on Layer 2 and Layer 3 are used to build comprehensive topology information. This information is used to understand the dependencies and relationships between network elements and devices. For each device, the Optanix Platform uses best-practice templates to establish what the device is, what monitoring is needed and what mission-critical services are on that device.

By taking into account the scope and scale of each source, the Ingestion Engine determines the best way to monitor the infrastructure. This reduces the diagnostic churn that adversely impacts performance in competitive approaches. The Optanix Platform can aggregate data from multiple sources and integrate with other monitoring solutions via industry-standard APIs, allowing legacy monitoring solutions to be leveraged in an entirely new way.

The Snapshot Processor

Once raw events are gathered from network elements, the Snapshot Processor determines changes in the network state. A snapshot is essentially a picture of the network state at a particular point in time. The Snapshot Processor compares consecutive snapshots to identify the changes in the state of the network. These changes are then used to generate status change events that focus directly on issues in the network. The snapshot comparison-based approach provides a comprehensive view of the entire network as it changes over time and rapidly brings attention to meaningful data. This approach effectively handles the tidal wave of events that bogs down other solutions, which struggle to cope with the exponential onslaught of incoming network events.

Synopsis

With Optanix, you always know where you stand. The Optanix Platform collects high volumes of event data from any source, from end to end, using the Ingestion Engine. The Snapshot Processor continually audits these data points to generate an accurate picture of the current state of the IT infrastructure. Together, the Ingestion Engine and the Snapshot Processor efficiently and continually capture meaningful data. The outcome of these processes is a set of status change events, which are passed on to the next two stages of the process: the Correlation Engine and then the Decision Engine.
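
The snapshot comparison described under The Snapshot Processor can be illustrated with a minimal sketch. It assumes a snapshot is simply a mapping from element name to state; the names `StatusChangeEvent` and `diff_snapshots` and the sample elements are hypothetical and do not reflect the platform's patented implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StatusChangeEvent:
    """Hypothetical record of one element whose state differs between snapshots."""
    element: str
    old_state: Optional[str]  # None means the element was absent before
    new_state: Optional[str]  # None means the element has disappeared


def diff_snapshots(previous: dict, current: dict) -> list:
    """Compare two consecutive snapshots and emit one event per changed element."""
    events = []
    for element in sorted(previous.keys() | current.keys()):
        old = previous.get(element)
        new = current.get(element)
        if old != new:
            events.append(StatusChangeEvent(element, old, new))
    return events


# Two consecutive snapshots of a small, made-up network.
snap_t0 = {"router-1": "up", "switch-1": "up", "mail-1": "up"}
snap_t1 = {"router-1": "up", "switch-1": "down", "mail-1": "down", "web-1": "up"}

changes = diff_snapshots(snap_t0, snap_t1)
for event in changes:
    print(event)
```

Only the elements whose state changed produce events, so an unchanged network produces none, regardless of how many raw events arrived between the two snapshots.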

Root Cause Analysis and Remediation Information

The Optanix Platform uses time, topology and event-based correlation techniques to determine root cause events. Once root cause events are identified, automated workflows are used to verify and validate the issue. Then, appropriate remedial actions are determined and recommended. These activities are performed within the Correlation and Decision Engines.

The Correlation Engine

The Correlation Engine processes the status change events to determine root cause. A variety of techniques are used, including:

- Topology-based correlation, which understands the relationships between network elements, devices and services
- Dependency-based correlation, which understands how one entity affects the state of another upstream or downstream device
- Element-level event correlation, which understands the sequence of events from an entity. For example, if a device goes down, restarts and comes back up, event correlation clears the original event.

The output of the Correlation Engine is a set of clearly identified root cause events.
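
Two of the techniques listed above can be sketched in a few lines. This is an illustrative simplification, not the Correlation Engine's actual logic: it assumes the topology is available as a mapping from each element to the upstream elements it depends on, and the function names `find_root_causes` and `open_events` are hypothetical.

```python
def find_root_causes(depends_on: dict, down: set) -> set:
    """Dependency-based correlation (simplified).

    A down element is kept as a root-cause candidate only if none of its
    upstream dependencies is also down; otherwise it is treated as a symptom.
    """
    return {
        element
        for element in down
        if not any(upstream in down for upstream in depends_on.get(element, []))
    }


def open_events(sequence: list) -> set:
    """Element-level event correlation (simplified).

    A later 'up' from the same element clears its earlier 'down' event.
    """
    still_open = set()
    for element, state in sequence:
        if state == "down":
            still_open.add(element)
        elif state == "up":
            still_open.discard(element)
    return still_open


# Assumed topology: mail-1 and web-1 sit behind switch-1, which sits behind router-1.
depends_on = {
    "switch-1": ["router-1"],
    "mail-1": ["switch-1"],
    "web-1": ["switch-1"],
}
down_now = {"switch-1", "mail-1", "web-1"}
print(find_root_causes(depends_on, down_now))

# ap-7 went down and came back up, so only switch-1 remains open.
history = [("ap-7", "down"), ("switch-1", "down"), ("ap-7", "up")]
print(open_events(history))
```

In this toy topology three down elements reduce to a single candidate, `switch-1`. A real topology with redundant paths would need a richer rule than "any upstream element is down".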

The highly efficient Correlation Engine performs root cause identification that automatically determines which events are meaningful and which are just noise. This reduces event volumes by a factor of up to 1,000,000, creating actionable intelligence that pinpoints the real reason underlying business service issues.

The Decision Engine

These root cause events are then fed into the Decision Engine. A state-machine-based intelligent analysis is conducted to generate verified valid events. To validate these events, the Decision Engine verifies that the issues still exist by performing on-demand polling of network devices. Verified valid events give significantly greater confidence in the cause of issues in a fluid network environment. This reduces the remediation churn often found in alternative solutions. Verified valid events point to the real reason behind business service issues. The Decision Engine further determines the business services affected and the remediation activities needed to correct the issue. These items collectively constitute the Clean Signal.

These capabilities are powered by Optanix's patented Advanced Logic Profiles (ALPs): intelligent rulesets and automated best practices for managing specific types of business services and IT technologies. ALPs encode Optanix's extensive IT support experience, hundreds of person-years spent managing real-world customer environments, which has resulted in more than 2 million built-in event rules.

Synopsis

Once the Optanix Platform generates the Clean Signal, it has information about the root cause, the services affected and the suggested remedial action. The Optanix Platform can also resolve issues automatically. Using its powerful orchestration capabilities, the platform can perform a wide range of remedial actions, such as resetting servers, restarting applications and reconfiguring devices. Support staff can trigger these actions in response to issues, or the Optanix Platform can trigger them automatically as defined in the ALPs, providing zero-touch remediation of common service and infrastructure issues. This dramatically shortens Mean Time to Resolve (MTTR), avoids IT staff having to manually access IT devices, and ensures that remediation actions are performed consistently and accurately.
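
The verification step described under The Decision Engine can be sketched as a small state machine. Everything here is an assumption made for illustration: the state names, the `poll` callback standing in for on-demand polling, the three-attempt rule and the `REMEDIATION_RULES` table are hypothetical and are not taken from the ALPs.

```python
from enum import Enum
from typing import Callable, Optional


class EventState(Enum):
    ROOT_CAUSE = "root_cause"          # output of correlation, not yet verified
    VERIFIED_VALID = "verified_valid"  # fault confirmed by on-demand polling
    DISCARDED = "discarded"            # fault no longer present


def verify(element: str, poll: Callable[[str], str], attempts: int = 3) -> EventState:
    """Re-poll the element; keep the event only if every poll still shows the fault."""
    for _ in range(attempts):
        if poll(element) == "up":
            return EventState.DISCARDED
    return EventState.VERIFIED_VALID


# Hypothetical rule table mapping a device kind to a suggested remedial action.
REMEDIATION_RULES = {"switch": "reset interface", "server": "restart application"}


def build_clean_signal(element: str, kind: str, services: set,
                       poll: Callable[[str], str]) -> Optional[dict]:
    """Combine root cause, affected services and remediation into one event."""
    if verify(element, poll) is not EventState.VERIFIED_VALID:
        return None
    return {
        "root_cause": element,
        "services_affected": sorted(services),
        "remediation": REMEDIATION_RULES.get(kind, "open ticket"),
    }


# Stub pollers standing in for real device polling.
still_down = lambda element: "down"
recovered = lambda element: "up"

signal = build_clean_signal("switch-1", "switch", {"mail", "web"}, still_down)
print(signal)
print(build_clean_signal("switch-1", "switch", {"mail", "web"}, recovered))
```

An event whose fault has already cleared yields no signal at all, which is the mechanism by which verification reduces remediation churn in this sketch.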

Case Management, Visualization and Notification Modules

In addition to best-in-class performance in problem identification and remediation, the Optanix Platform provides Case Management, Visualization and Notification Modules, resulting in a highly integrated and comprehensive solution.

Case Management

Case Management in the Optanix Platform is built around the core principles of simplicity, ease of use and flexibility. This approach gives rapid access to information with minimal wait time and latency. The functionality includes configurable process workflows, notifications and escalations, and extensive collaboration capabilities to provide Incident, Problem and Change Management. The platform provides the ability to build customized workflows to suit the needs of internal and external customers. The built-in workflows are based on best practices and enforce ITIL compliance.

The platform can automatically open incident records based on validated events, or IT staff can open records manually, providing a consistent way of managing all incidents, no matter what the source. The platform's Change Management module creates a structured environment for changes and dramatically lowers the risk of service interruptions. Using its workflow engine, the platform automatically routes change requests through the end-to-end change management process, including evaluation, approvals, implementation and verification. Throughout this process, it tracks the status of each change request, providing complete visibility for stakeholders as changes are implemented.

Visualization

The Optanix Platform comes with a wide range of role-based reports and dashboards, providing complete visibility of operational and service-level history. Real-time displays give immediate information about the status of business services and IT infrastructure, how issues are impacting business processes, and what actions are in progress to resolve them. The platform includes detailed reports and information that support teams can use to investigate issues, take informed action and restore services. All views are context-specific to the device or entity selected. The platform displays only relevant information, based on the device or entity type, so that noise is kept to a minimum.

Notification

The Optanix Platform provides tiered escalation of notifications. This ensures that the proper support person is informed. Notification and escalation rulesets can be predefined to ensure that no alert goes unacknowledged. In addition, Level 1 responses can be highly automated.

Business Intelligence

The Optanix Platform serves as the repository for the history of network events and the complete lifecycle of issues, including business impact, assigned engineers, remediation tasks and so forth. This makes it possible to mine the data for insights into trends and chronic issues.

Synopsis

As the Optanix Platform manages business services, IT infrastructure and support processes, it builds and updates a unified management database that contains comprehensive information about the data center and network infrastructure and their operational histories. This information powers the platform's reports and dashboards and allows the early identification of trends that have the potential to disrupt the business in the future. Analyzing this information continuously improves the IT service delivery infrastructure and management processes, enabling key capabilities such as problem management and vendor scorecarding.
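
The tiered escalation described under Notification can be illustrated with a short sketch. The tier names and acknowledgement timeouts below are invented for the example, and `who_to_notify` is a hypothetical helper, not a platform API.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """Hypothetical escalation tier: who is notified and how long they have to acknowledge."""
    name: str
    ack_timeout_min: int


def who_to_notify(tiers: list, minutes_unacknowledged: int) -> list:
    """Return every tier that should have been notified after the given wait.

    The first tier is notified immediately; each further tier is notified
    once the timeouts of all tiers before it have elapsed without an acknowledgement.
    """
    notified = []
    threshold = 0
    for tier in tiers:
        if minutes_unacknowledged >= threshold:
            notified.append(tier.name)
        threshold += tier.ack_timeout_min
    return notified


tiers = [
    Tier("Level 1 on-call", 15),
    Tier("Level 2 engineer", 30),
    Tier("Service manager", 60),
]

print(who_to_notify(tiers, 0))
print(who_to_notify(tiers, 20))
print(who_to_notify(tiers, 45))
```

Because the rule set is data rather than code, it can be predefined per customer or per service, which matches the idea that no alert should go unacknowledged.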

Using the Optanix Platform's integrated Incident, Problem and Change Management, support teams dramatically increase their productivity, accountability and compliance, typically reducing MTTR by 50%. Case management, reports, dashboards and other visual tools can be accessed securely from anywhere using the Optanix Mobile App, keeping support teams and management in the loop 24/7.

Automated Workflows

The power of the Optanix Platform lies in its ability to generate a Clean Signal that points to the root cause, the business impact and remedial information. Automated Workflows are key to generating the Clean Signal.

Automated Workflows are powered by Advanced Logic Profiles (ALPs). ALPs provide intelligent rulesets and troubleshooting processes that enable best-practice methods for managing specific types of IT services and technologies. The Optanix Platform comes with an extensive library of ALPs, based on many years of combined expertise and experience of Optanix support staff and engineers. This library continues to evolve, ensuring that the platform stays abreast of the latest applications and IT technologies.

ALPs are modeled within the Optanix Platform's Decision Engine. The Decision Engine is the primary mechanism for executing the logic defined by the workflows, using state machines. The state machines also interact with each other to accomplish more complex diagnostic analysis as needed. When determining root cause, these workflows are used to troubleshoot and collect additional information from affected network devices. Additional tests are conducted not only to determine root cause, but also to verify and validate the true nature of the underlying issue. ALP logic works through different failure scenarios until it identifies the exact reason for the service or infrastructure issue. This drives down support costs and shortens MTTR by automating time-consuming investigative work.
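
The idea of working through failure scenarios until one matches can be sketched as an ordered list of diagnostic checks. The scenario names, the device fields and the `diagnose` function are all hypothetical, chosen only to show the control flow; real ALPs are described as state machines with far richer logic.

```python
from typing import Callable, Optional

# A check takes the facts collected from a device and reports whether the scenario fits.
Check = Callable[[dict], bool]


def diagnose(device: dict, scenarios: list) -> Optional[str]:
    """Try each failure scenario in order and return the first one that matches."""
    for name, matches in scenarios:
        if matches(device):
            return name
    return None  # no known scenario fits; hand over to a support engineer


# Hypothetical scenarios, ordered from most to least severe.
SCENARIOS = [
    ("power failure",
     lambda d: not d["reachable"] and not d["neighbor_sees_link"]),
    ("management plane unresponsive",
     lambda d: not d["reachable"] and d["neighbor_sees_link"]),
    ("high CPU",
     lambda d: d["reachable"] and d["cpu_pct"] > 90),
]

facts = {"reachable": False, "neighbor_sees_link": True, "cpu_pct": 0}
print(diagnose(facts, SCENARIOS))
```

Keeping the scenarios in a list makes the order of investigation explicit and lets new scenarios be added without touching the loop.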
The Optanix Platform can knit together network devices and elements to define business services and entities. For example, it can determine that mail services are down rather than simply report which specific devices are down. This entity modeling is embedded in the Optanix Platform to highlight the true nature of the business services impacted by the outage.

Additional workflows are then used to determine the correct set of actions to remediate the problem. Using its powerful orchestration capabilities, the platform can perform a wide range of remedial actions, such as resetting servers, restarting applications and reconfiguring devices. Support staff can trigger remediation actions in response to issues, or the Optanix Platform can trigger them automatically. This provides zero-touch remediation of common infrastructure and service issues. It dramatically shortens MTTR, avoids IT staff having to manually access IT devices, and ensures that remediation actions are performed promptly, consistently and accurately.
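
The entity modeling described above, reporting that mail services are down rather than listing devices, can be sketched as a mapping from business services to the devices they rely on. The service model and device names are made up, and the rule "a service is impacted if any member device is down" is a simplifying assumption that ignores redundancy.

```python
def impacted_services(service_model: dict, down_devices: set) -> dict:
    """Map each impacted business service to the down devices that affect it."""
    impacted = {}
    for service, members in service_model.items():
        affected = members & down_devices
        if affected:
            impacted[service] = affected
    return impacted


# Hypothetical service model: which devices each business service depends on.
service_model = {
    "mail": {"mail-1", "mail-db-1"},
    "web": {"web-1", "lb-1"},
}

print(impacted_services(service_model, {"mail-db-1"}))
```

With this model, a single failed database host is reported as an impact on the mail service, which is the view a business stakeholder needs.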

Glossary

Optanix Product Definitions

Case Management Module: Provides the ability to manage an issue through the entire remediation lifecycle, including ticket creation, resource assignment, status tracking and escalation. Generates notification requests in order to provide alerts.

Clean Signal: Essential information separated from the noise associated with raw data. Pinpoints the root cause, business impact and remediation information in one single event.

Correlation Engine: Performs dependency analysis on status change events to gain an understanding of the relationships between events. Uses a variety of time, topology and event-based techniques.

Decision Engine: A state-machine-based intelligent module that processes root cause events and then generates verified valid events.

Ingestion Engine: The data acquisition layer that interacts directly with network elements in order to learn of issues, events and alarms.

Notification Module: Processes status requests from other modules to generate notifications to users and other systems.

Passive vs. Active: The passive model is based on symptomatic analysis, leading to a reactive approach to IT operations management. The active model is based on Clean Signal root cause analysis that delivers predictable outcomes and a known infrastructure state based on a proactive approach.

Predictable IT Service Delivery: The IT infrastructure behaving in a way that is expected. Facilitates governance, economics, asset management and business success.

Root Cause Events: The initial cause of either a condition or a chain of events that leads to an outcome or effect of interest. The output of the Correlation Engine, which processes status change events in order to determine the core issue causing the status changes.

Snapshot: A mirror image of the network state at a specific point in time.

Status Change Events: The output of the Snapshot Processor that represents changes in the network state.

Snapshot Processor: Compares consecutive snapshots in order to determine the change in the network state.

Visualization Module: Provides the service management, reports and dashboards to configure, monitor and understand the state and health of the underlying infrastructure.

Verified Valid Events: The output of the Decision Engine, which takes root cause events and validates them ahead of remediation.

About Optanix

Optanix is leading the advancement of IT service predictability in today's hyper-connected digital economy, where reliable service delivery has never been more vital. Hundreds of customers rely on the Optanix Platform and IT Management-as-a-Service (ITMaaS) to enable unbeatable service availability that leads to positive business outcomes. Optanix solutions are delivered through industry-leading channel partners who rely on Optanix's extensive IT automation experience.

About the Optanix Platform

The Optanix Platform is the only comprehensive, integrated solution designed to handle all aspects of managing IT environments. It uses patented automation and correlation processes to pinpoint the actionable root cause and business impact of service issues while reducing event noise by a factor of more than 1,000,000 to one. By automatically routing only true root cause incidents to the appropriate support team in just seconds, the platform reduces incident remediation times by 50% and helps support teams respond proactively 95% of the time.

[i] Managing IT From the End User Perspective in 2006, Forrester Research, Inc., February 2, 2007.
[ii] Ibid.

Optanix, One Penn Plaza, Suite 3310, New York, NY 10119
+1-844-303-4011
info@optanix.com
www.optanix.com

Copyright 2017 Optanix. All rights reserved. Optanix information is protected by U.S. and international copyright and intellectual property laws. All marks are property of their respective owners. v12-17