Insights. Telco Cloud


Table of Contents
1. Providing an innovative platform for digital service transformation
2. Service Management for the Telco Cloud
3. Developing a blueprint for zero touch, end-to-end service orchestration across hybrid and multiple networks
4. Digital OSS for the Digital World
5. Simplify Operations for the Telco Cloud
6. Assuring the Telco Cloud
7. Intelligence for the Telco Cloud
MYCOM OSI Page 2 of 28

1. Providing an innovative platform for digital service transformation

Written by Mounir Ladki, President and CTO, MYCOM OSI

According to Gartner, CSPs will derive up to 20% of their revenue from digital services and adjacent market opportunities by end of [...]. Through the use of integrated assurance and analytics, CSPs can now better monetize their network and its services, especially as they offer the connectivity of their networks to industry verticals for IoT. With intelligent analytics based on machine learning, operators can monetize their investments in big data. Indications from market analysts such as Gartner are that revenue associated with IoT solutions will grow to USD 32 billion by [...]. Large-scale big data analytics is also becoming the cornerstone of CSP internal and external monetization strategies. Intelligent integration and advanced analytics can ingest raw data and inject smart data into big data lakes and big data hubs. These smart insights drive both operational efficiencies and data monetization.

CSP environments can be complex, custom and challenging. As cloud-native technologies are imported into telecom ecosystems, CSPs need practical and measured transformation steps to consolidate their existing investments and migrate proactively, with low risk and within comfortable timescales.

Current CSP trends: towards building fully automated systems

CSPs fundamentally want to create new revenue streams by offering digital services. To do that, they need to transform their network and operations to the telco cloud so as to deliver the required agility, scalability and cost benefits. This entails virtualizing large parts of their infrastructure, automating their operations, implementing architectures like NFV (network function virtualization) and SDN (software defined networking), deploying on cloud infrastructures and using agile cloud operations and management processes. It also means that operations are transforming from the largely manual and reactive operations of today to real-time, fully automated systems that use intelligent predictive analytics across integrated network, service and customer data to anticipate problems and identify opportunities.

However, several significant issues need to be carefully considered by CSPs as they plan their digital transformation and telco cloud deployments.

First, this transformation will not happen as a 'big bang' but will be executed as a step-by-step journey. Over the next ten years, networks will continue to be hybrid: partly physical and partly virtualized. For this reason, it is important to have a single assurance solution that correlates and manages both the virtualized/telco cloud part of the network and the traditional physical part.

Second, while virtualized networks might include their own management and orchestration capabilities, only a higher-level assurance system has visibility of the entire hybrid network and can take network/service quality decisions using network-wide correlations, policies and closed-loop automation.

Third, keeping track of the configuration and topology of the hybrid network will be very challenging due to its real-time and dynamic nature. Advanced hybrid network and service models need to be continuously fed by automated real-time discovery of all the changes in the network.

Finally, although CSPs will deploy big data and machine-learning/AI capabilities to drive predictive analytics, deep telecom expertise is still required to seed the systems with the optimal algorithms and network characterizations for end-to-end predictions and for sending commands to the virtualized and physical orchestrators.

OSS-based automation

Automation of operations centers is key to achieving the benefits of the telco cloud. CTOs of major operators today have a vision of a 'zero touch operation center', in which the entire process of monitoring, analysis and remediation is conducted automatically, without human intervention. Automation also plays an important role in reducing capex. With a combination of automation and analytics, CSPs can be guided to make optimal investments in order to get the highest return on investment. Specific capabilities can be used that automatically identify the areas where CSPs need to invest in capacity, based on subscriber consumption, type of subscribers and the return potential of subscribers. Such optimized investments, productivity gains and agile operational benefits improve the overall profitability of the CSPs.

Facing digital transformation

Although the end state of digital transformation and a telco cloud is compelling, operators are rightly concerned about how to get there whilst leveraging their existing investment in technologies, processes and people. Implementing new architectures and technologies that require new investments and skills is also a mammoth task, and it is critical to have a clear step-by-step transformation process that leverages the CSP's existing assets and ensures further investment is future-proofed.

5G and IoT on the horizon

As CSPs extend their networks for connected industry verticals, the low latency and high data speeds of 5G enable them to deliver enhanced mobile broadband, ultra-reliable low-latency and massive machine-type communication services. With 5G, I expect telecom to become embedded in industry verticals and their industrial processes. CSPs then have the opportunity to deliver intelligent analytics and connectivity services to the industry verticals. A key concept of 5G is network slicing, where certain SLA (service level agreement) targets like latency or bandwidth are guaranteed for specific applications such as connected cars, smart energy or connected factories. Closed-loop assurance systems are required to assure and orchestrate the 5G slice in real time, as the CSPs are expected to maintain very stringent SLAs. IoT management involves ensuring IoT data reliability and assuring that the IoT network and devices are operating in a faultless manner, thus delivering on the promise of 100% availability and reliability.

An assurance platform that offers centralized intelligence and is highly automated to support the on-demand needs of 5G slices will lead to new revenue streams for the CSPs. By offering industry-specific analytics, such an assurance platform can enable the CSP to become a managed service provider who can guarantee the end-to-end availability, quality and analytics for an IoT player.

Take a smart energy service, for example, where a cloud platform collects data from smart meters and then automatically sets the thermostats and various parameters at the consumer's house to reduce the energy bill. For this service to work, all the devices in the home, such as sensors and smart meters, need to be working, and all of these are connected to a central box which also needs to be working. Then the connectivity between the central box and the cloud needs to work, as does the cloud platform.
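The smart energy example describes a serial chain: the service is only up when every link is up. A minimal sketch of what that implies for end-to-end availability follows. The component names and availability figures are hypothetical, chosen only for illustration, and the calculation assumes the components fail independently.

```python
from math import prod


def end_to_end_availability(component_availability):
    """Availability of a serial chain in which every component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(component_availability.values())


# Hypothetical figures for the smart energy chain (not from the article).
chain = {
    "home_devices": 0.999,     # sensors and smart meters
    "central_box": 0.999,
    "connectivity": 0.9995,    # central box to cloud
    "cloud_platform": 0.9999,
}

availability = end_to_end_availability(chain)
print(f"end-to-end availability: {availability:.4%}")
```

Under these assumptions the end-to-end figure is always lower than that of the weakest link, which is why the assurance platform has to watch the whole chain and not each element in isolation.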

Through a suitable assurance platform, the CSP can ensure the integrity, availability and quality of end-to-end IoT connectivity. They can also become the first line of support for the IoT services and share part of the revenues with the IoT players.

Published in Telecom Review on 12th June, [...]. This is an edited version of the original article.

2. Service Management for the Telco Cloud

Written by Sandeep Raina, Product Marketing Director, MYCOM OSI

There is a resurgence in the use of Service Quality Management (SQM) for digital service operations. The reasons for SQM's growing importance are the introduction of the Internet of Things (IoT), which gained significant momentum in 2016, and the anticipated rollout of NFV-based services in [...]. Both developments require the current functionality of an SQM system to be extended to cover the high speed and scale demands of a digital service environment.

Both IoT and NFV, although essentially massive network transformations, will have a tremendous impact on service transformation. While IoT technology is being introduced to ensure that manufacturing, cars, homes, cities and devices become more efficient and reliable, NFV enables customers to consume faster, on-demand and dynamically personalized/contextualized services such as IPTV, video streaming, mobile gaming and rich messaging. As VoLTE and ViLTE become the immediate technology levers for the launch of digital services, stringent SLAs will be formed to offer OTT-like services and benefits to customers. And the verticals that offer IoT services will need far more support to maintain highly reliable, mission-critical connections between the IoT devices.

SQM can help CSPs address the challenges of a new NFV and IoT service environment. The following key changes in the network are shaping the re-definition of SQM to make it suitable for the digital environment:

1. Virtualization and SQM: The rising importance of SQM for NFV can be attributed to:
- The higher agility in creation, delivery, alteration and retiring of services: This inevitably means that managing and maintaining QoS will need to be equally responsive and agile. The iterative deployment and tearing down of services requires Service Quality Management systems to monitor short-life services, lasting from a few days to a few hours, driven by events, location, customer context, etc.
- Dynamic network resources: The dynamic adjustments to network elements, e.g. capacity scale-up and scale-down, topology re-configuration and traffic route optimization, have an immediate impact on the offered services. SQM needs to respond and align to these network changes

- Hybridity of networks: In hybrid (physical and virtualized) networks, digital services will be delivered over both parts. An SQM system's capabilities need to extend across all network types for unbiased, vendor-independent reporting

2. Internet of Things and SQM: With IoT introduced to communication networks, service providers have the option of becoming IoT service providers, managed IoT service providers or simply bearers of the IoT traffic. In each case, the monitoring and assurance of IoT services poses a key risk to the new business of the CSP, since the quality criteria of IoT services can be much higher than those of traditional communication services. In addition, because of the wide variety of users (energy, health, robotics, manufacturing, automotive, etc.), Service Quality Management will need to introduce new dimensions to address the specifics of each vertical. SQM will hence be re-defined for IoT as follows:
- Assigning high importance to service reliability and service availability as key service KPIs
- Ensuring proactive maintenance in a high-scale operational environment
- Faster service impact analysis to prevent network bottlenecks
- A mechanism (through automation) for fast reaction to potential service failures
- Visualization and prediction (through analytics) of service usage and geographic distribution by consumers and devices, in order to support creation of new IoT services

Role of automation and analytics in managing NFV/IoT networks

In NFV and IoT environments, Service Quality Management needs to be more proactive, predictive and capable of offering rapid root cause analysis (RCA). Although traditional SQM already provided RCA when service degradation happened and, in many cases, offered a service impact or what-if analysis with it, the need to enhance these capabilities has increased significantly. Part of this requirement can be met by adding analytics to the SQM information, which provides more accurate failure prediction and a deeper assessment of service impact. Additionally, automation across the SQM outputs helps in managing configurations. Also, by automating root cause analysis, the parent alarm can be quickly identified. Using service modeling and auto-discovery, the relationship with underlying network elements can be quickly ascertained and the fault eliminated, reducing MTTR (mean time to repair).
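One way to picture automated identification of the parent alarm is a walk over the service model: an alarmed element is treated as a root cause candidate only if nothing it depends on is also alarmed. The sketch below is an illustrative assumption, not a description of any particular SQM product. The topology, element names and the `find_parent_alarms` helper are all hypothetical.

```python
def find_parent_alarms(depends_on, alarmed):
    """Return alarmed elements that have no alarmed upstream dependency.

    depends_on maps each element to the elements it relies on.
    """
    alarmed = set(alarmed)

    def has_alarmed_upstream(node, seen):
        # Depth-first walk over everything this node depends on.
        for parent in depends_on.get(node, ()):
            if parent in seen:
                continue
            seen.add(parent)
            if parent in alarmed or has_alarmed_upstream(parent, seen):
                return True
        return False

    return sorted(n for n in alarmed if not has_alarmed_upstream(n, {n}))


# Hypothetical service model: a VoLTE service over two VNFs sharing one host.
topology = {
    "volte_service": ["vIMS", "vEPC"],
    "vIMS": ["compute_host_1"],
    "vEPC": ["compute_host_1", "router_A"],
}
alarms = ["volte_service", "vIMS", "compute_host_1"]

print(find_parent_alarms(topology, alarms))
```

Here the service and VNF alarms are treated as symptoms of the host alarm, so an operator (or an automated workflow) would start remediation at the host.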

However, an integrated approach of analytics, automation and SQM requires some drastic changes in the way service data is visualized and actioned in the Operation Center today. The introduction of NFV, with network functions and services hosted on common resources, inherently helps to achieve this integration to an extent. Use of open REST APIs also helps in connecting the OSS layers. Finally, hosting of OSS functionalities (analytics, automation and SQM) in the cloud can also accelerate the integration of the required functionalities of the Operation Center.

Underlying technologies

As discussed, for a next generation Digital Service Provider/Telco Cloud Service Provider, virtualization of network functions enables the dynamic creation and deployment of new services, with the time to do so reduced from a few months to a few days. New Telco Cloud service assurance systems are evolving, of which SQM forms a key component. The architectures of these next generation systems are based on REST APIs, Big Data cluster and OpenStack capabilities.

Beyond introducing new technologies to the underlying platform, it is important to develop a micro-services architecture, which uses DevOps-enabled iterative processes to respond quickly to customer service needs by developing services faster. This is how the customer expectation of using new personalized/contextualized services every week or every few days will be realized. It also helps in conducting root cause analysis accurately and resolving customer issues quickly. The SQM system should also integrate well with the Lifecycle Service Orchestration ecosystem to offer closed-loop assurance; this involves integrated dynamic inventory, service catalogue-driven modeling and policy-driven service orchestration.

In summary, for the successful launch and long-term assurance of services in a hybrid (physical and virtualized) network, to which there will be an added layer of IoT services, only a re-defined Service Quality Management system (dynamic, predictive and capable of offering rapid RCA) will assure the expected digital service revenue.

Published in Light Reading on 2nd May, [...]. This is an edited version of the original article. Also available on [...]

3. Developing a blueprint for zero touch, end-to-end service orchestration across hybrid and multiple networks

Written by Mounir Ladki, President and CTO, MYCOM OSI

The telecom industry is witnessing the start of a profound transformation process that will allow Communication Service Providers (CSPs) to evolve into Digital Service Providers. The promise is to utilize new cloud, software, IoT and big data technologies to provide intelligent connectivity and value-added digital services to consumers, enterprises and various industry segments. This includes the connectivity and management of smart factories (Industry 4.0), smart connected vehicles (autonomous driving), smart grid networks or SD-WAN networks for enterprises. It also includes digital services to consumers, such as connected homes, virtual-reality gaming and rich high-definition multi-media entertainment catalogues.

The promise of revenue growth through monetization of such services will be made possible by new business and pricing models developed jointly with various industrial partners and by SLAs delivered to enterprise customers. The recent reversal of Net Neutrality rules also opens the door for new consumer service monetization policies such as tiered pricing and QoS differentiation.

However, CSPs need to execute the transformation that gives them the ability to configure, deliver, manage and assure on-demand digital services in real time. This means delivering network capability as a service, assuring very stringent SLAs, managing an extremely high-availability and high-quality network and adjusting in real time to consumer and enterprise demands. The transformation will accelerate in the next 2 to 3 years with the migration to Telco Cloud networks, adoption of NFV/SDN, implementation of big data lakes and analytics and integration with the IoT ecosystem.
A key factor in the success of the transformation and the monetization will be the efficiency of the real-time management, orchestration and assurance architecture put in place. In particular, Next Generation Service Assurance will play a critical and central role in enabling this monetization opportunity, as it will be the backbone for keeping the complex end-to-end hybrid networks at peak performance, assuring the availability, reliability and quality of on-demand digital services and meeting the very stringent SLAs required by industry verticals and enterprises.

This Next Generation Service Assurance capability will act as the automated governance layer on top of the various NFV orchestrators, SDN controllers and physical/legacy network managers. It will enable a smooth migration from physical networks into the Telco Cloud. Some of its key characteristics are:
- Seamless and instant navigation from customer/service to network layers through the integration of Performance Management, Fault Management and Service Quality Management into a common Service Assurance CMDB
- Service Assurance integration with Service Lifecycle Orchestration for dynamic event-based integration with Inventory to support dynamic configuration changes. This includes integration (using the TOSCA modeling language) with the Service Catalogue to enable DevOps catalogue-driven service management
- Interworking of Service Assurance with a variety of proprietary and open source orchestration ecosystems (OPNFV, ECOMP, ONAP etc.) to allow future-proof evolution. A flexible and modular (micro-services) approach and the use of open standards and APIs allow the system to be agnostic to the Open Source ecosystems or commercial off-the-shelf products used for orchestration
- Adoption of Open APIs, in particular the TMF APIs, to reduce integration and evolution costs. Open APIs expose raw and smart data around alerts, time series (KPIs/KQIs), problems, troubles, changes, analytics and policies
- New architecture ensuring automated real-time operations from the cloud. This requires full cloudification or orchestrability of Service Assurance in the cloud, enabling DevOps operations
- Advanced automation to deliver a true closed-loop Service Assurance Orchestration capability that enables real-time operations and massive Opex gains, leading to Zero Touch Operation, Orchestration and Maintenance (ZOOM)

Simplification and automation of Operation Center processes is essential to achieving success in the Telco Cloud environment. Service Assurance systems that offer automation techniques, especially for NFV orchestration, are critical for detecting service policy violations, managing configurations, automating root cause analysis and remediation, and reducing MTTR. Through integration of analytics and automation, data can be visualized and actioned in the Operations Center in a predictive manner, pre-empting problems. The combination of analytics and automation can help in identifying untapped capacity in the networks. Large Opex savings can be achieved by automating the process of reducing or re-organising capacity bottlenecks in the network.

A few use cases are suggested below for the Next Generation Service Assurance system:
- Closed-loop QoS-driven orchestration in hybrid networks: Using specific performance and fault data from both PNFs and VNFs, QoS policies are triggered, which in turn trigger an automated workflow that sends notifications towards the CSP Orchestrator system
- Prediction of SLA breaches: Machine learning, integrated with analytics based on performance/fault data, offers powerful predictive management capability in protecting customer SLAs
- Service impact analysis: Service impact modeling mapped to underlying network resources results in visibility for specific service chains (examples are VoLTE and Mobile Broadband)
- Automating outage recoveries: Network outage faults are solved much faster by publishing NFV/SDN-generated alert data into the CSP's Telco Cloud Kafka cluster and then automating the outage recoveries
- Assurance of 5G network slicing for IoT services: By using analytics to forecast problem patterns, prevent IoT network, service and device failures in uRLL (ultra-reliable and low latency) and mMTC (massive machine-type) 5G networks

A simplified zero-touch Operation Center provides many benefits. However, it does require integration and evolution of the existing OSS components.
A step-by-step evolution of the existing ecosystem of a typical CSP will include the following:
- Rollout of a cloud-based Next Generation Service Assurance data platform with a converged Service Assurance CMDB
- Migration to Next Generation Unified Performance Management and Unified Fault Management pointing into the converged Service Assurance CMDB
- Rollout of Digital Service Quality Management and integration with the Telco Cloud ecosystem, including dynamic closed-loop integration with Service Lifecycle orchestration
- Introduction of advanced analytics and Machine Learning to enable predictive operations
- Integration with big/smart data lakes
- Integration with an IoT management platform for monetization of IoT managed services
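The closed-loop QoS-driven orchestration use case listed earlier in this section can be pictured as a small policy loop: KPI samples arrive from physical and virtual functions, policies are checked against them, and each violation becomes a notification for the orchestrator. The following is a minimal sketch under stated assumptions. The `QosPolicy` type, the KPI names, the thresholds and the action names are hypothetical and do not reflect any specific orchestrator's API.

```python
from dataclasses import dataclass


@dataclass
class QosPolicy:
    kpi: str          # KPI name the policy watches (hypothetical)
    threshold: float  # the policy fires when the KPI exceeds this value
    action: str       # hypothetical action name sent to the orchestrator


def evaluate(policies, samples):
    """Return one notification for each (sample, policy) pair in violation."""
    notifications = []
    for sample in samples:
        for policy in policies:
            value = sample["kpis"].get(policy.kpi)
            if value is not None and value > policy.threshold:
                notifications.append({
                    "element": sample["element"],
                    "kind": sample["kind"],  # "PNF" or "VNF"
                    "kpi": policy.kpi,
                    "value": value,
                    "action": policy.action,
                })
    return notifications


policies = [
    QosPolicy(kpi="packet_loss_pct", threshold=1.0, action="reroute_traffic"),
    QosPolicy(kpi="cpu_load_pct", threshold=85.0, action="scale_out"),
]
samples = [
    {"element": "router_A", "kind": "PNF", "kpis": {"packet_loss_pct": 2.5}},
    {"element": "vEPC_1", "kind": "VNF",
     "kpis": {"cpu_load_pct": 91.0, "packet_loss_pct": 0.2}},
    {"element": "vIMS_1", "kind": "VNF", "kpis": {"cpu_load_pct": 40.0}},
]

notifications = evaluate(policies, samples)
for note in notifications:
    print(note["element"], note["kind"], note["action"])
```

In a real deployment the notifications would be delivered to the orchestrator over its own interface, and the loop would close when fresh KPI samples confirm that the action restored the service. Both of those steps are outside this sketch.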

Currently, MYCOM OSI is developing such a blueprint for the Next Generation Service Assurance system for a Tier 1 digital service operator, to enable its Service Management Center to monitor digital services including VoD, VoLTE and ViLTE, and to extend this to NFV and 5G. The Next Generation Operation Center will include proactive surveillance with automated service impact and root cause analyses, with dynamic service orchestration for physical, logical and virtual infrastructures.

Published in TM Forum Inform on 11th April, [...]. This is an edited version of the original article. Also available on [...]

4. Digital OSS for the Digital World

Written by Sandeep Raina, Product Marketing Director, MYCOM OSI

Digital businesses and digital networks are all about high-speed communication, dynamic network orchestration and on-demand services. Without an overhaul of current Operation Support Systems (OSS), service providers will be up against big operational challenges in realizing their digital business objectives.

Digital networks are being laid out in such a way that Telco Cloud service providers can offer competitive, advanced digital services to their customers over predominantly cloud-based (virtualized) networks, managing high volumes of data at higher speeds over reliable connections. These Telco Cloud networks will be a multi-layer composite of NFV/SDN, IoT and 5G networks.

Challenges of a digital ecosystem

New management methods have to be introduced to deal with the digital ecosystem, which will require building up the current OSS capabilities, especially for reliability and real-time orchestration. Moreover, far more interoperation between the OSS layers is needed than ever before. Operation Centers will need to shed their reactive approach and become proactive, focusing on early problem resolution. Not only does this do away with laborious operational processes, it also speeds up root cause analyses and corrective actions.

To make services agile and their management dynamic, the digital OSS should efficiently integrate network topology, inventory and assurance processes. And since customers will demand dynamic SLAs, the OSS will have to deal with on-demand capacity configuration and dynamic topology changes too. In addition, to keep pace with the service agility of an NFV environment, the OSS requires a deeper understanding of customers/devices/apps and their behavior.

With IoT already happening around us, management of its network and devices will need special attention.
Most IoT devices will serve time-critical or life-critical applications over IoT networks, serving industries like healthcare, automotive, energy and smart cities. Because of low human interaction, the feedback, reliability and availability of the IoT networks are expected to be high. The SLAs that the IoT service provider signs with its customers (industry verticals) will demand high QoS and reliability. For all of this, OSS systems are also expected to crunch billions of disparate IoT events and data streams handled by the new network elements (sensors, controllers, IoT gateways, etc.). This requires the OSS systems to significantly increase their scalability and reliability.

During the disruptive transitions to NFV and IoT, which will last over the next 3-5 years, network elements will steadily virtualize. Balancing and monitoring capacity across the physical and virtualized parts of the hybrid network will be a critical activity. The digital OSS system will need to deal with the varying composition of the hybrid network while planning and optimizing capex.

The above challenges clearly indicate that the OSS is no longer a simple monitoring, reporting and troubleshooting system. It is an intelligent system that must provide predictive trends of the network, services, customers and devices. It must quickly evolve to feed an automated zero-touch Operation Center and introduce intelligent techniques for listening and reacting to the digital networks and supporting hyper-converged services.

Key OSS actions

To make digital business successful, OSS will need to quickly evolve and start supporting the expected dynamicity, speed and scale of digital networks. The necessary steps in raising the capabilities of OSS are:

1. Consolidate: The digital OSS stack needs to be far more tightly joined and highly efficient, using common information models. To make services agile and their management dynamic, OSS need to employ open APIs for integrating the OSS layers and for integrating with the ecosystem (e.g. Big Data and BSS components)

2. Predict: The digital OSS system will have in-built analytics to support proactivity and prediction of network/service/customer/device/apps problems. Using machine learning, data will be processed and patterns identified to reduce the number of customer-impacting problems and predict impending problems with a higher degree of accuracy

3. Automate: To realize a fully automated, zero-touch NOC/SOC in the future, closed-loop corrective actions will be introduced using complex algorithms and machine learning. And since customers will demand dynamic SLAs, the OSS will support on-demand capacity configuration and dynamic topology changes. This will be supported by real-time network feedback and automatic configuration

4. Strengthen: With the introduction of new OSS technologies and functionalities in incremental steps, the present-day OSS can be readied for the immediate future. This includes the introduction of big data clustering techniques, service-oriented (Microservices) architecture, cloudification, real-time analytics and closed-loop automation using machine learning

OSS will play an instrumental role in supporting the required dynamicity, speed and scale of the digital services as the digital network plays out as a multi-layer composite of NFV/SDN, IoT and 5G. With the above transformation, the digital OSS will equip the Telco Cloud service provider for the next 10 years and help realize digital business benefits faster.

Published in RCR Wireless News on 27th March, [...]. This is an edited version of the original article. Also available on [...]

5. Simplify Operations for the Telco Cloud

Written by Sandeep Raina, Product Marketing Director, MYCOM OSI

As CSPs undergo digital transformation (which means offering digital services through a Telco Cloud environment), they face a whole set of operational challenges, including operating a new virtualized network, focusing on customer experience more than before and providing high reliability and availability for upcoming IoT services. Simplifying operations can help in tackling these challenges.

The complexity of operations in a Telco Cloud is due to the following:

1. The high speed at which digital services will be deployed: Digital services require real-time dynamic deployment, adaptation and customization. Automation of many Operations Center processes, including monitoring, orchestration, feedback, audit and messaging, is needed to support this

2. Running a hybrid network: part virtualized, part physical: Since the process of virtualization will take 3-4 years to stabilize, extra vigilance is required as new nodes/VNFs are added/removed. Seamless operations will require systems that quickly adapt to the network changes

3. Dynamic services need constant and consistent management: Policy-based management is required for constant and rapid management, leading to automated, simplified configuration in a virtualized environment

4. Additional attention to IoT operations: IoT traffic is expected to run on highly reliable and error-free networks, which drives expectations or objectives for the IoT network, service and devices to have minimum failures. Every new piece of equipment, software and device will bring its own failure points and requires stronger fault management to reduce the number of faults

5. Impact on life-critical or mission-critical communication: In a hyper-connected world, failed devices or connections might not only breach SLAs with massive penalties but, more importantly, impact lives. Although complex mesh topologies with high availability and redundancy will serve to minimize failures, they still require a highly efficient system to discover, interpret and manage the faults

6. Operation Centers need to be more proactive and predictive: This comes from the need to minimize performance degradations, prevent failures and eliminate critical customer-impacting problems

Integrating and consolidating OSS components is the first step towards simplifying Telco Cloud operations. Automation, including machine learning, is the next.

Integration and consolidation: The introduction of NFV, with network functions and services hosted on common resources, inherently helps to achieve the required integration to an extent. Open REST APIs also help in connecting the OSS layers. Finally, hosting OSS functionalities (analytics, automation and SQM) in the cloud can accelerate the integration of the functionalities required by the Operation Center. Introducing topology-based root cause analysis integrates services with the underlying network, closing the remediation loop.

Automation: Automating the Operation Center means encapsulating the best practices for standard operating procedures and using machine learning to derive or improve them. This frees up resources by automating and orchestrating complex processes across multiple domains and functions. Not only does it reduce human error and increase employee productivity, but it also greatly simplifies complex operations involving a large number of processes. The simplification benefits can be reaped by various functions, including planning, optimization and business teams. The highest level of automation would lead to the desired zero-touch Operation Center.

Building a zero-touch Operation Center for the Telco Cloud will require the following key steps:
Automating critical OSS actions
Exploiting machine learning for efficiency
Self-healing and optimization by feedback loop

Here are some suggested use cases for the simplified (integrated and automated) zero-touch Operations Center:
1. QoS-driven orchestration in hybrid networks: Using integrated performance and fault data on network/services, QoS policies can be derived and operated to orchestrate both physical and virtualized (hybrid) networks. This requires an integrated SQM/automation/orchestration system.
2. Management of end-to-end IoT: Managing IoT traffic by using analytics to forecast patterns and prevent IoT network, service and device failures. This includes building dashboards for service availability, incident and unavailability breakdown by location and geolocation-based service impact.
3. Prediction of SLA breaches: Machine learning, when integrated with analytics based on performance/fault data, offers powerful predictive management capability to anticipate problems and helps in protecting customer SLAs.

4. Service impact analysis and root cause analysis: With SQM integrated with fault data, faster service impact visualization is possible for the Telco Cloud. Also, by automating root cause analysis, problems can be quickly identified to reduce mean time to repair.
5. Automating outage recoveries: By automating fault management, network outage recoveries can be accelerated. Additionally, by integrating fault management with the OSS ecosystem (Trouble-Ticket, Inventory, Orchestrators, SQM, CRM, Work Force Management, etc.), problems are reported and solved much faster.

A simplified zero-touch Operations Center provides many benefits. However, it does require drastic changes in the way OSS components integrate and interact with each other and in how network/service data is visualized and actioned in the Operation Center. Introducing analytics, machine learning, messaging bots, automated RCA and orchestration will simplify the operational complexities of the hyper-converged network and its services.

Published in Vanilla Plus on 28th February. This is an edited version of the original article. Also available on

6. Assuring the Telco Cloud

Written by Sandeep Raina, Product Marketing Director, MYCOM OSI

As CSPs undergo digital transformation - which means running their business out of a cloud, selling digital services and operating like web-scale internet companies - assuring the Telco Cloud business will take high priority. With networks virtualizing, services digitalizing and IoT looming large, the business risks are much higher than envisaged. Assurance of the Telco Cloud network and services will be key in assuring the new Telco Cloud business.

The Telco Cloud is defined as a virtualized telecom infrastructure to run digital services and agile operations. Accuracy, speed and error-free operations of the Telco Cloud are critical to the success of a digital business. Clever solutions, which derive and offer customer intelligence in addition to service assurance, will play a critical role in the success of the Telco Cloud business.

Certain OSS concepts will need revisiting to make them relevant to the Telco Cloud unknowns. These are Service Quality Management, creating a faultless Telco Cloud and enabling the digital service provider to be an intelligent platform.

Service Quality Management for the Telco Cloud

Service Quality Management (SQM) is not a new concept; however, it will be an important one in the coming years. Current SQM focuses on proactive monitoring of customer-facing services, which have not always required reliable, secure, fast and always-available networks. However, with the anticipated increasing rollout of Telco Cloud services in 2017, the current functionality of an SQM system will be stretched to cover the higher speed and scale of a digital services environment.

NFV, the underlying technology of the Telco Cloud, has partly evolved as a consequence of consumers' growing appetite for faster, on-demand and reliable services. Some of the most popular digital services in the new networks will be video streaming, telemetry, mobile gaming and home automation. In addition, NFV is associated with demanding SLAs between the Service Provider and its customers. With VoLTE, ViLTE and other advanced communication services launched as digital services, high levels of corporate SLAs will be required to compete with the slick services offered by the OTT providers. The SLA situation worsens with IoT, where inter-communicating sensored devices become the new customers and

may make high demands on reliability and availability if they are mission-critical connections, such as autonomous cars or remote surgery.

The importance of SQM in the Telco Cloud can be attributed to the following key reasons:
Since virtualization/Telco Cloud promises higher agility in creation, delivery, alteration and retiring of services, it gives rise to a proportional need for agility in managing and maintaining QoS. The iterative and continuous deployment and tearing down of services requires SQM systems to monitor high-revenue, short-life services which might last only from a few hours to a few days.
The virtualization of network functions introduces elasticity and dynamicity of network resources. The dynamic adjustments to network element capacities (scale-up and scale-down, topology configuration, redirecting of traffic routes, etc.) have an immediate impact on the services offered. The SQM needs to respond to such changes much faster now.
In the inevitable hybrid - physical and virtualized - networks that must exist, digital services will be delivered over both parts. Therefore, the Service Quality Management system should act as an overarching Manager of Managers for uniform, unbiased reporting and orchestration.

In its digital avatar, SQM helps CSPs to address the new service challenges posed by the Telco Cloud.

A Faultless Telco Cloud

The cloud-based digital services are expected to run on highly reliable and error-free networks. Digital services require real-time dynamic adaptation and customization of the communication network, which drives the expectation that network/service/device failures be reduced to a minimum. Moreover, in an IoT-connected world, failed devices or connections might not only breach SLAs with massive penalties but, more importantly, might impact life-critical or mission-critical communication.

Although complex mesh topologies with high availability and inbuilt redundancy will reduce the impact of such failures, they still require a system to discover, interpret and manage the faults. With the network disruption induced by NFV and IoT, every new piece of equipment, software and device will bring its own failure points. For this, the traditional network/device fault management will need to be raised to the next level.
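
As an illustration of what "discovering and interpreting" a fault can involve, the sketch below walks a dependency topology to list the services affected by a failed resource. It is a minimal sketch, not any vendor's implementation: the resource names, the DEPENDENTS map and the "svc-" naming convention are hypothetical, and redundancy (which would soften the impact in a real mesh) is deliberately ignored.

```python
from collections import deque

# Hypothetical topology: each resource maps to the resources that depend on it.
DEPENDENTS = {
    "server-1": ["vnf-firewall", "vnf-router"],
    "vnf-firewall": ["svc-video"],
    "vnf-router": ["svc-video", "svc-iot"],
    "server-2": ["vnf-router-standby"],
    "vnf-router-standby": [],
}


def impacted(failed, dependents):
    """Return every resource reachable downstream of the failed one (breadth-first)."""
    seen = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def impacted_services(failed, dependents, prefix="svc-"):
    """Keep only the customer-facing services (assumed to be named 'svc-...')."""
    return sorted(n for n in impacted(failed, dependents) if n.startswith(prefix))


if __name__ == "__main__":
    print(impacted_services("server-1", DEPENDENTS))
```

Reading the same map in the opposite direction (from a degraded service back towards shared resources) is one simple way to shortlist root cause candidates.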

Beyond the technology shift already mentioned (NFV and IoT), the revamping of fault management is necessitated by the demand for higher speed of service delivery and problem resolution. Monitoring and assessing the impact of failures on the new network elements and user devices is critical, especially when services are time-critical and, in many cases, life-critical too. This justifies the evolution of current NOC/SOCs to a zero-touch Operations Center, where extensive automation will speed up the reporting, fault-finding and remediation.

By feeding fault data to Service Quality Management systems, CSPs can instantaneously understand the impact of faults on services and, with the use of predictive algorithms, prevent faults from occurring. Many use cases can be served through a highly automated, predictive fault management system:
Telco Cloud orchestration: This uses fault data in the SQM system, which highlights policy violations, followed by automation/orchestration across physical as well as virtualized networks.
Predicting IoT failures: This requires managing IoT traffic by using analytics on top of fault data to forecast patterns and prevent IoT network/service/device failures. This includes building dashboards for service availability, incident/unavailability breakdown by region/location and also geolocation-based service impact.
Protecting SLAs: When integrated with fault data, machine learning offers powerful predictive capability to anticipate problems and helps CSPs in protecting their customer SLAs.
Service impact: With SQM based on faults, faster service impact visualization is possible for the hybrid, NFV and IoT networks.
Zero-touch Operations Center: Automating network outage recoveries and device configuration, and integrating fault management with the OSS ecosystem (Trouble-Ticket, Inventory, Orchestrators, SQM, CRM, Work Force Management, etc.), will lead to an automated, zero-touch Operations Center.

Automation of Operations Center processes is key to achieving success in the virtualized and digitalized Telco Cloud environment. CSPs are working towards realizing a fully automated, zero-touch Operations Center using closed-loop corrective actions, complex algorithms and machine learning. And to support the dynamic SLAs of the Telco Cloud, the OSS is expected to support on-demand capacity configuration and dynamic topology changes, which can happen only through automated real-time network feedback and automatic configurations.
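
The closed-loop idea described above (measure, decide, reconfigure, measure again) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the thresholds, the instance limits and the function names decide_action and closed_loop are hypothetical, load is expressed in units of one instance's capacity, and a real system would act through an orchestrator rather than a local counter.

```python
def decide_action(utilization, scale_up_at=0.8, scale_down_at=0.3):
    """Map a utilization reading (0.0 to 1.0+) to a corrective action."""
    if utilization >= scale_up_at:
        return "scale_up"
    if utilization <= scale_down_at:
        return "scale_down"
    return "hold"


def closed_loop(load_samples, instances=2, min_instances=1, max_instances=8):
    """Replay load samples and apply the corrective action after each reading.

    Returns a list of (action, instances_after_action) tuples.
    """
    history = []
    for load in load_samples:
        utilization = load / instances  # feedback from the (simulated) network
        action = decide_action(utilization)
        if action == "scale_up" and instances < max_instances:
            instances += 1
        elif action == "scale_down" and instances > min_instances:
            instances -= 1
        history.append((action, instances))
    return history


if __name__ == "__main__":
    for step in closed_loop([1.0, 1.8, 2.5, 0.6]):
        print(step)
```

Keeping the decision rule separate from the loop makes it straightforward to replace the fixed thresholds with a learned or predictive policy later.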

Analytics to evolve to an Intelligent Platform

CSPs are ready to shake off the label of being the Dumb Pipe through the use of sophisticated analytics on the massive and valuable data traversing their networks. As digital service providers, they are looking at monetizing customer behavioral data as well as connectivity, as they aggressively launch new digital services to challenge the growing popularity of OTT services.

Analytics deliver trends on performance, capacity and faults using machine-learning tools. But beyond these operational benefits, they provide critical intelligence which can be used for network monetization and service personalization, by understanding the usage of the Telco Cloud, the services it offers, its customers and devices. As an example, CSPs can proactively identify low-congestion zones/locations (Free Zones) and rapidly fill spare capacity with revenue-generating traffic from new service offers such as video streaming, mobile TV or smartphone apps, contextualized by location, time and customer need. In addition, with CSPs extending their business to become IoT service providers, machine-learning based analytics will be widely used to process Big Data and generate critical business intelligence for each of the IoT industry verticals.

Underlying Technologies to make Telco Cloud Management successful

The next generation virtualization/Telco Cloud promises the creation and deployment of new services in shorter time periods, down from a few months to a few days. To respond to this need, CTIOs are now developing new architectures for service assurance, of which SQM, automation and analytics are key components. The architectures are based on open APIs, Big Data clustering and OpenStack capabilities. Beyond introducing these new technologies to the underlying platforms, it is important to develop a micro-services architecture, which uses DevOps-enabled iterative processes to respond quickly to customer needs by developing services faster. This is how the customer expectation of using new features every week or every few days will be realized. It also helps in conducting root cause analysis faster and resolving customer issues quickly.

An integrated approach of analytics, automation and SQM requires some drastic changes in the way data is churned, visualized and actioned. For a successful launch of the Telco Cloud, long-term assurance of digital services and the creation of business value out of data, it is critical to redefine the features of Service Quality Management, zero-touch predictive Operation Centers and analytics for data monetization.

Published in Light Reading on 20th February. This is an edited version of the original article. Also available on

7. Intelligence for the Telco Cloud

Written by Sandeep Raina, Product Marketing Director, MYCOM OSI

The Telco Cloud, defined as virtualized infrastructure to run digital services and agile operations, has not only paved the way for digitalization of CSP businesses, but also opened up new avenues for data monetization. Through 2017, CSPs will continue to make investments in analytics to drive focused investments in network and services, manage customer experience better and personalize services.

To support this, data monetization solutions that extract customer intelligence from the Telco Cloud and show how it can be monetized will be critical and much in demand. The solutions will include clever manipulation and reporting of data for innovative services and increased customer satisfaction. Analytics based on machine learning and artificial intelligence will introduce novel ways of data exploitation, aimed at unprecedented customer experience.

The next generation of OSS systems needs to inherently provide analytics that are proactive, predictive, descriptive, diagnostic and prescriptive for both network and business functions. They benefit all teams of the service provider: Operations, Planning, Marketing, Customer Care, Data Scientist, CTO, CIO, CMO and Chief Digital Officer (CDO).

Analytics can be broadly classified into the following 2 categories:
1. Connectivity analytics for Telco Cloud network and IoT network
2. Intelligence analytics for Telco Cloud customers and IoT verticals

Connectivity analytics

Analytics with automation capabilities can solve many investment needs and network-related problems. To optimize infrastructure expansion, CSPs need to utilize analytics that add business value to decisions for planning new sites and capacity upgrades. Analytics help in the identification of revenue-generating locations, capabilities of handsets, customer behavior, uplink and downlink video traffic, consumption of video/conversational services, etc. Using analytics (real-time and non-real-time), CSPs can trigger processes for proactive or prescriptive actions to solve capacity issues. Analytics can also help in failure prediction and assessment of service impact, followed by automated root-cause analysis.
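
To make the idea of a proactive capacity action concrete, the sketch below fits a straight-line trend to recent utilization samples and estimates how many periods remain before a threshold is crossed. It is a minimal sketch under simple assumptions: samples are evenly spaced, the trend is linear, and the function names and the sample values are hypothetical. Production analytics would typically use richer models and seasonality handling.

```python
import math


def fit_line(values):
    """Ordinary least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_breach(values, threshold):
    """Periods after the last sample until the fitted trend reaches the threshold.

    Returns 0 if the trend is already at or above the threshold,
    and None if the trend is flat or falling.
    """
    slope, intercept = fit_line(values)
    last_x = len(values) - 1
    if slope * last_x + intercept >= threshold:
        return 0
    if slope <= 0:
        return None
    return math.ceil((threshold - intercept) / slope - last_x)


if __name__ == "__main__":
    utilization = [50, 55, 60, 65, 70]  # hypothetical weekly cell utilization, percent
    print(periods_until_breach(utilization, threshold=85))
```

A result of a few periods could, for example, trigger a capacity upgrade workflow before customers notice any degradation.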

With explosive IoT growth around the corner, service providers need to create intelligence out of the IoT traffic that will flow through their Telco Cloud networks. With billions of dollars predicted for the IoT industry, intelligence on IoT usage and preferences will be critical, gained by using analytics to forecast patterns and prevent IoT network/service/device failures. This includes building dashboards for service availability, incident/unavailability breakdown by region/location and also geolocation-based service impact.

Some of the top use cases for connectivity analytics are:
Predictive troubleshooting: Care and Operations
Prediction of faults: Operations
Proactive messaging to customers: Care

Connectivity analytics offer deep analysis of standard KPIs and identify patterns, root causes and opportunities or risks related to network behavior. Using machine learning, users can explore large volumes of history to find patterns of behavior and create correlations, trouble tickets or remediation tasks. Automating workflows for repetitive problems in the network can reduce the number of alarms to action, ultimately leading to reduced operational costs and Mean Time to Repair (MTTR).

Intelligence analytics

Telco Cloud service providers consider data and analytics their top investment priorities to improve customer experience and to contextualize and personalize services for customers. Monetizing data helps in directing marketing campaigns for maximum business impact, designing new digital services and providing information to external advertisers and agencies. Also, analytics are increasingly seen as important for understanding the experience, needs and behavior of the IoT industry verticals.

For all of the above-mentioned reasons, marketing/sales analytics will be treated by the Telco Cloud provider as shareable assets, which will need to be assured, managed and sold as services. As the digital Telco Cloud business expands into newer areas, analytics sharing will be the norm; however, there will be complexities in the associated generation and distribution. One way to reduce the complexity is to split the intelligence analytics into the following buckets:
Service intelligence: Machine learning, when integrated with service data, offers powerful predictive service quality management capability to anticipate problems and helps service providers in protecting their SLAs. This is truer for

NFV and IoT environments, where services need to be managed proactively and improved dynamically.
Customer intelligence: This includes advanced customer data discovery and profiling to predict customer behavior, such as churn and the customer's next action. It also includes creating personalized offers and campaigns targeted for maximum results, segmenting customers by value and creating/optimizing tariff plans.
Device intelligence: These analytics help in tracking usage of popular devices and apps as well as in the reduction or removal of undesirable devices or content. IoT device analytics are used to market IoT services for IoT providers of energy, smart cities, connected home/vehicles, manufacturing and asset tracking. These include IoT-user and IoT-device behavior, trends and predictions.

In summary, analytics not only make the Telco Cloud provider an intelligent platform, but also support new revenue and facilitate automated actions for remediation. With connectivity and intelligence analytics, the Telco Cloud provider can fuel the growth of digital services as well as IoT services. With the use of machine learning in analytics, many operational problems will be solved automatically and customer experience will be improved through targeted messages, campaigns and services tuned to the customer need.

Published in Vanilla Plus on 10th February. This is an edited version of the original article. Also available on