White Paper

Success with Big Data Analytics
Competencies and Capabilities for the Journey
Jason Danielson, Solutions Marketing Manager, NetApp
June 2016 | WP-7233

Abstract

Big data analytics creates sizable value for companies in all industries. Value is created through a better customer focus, increased operational excellence, or entirely new businesses. However, many companies underestimate the cost, complexity, and competencies required to get to that point, and many fail along the journey. Smart companies reduce the cost, complexity, and competency gap by relying on NetApp and NetApp partners for their big data infrastructure. Because enterprise storage building blocks and the associated service capabilities provide maturity and cost-effectiveness at the data management level, across all big data uses and technologies, companies are freed to focus on developing business-facing capabilities, which requires mastering competencies at the application and data science levels. The big data analytics technologies discussed in this white paper include Splunk, NoSQL databases, Hadoop, Solr, and Spark.

TABLE OF CONTENTS

1 Situation
2 Challenges
3 Playbook for Success with Big Data Analytics
  3.1 Strategy
  3.2 Design and Deployment
  3.3 Data Management and Governance
  3.4 Operations
  3.5 Security and Compliance
  3.6 Program Management
  3.7 Partners
4 Conclusion
Glossary

LIST OF FIGURES

Figure 1) Competencies required for success with big data analytics

1 Situation

1.1 The Promise of Big Data Analytics

Big data analytics is the process of examining datasets that are characterized by a greater volume, velocity, or variety of data types than those found in traditional business intelligence and data warehouse environments, with the purpose of uncovering hidden patterns, unknown correlations, market trends, customer preferences, and other useful business information. These analytical findings can lead to more effective marketing, new revenue opportunities, better customer service, improved operational efficiency, competitive advantages over rival organizations, and other business benefits.

Companies across all industries increasingly view data as a critical production factor similar to talent and capital. They realize that capturing and blending more data sources than ever before, across many different domains, creates economic value. For instance, financial services companies collect, price, and disburse capital across their various lines of business, from granting credit to providing insurance to making capital markets work. The volatility and disruption the industry has experienced over the last few years have spurred banks and insurers to unlock the value inherent in the data their businesses generate. They are looking for real-time, actionable insights that help them better understand customers, price risks, and spot fraud. This means gathering, analyzing, and storing millions of transactions, interactions, observations, and third-party data points per minute. Existing systems such as relational databases and enterprise data warehouses are high performing but often not suited to this volume, velocity, and variety of data. It is no wonder that financial services companies have been among the early adopters of big data, embracing technologies and solutions such as NoSQL databases, Hadoop, Spark, and Splunk. (For definitions of these technologies, see the glossary at the end of this white paper.) They are leveraging the power of various big data technologies to transform the customer experience and improve profitability.

2 Challenges

Anecdotal evidence suggests that 50% of enterprises that embrace big data struggle to create business value, as evidenced by many abandoned, stranded Hadoop efforts. Only 10% become truly successful, having developed and mastered the many competencies required, after a long, arduous journey.

2.1 Why Is Success with Big Data So Hard to Achieve?

In our own experience, enterprises struggle with several aspects of scaling their use of big data:

Big data analytics is treated as a technology project, not as a cross-functional transformation of the business. Enterprises typically start with a small implementation to solve a specific, immediate problem in a single department or line of business. Small environments tend to grow organically as more people become aware of the availability and value of the solution. In their quest for agility, IT staff often fail to consider what it would take to make such a solution available for broader use across the enterprise. For instance, Hadoop's ability to interpret data on read is initially seen as liberating and encourages dumping any and all types of data into this Hadoop data lake, akin to a long-term parking lot. However, if metadata (that is, data describing the nature and provenance of the raw data) is not defined from the beginning, the data lake quickly becomes a data swamp of which it is difficult to make sense.
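The schema-on-read freedom described above can be made concrete with a small example. The following sketch is purely illustrative and is not taken from NetApp material: it assumes a hypothetical PySpark environment, and the paths, column names, and the "pos-feed" source label are invented. It contrasts simply inferring structure at read time with declaring a schema and recording basic provenance metadata so that the lake stays navigable.

```python
# Purely illustrative PySpark sketch; paths, column names, and the "pos-feed"
# source label are hypothetical, not taken from this paper.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("schema-on-read-example").getOrCreate()

# Schema-on-read: convenient on day one, but every consumer re-guesses the structure.
raw = spark.read.json("/datalake/raw/transactions/2016/06/")
raw.printSchema()  # inferred; may drift from file to file and team to team

# Declaring the expected schema and capturing provenance as extra columns keeps
# the same files usable (and auditable) by other teams later.
txn_schema = StructType([
    StructField("account_id", StringType(), False),
    StructField("amount", DoubleType(), True),
    StructField("event_time", TimestampType(), True),
])

curated = (spark.read.schema(txn_schema)
           .json("/datalake/raw/transactions/2016/06/")
           .withColumn("source_system", F.lit("pos-feed"))    # provenance
           .withColumn("ingest_file", F.input_file_name())    # lineage
           .withColumn("ingest_date", F.current_date()))

# Write to a curated zone, partitioned so the data stays easy to find and prune.
(curated.write.mode("append")
        .partitionBy("ingest_date")
        .parquet("/datalake/curated/transactions/"))
```

The point of the sketch is not the specific tool: whatever the ingestion mechanism, capturing the source, lineage, and expected structure at write time is what keeps the lake from becoming the swamp described above.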
Similarly, many companies strive to assemble a customer golden record to provide a 360-degree view of the customer relationship, which becomes the foundation for highly targeted segmentation and personalized calls to action. However, that works only if the different data-producing departments (for example, sales, production, service, and marketing) collaborate, and if the handoffs to the data-consuming processes are defined (for example, segmentation for the campaign management system or the next-best activity that a call center agent can suggest). As siloed line-of-business big data deployments start to grow, centralized IT organizations often get asked to assume ownership of the big data infrastructure to address the scaling challenges.

The technology vendor ecosystem is fragmented and rapidly evolving. There is a perception that the big data market is characterized by a high pace of innovation delivered by many best-in-class

software companies. Although NoSQL and Hadoop deliver compelling capabilities, many enterprises underestimate the complexity of getting these technologies to work smoothly in business-critical environments. The consolidation that has happened in more mature areas of technology (for example, IBM, Oracle, Microsoft, and SAP) has yet to happen in the big data space, which is characterized by more than a dozen NoSQL databases, several Hadoop distributions, and rapid advancement in newer Apache projects such as Spark and Storm.

New technologies and architectures also call for new skills. There is a lot to learn across the entire lifecycle of big data initiatives and technologies. Many enterprises lack a strong digital leader who is able to align business needs with technology capabilities. The architectures, technologies, and vendors selected need to align with those evolving business needs. New skills need to be developed, often through external hires, across a range of roles, from architecture to infrastructure engineering and operations to data science and application development. Moreover, when analyzing data with high volume, velocity, and variety, great skill is required to assess the veracity (that is, the quality and trustworthiness) of this data.

3 Playbook for Success with Big Data Analytics

Enterprises across the world are increasingly using the agile approach to software development that was perfected in Silicon Valley to design innovative big data systems and build data-driven solutions. In our view, enterprises that truly succeed are agile and deliver business results quickly without compromising on the future. The apparent conflict between agility and scalability to meet enterprise-wide needs is resolved by those enterprises that take a realistic look at the competencies required for success with big data. Those enterprises realize that the journey toward business value can be accelerated and derisked by relying on partners that can handle the inherent complexity. Those partners bring critical competencies in areas such as big data strategy, design and deployment, data management and governance, operations, security and compliance, and program management.

Figure 1 illustrates the seven areas of competency that are essential for success with big data.

Figure 1) Competencies required for success with big data analytics.
[Figure: Strategy | Design & Deployment | Data Mgt. & Governance | Operations | Security | Partners | Program Mgt.]

The following sections cover the best practices that NetApp has developed and that ensure successful business outcomes for our customers.

3.1 Strategy

To succeed at this stage, NetApp customers address the following areas:
- Business strategy and alignment
- Use case definition and prioritization

3.1.1 Business Strategy and Alignment

Those companies that succeed with big data in a big way are those that have built a strong case for change. Those companies realize that data is becoming a critical production factor similar to talent and capital. They also realize that integrating big data analytics capabilities into the many aspects of the business is a multiyear journey that cannot start soon enough. Often, it is about a completely new architecture with new applications, platforms, tools, and capabilities, augmenting many organizations' current best-in-class setup. This results in better insights into potentially all aspects of a company's business, generated across a range of valuable use cases targeting revenue growth, cost savings, or better customer outcomes. However, some use cases require changes to processes, systems, and metrics in order for insights to become fully actionable.

Big idea: NetApp helps your organization create business alignment, an important prerequisite to success. Several thousand petabytes of data worldwide are entrusted to NetApp, and we are experts at proactively advising large organizations on how to manage their data better. Superior availability, performance, and ease of deployment, for instance, directly affect business value. In addition, NetApp and our partners bring a proven approach to extracting business value from that data across the enterprise.

3.1.2 Use Case Definition and Prioritization

Deciding which use cases to pursue, and in what order, is not just a matter of business impact. Organizations need to make sure that each use case presents a manageable execution challenge. In other words, the ease or difficulty of implementation needs to be aligned with the skills present at a given point in time. That ease of implementation is determined by factors such as business ownership, availability of data sources, technology complexity, compliance complexity, skills, and organizational complexity.

Big idea: NetApp accelerates time to value and reduces execution risk. Our partners bring a proven approach to onboarding successive waves of users, use cases, and analytic applications, which allows you to scale more quickly. Jointly, we provide project leadership, best practices, and robust plans to minimize surprises. NetApp partners can fill roles as needed, accelerating your learning curve, and transfer knowledge on the job until internal resources can perform those roles.

3.2 Design and Deployment

To succeed at this stage, NetApp customers address the following areas:
- Target architecture and solution design
- Infrastructure agility and resilience

3.2.1 Target Architecture and Solution Design

Traditionally, technology adoption has happened in silos. That siloed approach to implementation was characterized by domain-specific solutions and proprietary islands with vendor lock-in. Siloed point solutions meant that the data a company has is not usable outside the silo. Raw data is collected for a single

purpose and discarded too soon. Therefore, management has an incomplete picture of what is going on. There is no single point of truth.

Smart enterprises adopt more of a layered approach to technology, driven by an enterprise-wide unifying target architecture wherever possible. The main architectural principle of moving data once but sharing and processing it many times mandates shared storage building blocks. The physical instantiation of these building blocks can vary, serving data that is hot, warm, cold, or frozen, on the premises or in the cloud. What matters is an integrated approach to managing, operating, and securing these building blocks. Such an approach brings many benefits:
- Modular, modern architecture that supports the broadest range of applications and analyses
- Freedom to choose the best tool or processing engine for the job
- All data, across all time periods, joined and correlated across domains
- Shared data that is consumable for different use cases, often building on each other
- Multiple lenses on the same data, with team-specific views
- Ability to serve new use cases quickly and affordably
- Fast learning curve, making it easier to train, retain, and develop staff

Big idea: NetApp offers an enterprise architecture with validated storage building blocks stretching across new deployments as well as in-place analytics on existing data, which guarantees lower total cost of ownership (TCO) and risk than commodity servers with internal drives. The NetApp approach brings:
- A mature solution architecture that includes validated designs, technical reports, and complete runbooks, which shortens time to value, increases deployment stability, and reduces consulting expenses
- The ability to handle both unstructured and structured data with the portfolio of products
- Reduced operational complexity, including speed of provisioning capacity and users
- Consistent enforcement of data security, privacy, governance, and compliance
- A dramatic reduction in the power, space, and skills required
- Accelerated testing and development of big data solutions by enabling seamless data movement between on-premises and public cloud environments

3.2.2 Infrastructure Resilience and Agility

Unfortunately, standard server-based deployments that use internal disks provide less resilience and performance than might appear on day one of a deployment. Performance under failure conditions can cripple day-to-day operations. Network costs and complexity increase because of replication and failure redistribution models. Managing disk failures and replacements is decentralized and prone to errors. As these big data deployments grow, storage management becomes complex.

Moreover, commodity internal drives are less agile than required. Agility is about accommodating future growth and making the big data infrastructure consumable for different use cases. This requires processing power and storage capacity to scale independently of each other to address evolving business needs, which is not possible with commodity internal drives. Capital efficiency is also low because the load balancing and scale that a shared service provides are more difficult to achieve.

Big idea: NetApp provides enterprise readiness across all aspects of resilience and agility, spanning operations, governance, integration, and security. NetApp accelerates time to productivity of your big data infrastructure while allowing you to meet ever-changing business demands.
Specifically:
- No longer does a single disk failure cripple a node and immediately affect overall infrastructure performance. Recovery from disk failures is dramatically improved due to the implementation of dynamic disk pools (DDP), which harness the performance of a large pool of drives to intelligently

rebuild the lost data. Performance is only negligibly affected, and for an order of magnitude less time than with internal storage. For instance, in a recent test, a single drive failure and rebuild process on one of the internal drives in a Couchbase cluster server had a significant impact on the cluster's ability to process requests from clients: the operations-per-second rate dropped by over 90%. However, with the NetApp EF560 and DDP, the impact was limited, and approximately 15 minutes after the initial disk failure, normal service was restored. [2]
- File system storage and compute are decoupled and can scale independently, subject to workload requirements. This also eliminates the need for rebalancing or migration when new data nodes are added, thereby making the data lifecycle nondisruptive.
- NetApp storage also increases performance for many big data workloads. For instance, in a recent benchmark, Splunk on NetApp achieved search performance that was 69% better than Splunk on equivalent commodity servers with internal disks. [3] NetApp provided optimized performance and capacity buckets for Splunk's hot, warm, cold, and frozen data tiers. Moreover, because data is externally protected, additional performance and efficiency gains can be realized by reducing the amount of data replication, lightening the load on compute and network resources and reducing the amount of storage required just for data protection.
- The ability to do in-place analytics on existing NAS data using NetApp technologies can help save the infrastructure cost and time of setting up a duplicate storage silo for analytics and can provide faster time to insights. It also eliminates data movement.

[2] Detailed report available; search on our webpage for the corresponding TR number.
[3] Performance data taken from NetApp E-Series for Splunk Enterprise, 2015, Function1.

3.3 Data Management and Governance

Smart enterprises deploy out-of-the-box processes that instantiate best practices and therefore close the skills gap and reduce the potential for human error in the following areas:
- Data ingestion
- Metadata management
- Collaboration
- Data lifecycle management

3.3.1 Data Ingestion

Smart enterprises have a well-honed process, called data ingestion, for bringing new data sources into the data lake at both the batch and real-time layers. They have also established processes and tools to assess and improve data quality, sometimes referred to as veracity.

Big idea: NetApp partner solutions, such as Zaloni's Bedrock Data Management Platform, provide managed data ingestion: simplified onboarding of new datasets, managed so that IT knows where data comes from and where it lands. Bedrock also allows automated orchestration of workflows to apply to new data as it flows into the lake, which is key for reusability, sourcing, and curation of different data types.

3.3.2 Metadata Management

Early big data use cases were often built around higher volumes of structured data. As companies move into unstructured data use cases, making sure of data quality becomes very difficult without good metadata management. Metadata management cannot be an afterthought as more and more data sources feed into the shared storage environment. Smart enterprises keep track of what data is in the big data

platform: its source, its format, its lineage, and its quality. Data visibility and understanding are unified across traditional business intelligence and true big data environments such as Hadoop. Defining and capturing metadata allows ease of searching and browsing. Proper metadata management is the foundation of data quality. As data lakes grow in depth and importance to the business, the quality of the metadata becomes essential to making sure that the data poured into these lakes can be found, used, and exploited for years to come.

Big idea: NetApp partner solutions can assist. Zaloni's Bedrock provides unified data management across the entire data pipeline, from ingestion through to self-service data preparation. It provides file- and record-level watermarking so that you can see data lineage, movement, and usage. This ensures that consumers can search, browse, and find the data they need, reducing the time to insight for new analytics projects.

3.3.3 Collaboration

Collaboration at its core is about coordination between data owners, data professionals (for example, administrators, developers, and data scientists), and data consumers. The more business-critical the use cases, the more important that collaboration becomes. Smart organizations have found ways to address misaligned funding and incentives. For instance, department A is able to tap into the wealth of data in the shared data lake only if department A also makes its own data available to other departments, avoiding the free-rider problem. Moreover, a code of conduct might state that department A needs to provide advance notice regarding changes in the availability, quality, or format of its data, because department B may use A's data source for powering real-time recommendations at the point of sale. Some large companies have created homegrown internal social networks that break down these communication and incentive barriers.

Big idea: NetApp partner solutions such as Zaloni's Bedrock embody best practices to foster collaboration and coordination. Specifically, Bedrock provides workflow and enrichment. Workflow covers tasks such as masking, lineage, data format conversion, change data capture, and notifications. Enrichment allows data professionals to orchestrate and manage the data preparation process.

3.3.4 Data Lifecycle Management

As enterprises adopt cloud-based platforms for some big data workloads, the gap between on-premises and cloud-based data management and governance has widened. Enterprises struggle in their desire to have a Hadoop data lake platform that integrates enterprise security and access policies with the performance and reliability offered by familiar enterprise storage platforms.

Big idea: NetApp gives you the confidence that your analytics are always running on the right data, with the right quality, with mature and proven data governance and associated workflows. Specifically, NetApp provides advanced data lifecycle management, allowing for automated tiering of storage and migration from hot to warm to cold storage. In addition, data can be tiered and replicated to NetApp Private Storage (NPS), which enables analytics or, for the most critical use cases, backup and disaster recovery in the cloud. Moreover, the architectural principle of moving data once but sharing and processing it many times holds true even for traditional enterprise data, because the NetApp NFS Connector for Hadoop allows in-place analytics on data sitting in NFS-addressable storage arrays.
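To make the in-place analytics idea in the preceding paragraph concrete, here is a minimal, hypothetical sketch. It does not use the NetApp NFS Connector for Hadoop itself (which plugs in at the Hadoop file system layer); instead it assumes an NFS export that is already mounted at /mnt/nas_export on every Spark worker, and the file layout and column names are invented. It is only meant to show how data can be analyzed where it already lives rather than being copied into a separate silo.

```python
# Illustrative sketch only. It assumes an NFS export already mounted at
# /mnt/nas_export on every Spark worker and reads it in place via file://.
# (The NetApp NFS Connector for Hadoop integrates at the Hadoop FileSystem
# layer instead; this mount-based approach is just a stand-in for the idea.)
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("in-place-nfs-analytics").getOrCreate()

# No copy into HDFS and no second storage silo: read the files where they sit.
logs = (spark.read
        .option("header", "true")
        .csv("file:///mnt/nas_export/web_logs/2016/"))

# A simple aggregation over the in-place data: server errors per day.
daily_errors = (logs.filter(F.col("status").cast("int") >= 500)
                    .groupBy("log_date")
                    .count()
                    .orderBy("log_date"))

daily_errors.show(10)
```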
3.4 Operations

To succeed at this stage, NetApp customers address the following areas:
- Manageability
- Efficiency and performance

3.4.1 Manageability

Administering a large Hadoop cluster can be more complicated than many realize. There is complexity associated with manually recovering from drive failures in a Hadoop cluster with internal drives.

Big idea: NetApp provides SANtricity Storage Manager, often described as the easiest-to-use and most intuitive interface in the industry. It features a combination of wizards and automation for common processes along with detailed dials for storage experts, and it provides a centralized management GUI for monitoring and managing drive failures. The SANtricity operating system is also performance optimized, yet still offers a complete set of data management features such as Snapshot copies and mirroring. This makes it easy to meet service-level agreements with predictable performance. These capabilities are complemented by OnCommand Insight for health checks and the widely acclaimed NetApp AutoSupport.

3.4.2 Efficiency and Performance

Manageability was a capability that came to the Hadoop framework relatively late (Ambari and so on). That is a particular concern in commodity hardware clusters, because drives fail much more frequently than in enterprise-grade systems. Although this might have been a lesser concern in small clusters for batch-oriented analytical workloads, it becomes a much larger concern as Hadoop powers insights that drive quasi-real-time operational decisions and as clusters come to incorporate disparate hardware pools (for example, servers of different generations).

Big idea: NetApp provides better efficiency, requiring less hardware, which also translates into savings in power and software licenses. NetApp provides better performance per dollar and per rack compared to internal disks. Rather than creating three copies for sustained data availability, NetApp is able to maintain data availability with just two copies because of DDP. Hardware-based RAID and a lower Hadoop replication count reduce network overhead, thus increasing the aggregate performance of the cluster. In spite of the lower replication count, NetApp achieves greater reliability, versus 99.9% for commodity internal drives. As a result, throughput typically increases by 33% (that is, 33% less network traffic), and storage consumption decreases by 33%. In addition, licensing costs for Hadoop and associated components are reduced because of the lower node count. The storage tiering that NetApp provides adds further efficiency.

Moreover, NetApp provides better resilience and reliability than internal drives during both healthy and unhealthy modes of operation. Data nodes can be swapped out without downtime. Crucially, the cluster can recover from a single node failure without disruption to running Hadoop jobs. NetApp E-Series and EF-Series with hardware RAID provide transparent recovery from hard drive failures. The data node is not blacklisted, and any job tasks that were running continue uninterrupted.

3.5 Security and Compliance

To succeed at this stage, NetApp customers address the following areas:
- Data security
- Data privacy and compliance

3.5.1 Data Security

For early, isolated big data pilots and proofs of concept, an illusion of security exists because physical access to the data is largely restricted to a few friendly, trusted users. Data is largely owned by each department, and service levels are best effort, typically for batch use cases. The new possibilities opened up by the data lake imply a steep learning curve for everyone involved in security. Internal security signoffs are often complicated.
Best practice security needs to consider administration, authorization, audit, access, authentication, and encryption.

Big idea: NetApp contributes to security by providing hardware-accelerated encryption. The benefit is a performance impact of less than 1 percent, compared to several percentage points with competing solutions.

3.5.2 Data Privacy and Compliance

Common techniques for addressing data privacy and compliance issues include role-based access control, data masking, and tokenization. Any policies need to strike the appropriate balance between data privacy and business value, for instance, when governing fine-grained, role-based access control on shared data in a data lake accessed by potentially hundreds of internal or thousands of external users.

Big idea: NetApp partners help you manage the privacy and compliance of your big datasets. Although a full discussion of the data privacy and compliance regimes in different industries and jurisdictions is beyond the scope of this paper, a common requirement that is supported by the NetApp partner ecosystem is the ability to delineate all information associated with specific entities during specific time frames, for instance, in the context of freedom-of-information requests or litigation/legal discovery requests.

3.6 Program Management

Those companies that truly succeed run big data initiatives as end-to-end, cross-functional transformations, not as technology projects. Program management plays a pivotal role in that transformation.

Big idea: NetApp Global Services provides seasoned program management capability to ensure successful outcomes.

3.7 Partners

In the rapidly evolving big data space, no single company can provide everything, and NetApp is no exception.

Big idea: What NetApp does provide is a comprehensive partner ecosystem across Hadoop, NoSQL, and analytic applications such as Splunk that collectively solves the big data analytics needs of the most demanding enterprises. The NetApp partner ecosystem is available on Solution Connection:

4 Conclusion

In summary, NetApp helps you achieve success with big data analytics initiatives, no matter whether your role is on the business side as a business owner and consumer of big data insights or on the technical side as a developer or data professional. NetApp and partners help you create maximum business value with short time to market, because NetApp's portfolio of solutions provides better and more consistent performance and is tested with Hadoop distributions, NoSQL databases, and applications such as Splunk and Spark. Additionally, there is a TCO advantage stemming from better performance and scalability, efficiency (storage, power, and licenses), and improved recoverability. Overall, NetApp provides a better balance of performance (with less hardware than competing solutions), capacity, and cost. Customers particularly value the independent scaling of storage and compute, performance tiering, and space efficiency (single source of data, no resync, and no copy). Specifically, our E-Series provides the following benefits:
- Realize better performance than internal drives during data rebuilds.

- Increase search performance by 69% versus commodity servers with internal disks. [4]
- Save on storage capacity by reducing the replication factor; this reduces storage capacity requirements while maintaining availability with fewer copies (see the back-of-envelope sketch after this list).
- Scale compute and storage independently to better match application workloads.
- Enjoy single-interface management across the storage environment.
- Maximize cluster uptime through superior availability.
- Improve reliability with enterprise storage building blocks.
- Encrypt your data with no performance impact.
- Optimize performance and capacity for hot, warm, cold, and frozen data.
- Rest assured with world-class NetApp AutoSupport.

[4] Source: Function1 report.
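The replication-factor bullet above can be checked with simple arithmetic. In the sketch below, the 300TB dataset size is an assumption chosen for illustration; only the contrast between the default three HDFS copies and the two copies possible with externally protected storage comes from this paper.

```python
# Back-of-envelope arithmetic behind the replication-factor bullet above.
# The 300TB dataset size is hypothetical; only the 3-copy vs. 2-copy
# contrast comes from the paper.
raw_data_tb = 300                    # logical dataset size (illustrative)

hdfs_default_copies = 3              # standard HDFS replication factor
external_raid_copies = 2             # lower count when storage is RAID/DDP protected

stored_default = raw_data_tb * hdfs_default_copies   # 900 TB on disk
stored_reduced = raw_data_tb * external_raid_copies  # 600 TB on disk

savings = 1 - stored_reduced / stored_default
print(f"Capacity (and replication traffic) reduced by about {savings:.0%}")  # ~33%
```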

Glossary

Hadoop: Open-source software that provides the enterprise-wide data lake:
- Allows acquiring all data in its original format and storing it in one place, cost effectively and for long time periods
- Allows different processing engines and schema on read
- Provides mature multitenancy, operations, security, and integration

NoSQL: Nonrelational databases popular for big data and real-time web applications:
- Data models (for example, key-value, graph, or document) seen as more flexible than relational database tables
- Popular for high-availability, low-latency use cases
- Simplicity of scaling out horizontally using clusters of machines, versus scaling up for relational databases
- Popular open-source NoSQL databases include MongoDB, Apache Cassandra, Solr, and HBase

Spark: Open-source software that provides a modern development environment and power-user analytical environment for big data (a brief illustrative sketch follows this glossary):
- In-memory, high-speed analytics engine
- Advanced machine learning libraries
- Unified programming model across all processing engines

Splunk: Software solution for searching, monitoring, and analyzing machine-generated data using a web interface:
- Captures, indexes, and correlates real-time data in a searchable repository from which it can generate graphs, reports, alerts, dashboards, and visualizations
- Horizontal technology, based on a proprietary NoSQL database, traditionally used for IT service management, security, compliance, and web analytics
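As a brief, purely illustrative companion to the Spark glossary entry, the following sketch shows the two capabilities named there, in-memory processing and a machine learning library, on a tiny synthetic dataset. All names and values are hypothetical.

```python
# Minimal, self-contained sketch of the Spark capabilities named in the
# glossary entry: in-memory processing plus the machine learning library.
# The customer data is synthetic and purely illustrative.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("glossary-spark-sketch").getOrCreate()

# Tiny synthetic customer table.
df = spark.createDataFrame(
    [(1, 120.0, 3), (2, 870.5, 14), (3, 45.9, 1), (4, 640.0, 9)],
    ["customer_id", "monthly_spend", "visits"])

# cache() keeps the working set in memory across the repeated passes
# that the iterative K-means algorithm makes over the data.
features = (VectorAssembler(inputCols=["monthly_spend", "visits"],
                            outputCol="features")
            .transform(df)
            .cache())

model = KMeans(k=2, seed=42).fit(features)
model.transform(features).select("customer_id", "prediction").show()
```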

Refer to the Interoperability Matrix Tool (IMT) on the NetApp Support site to validate that the exact product and feature versions described in this document are supported for your specific environment. The NetApp IMT defines the product components and versions that can be used to construct configurations that are supported by NetApp. Specific results depend on each customer's installation in accordance with published specifications.

Copyright Information

Copyright NetApp, Inc. All rights reserved. Printed in the U.S. No part of this document covered by copyright may be reproduced in any form or by any means graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an electronic retrieval system without prior written permission of the copyright owner.

Software derived from copyrighted NetApp material is subject to the following license and disclaimer:

THIS SOFTWARE IS PROVIDED BY NETAPP AS IS AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

NetApp reserves the right to change any products described herein at any time, and without notice. NetApp assumes no responsibility or liability arising from the use of products described herein, except as expressly agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any patent rights, trademark rights, or any other intellectual property rights of NetApp. The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.

RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS (October 1988) and FAR (June 1987).

Trademark Information

NetApp, the NetApp logo, Go Further, Faster, AltaVault, ASUP, AutoSupport, Campaign Express, Cloud ONTAP, Clustered Data ONTAP, Customer Fitness, Data ONTAP, DataMotion, Flash Accel, Flash Cache, Flash Pool, FlashRay, FlexArray, FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexVol, FPolicy, GetSuccessful, LockVault, Manage ONTAP, Mars, MetroCluster, MultiStore, NetApp Fitness, NetApp Insight, OnCommand, ONTAP, ONTAPI, RAID DP, RAID-TEC, SANshare, SANtricity, SecureShare, Simplicity, Simulate ONTAP, SnapCenter, SnapCopy, Snap Creator, SnapDrive, SnapIntegrator, SnapLock, SnapManager, SnapMirror, SnapMover, SnapProtect, SnapRestore, Snapshot, SnapValidator, SnapVault, SolidFire, StorageGRID, Tech OnTap, Unbound Cloud, WAFL, and other names are trademarks or registered trademarks of NetApp, Inc., in the United States and/or other countries. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such. A current list of NetApp trademarks is available on the web.