White Paper, March 2015
Building the Data-Centric Enterprise

MapR Technologies, Inc.

Executive Summary

There are two types of companies in the big data space: 1) those that are born in big data to deliver a competitive edge through the software or the processes for enabling big data, and 2) those that have a mandate to simultaneously cut IT and storage costs and create a platform for innovation. This white paper focuses on the second group and provides the high-level information necessary to understand the approach and benefits of implementing an architecture to become an as-it-happens business, one that can sense and respond in real time to its environment.

The industry has talked about software-defined networks, clouds, and more. That discussion has primarily focused on keeping the hardware fixed and rewiring it through software. Virtualization has been the other big theme that has enabled this. These generalized techniques will not go away, because they deliver many benefits. The problem is that the type of data, coupled with how data is constantly morphing, is influencing application, storage, and deployment architectures. This is why MapR believes that a major re-platforming of the enterprise architecture is now in progress, where data is at the center and influencing everything around it.

At its founding, MapR uniquely focused on the larger picture beyond just Hadoop. The founders' backgrounds put them in a position to anticipate the coming changes to application, data, and systems architecture at the infrastructure level. Many people in the industry are working on creating and evolving the component pieces needed for IT architecture in the age of big data. Our vision is to pull these many attributes and ideas together into a coherent view that can be applied to re-platforming the enterprise and that serves as a guide and a set of principles for how to succeed.

We have the pleasure of working with customers on the leading edge of technology, and they have helped us shape this vision and formalize it into a big-picture story of how CIOs should go about transforming their infrastructure for the future. If data is king, then for the first time, data is the primary influence on how applications will be created and how the IT infrastructure will be designed and deployed.

Zeta Architecture Overview

The goal of this architecture is to help organizations accelerate the data-to-action cycle. This is accomplished by removing all data silos, by enabling support for pluggable compute models and execution engines, and by utilizing a common data storage platform that simplifies business continuity, security and compliance. This approach is based on what is called the Zeta Architecture, which lays out the foundational premise for the Data-Centric Enterprise.

The Zeta Architecture comprises seven tenets:

1. Distributed File System
2. Real-time Data Storage
3. Pluggable Compute Model / Execution Engine
4. Deployment / Container Management System
5. Solution Architecture
6. Enterprise Applications
7. Dynamic and Global Resource Management

[Figure: Example technologies that fit into the Zeta Architecture]
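To make the idea of pluggable execution engines over a common storage layer more concrete, the following is a minimal sketch in Python. It is not taken from MapR software or from the Zeta Architecture specification: `SharedStore`, `ComputeEngine`, `CountEngine`, `FilterEngine` and `run_all` are hypothetical names introduced only for illustration, and an in-memory dictionary stands in for the distributed file system.

```python
from typing import Callable, Dict, List, Protocol


class SharedStore:
    """Hypothetical in-memory stand-in for the common distributed file system."""

    def __init__(self) -> None:
        self._files: Dict[str, List[str]] = {}

    def append(self, path: str, record: str) -> None:
        # Every writer lands data in the same store; there is no per-engine silo.
        self._files.setdefault(path, []).append(record)

    def read(self, path: str) -> List[str]:
        # Return a copy so engines cannot mutate the stored records.
        return list(self._files.get(path, []))


class ComputeEngine(Protocol):
    """Assumed plug-in contract: any engine that can run against the store."""

    def run(self, store: SharedStore, path: str) -> object:
        ...


class CountEngine:
    """Toy engine that counts the records under a path."""

    def run(self, store: SharedStore, path: str) -> int:
        return len(store.read(path))


class FilterEngine:
    """Toy engine that selects the records matching a predicate."""

    def __init__(self, predicate: Callable[[str], bool]) -> None:
        self.predicate = predicate

    def run(self, store: SharedStore, path: str) -> List[str]:
        return [r for r in store.read(path) if self.predicate(r)]


def run_all(store: SharedStore, path: str,
            engines: Dict[str, ComputeEngine]) -> Dict[str, object]:
    # All engines read the same data in place; nothing is copied between systems.
    return {name: engine.run(store, path) for name, engine in engines.items()}


if __name__ == "__main__":
    store = SharedStore()
    for event in ("login", "purchase", "login"):
        store.append("/events", event)
    engines = {
        "count": CountEngine(),
        "logins": FilterEngine(lambda r: r == "login"),
    }
    print(run_all(store, "/events", engines))
```

The point of the sketch is the shape of the design rather than the toy engines: adding a new engine means implementing one method against the shared store, with no data movement step in between.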

Zeta Architecture Benefits

This new architectural approach has fewer moving parts than most enterprise architectures. Application architectures utilizing the Zeta Architecture will realize a significant number of simplifications. All applications will leverage the distributed file system. Processes for moving data around will no longer be required unless a specific use case calls for them. This leads to simplified testing, troubleshooting, and systems management. Resource utilization levels will go up, which has a direct impact on data center costs as well as on the total capacity requirements that drive capital expenditures for procuring more hardware.

Implementing consistent platform-wide security and compliance policies is a significant time saver that makes it easier to audit the environment for any type of regulatory purpose. Business continuity also becomes consistent across all aspects of the business through redundancy, recovery and contingency planning, which are handled within the distributed file system layer. With a proper setup, these features can protect against disk failure, server failure, data center disasters, and even human mistakes. Protecting against human mistakes is enabled by this architecture, but it will require process changes to leverage the platform's capabilities.

Big, Fast and Agile

Building a real-time, as-it-happens business doesn't require just big or fast, or big in one cluster and fast in another. It requires big and fast together. Scaling of the platform needs to happen in near real time. Resources must scale linearly, and the business must be able to expand or contract on demand. The real-time data storage must leverage the common distributed file system in the same way as the pluggable compute models / execution engines. Enterprise applications must be able to read from and write to the real-time data storage, the underlying distributed file system, or both, depending on the use case.
Big and fast need to be rounded out with superior agility. Data that is created needs to be accessible immediately by the pluggable compute model / execution engine. No delays and dynamic control of compute resources mean the business problem of the moment can always be handled in the moment. Isolation of resources and true multi-tenancy are required. The resource manager implements memory and CPU isolation, the distributed file system handles data isolation and security, and the compute model / execution engine works in tandem with the others to deliver job isolation.

Resilience, redundancy and high availability are required in this architecture, and can be directly enabled depending on the choice of software. Real-time business continuity can be accomplished through support for incrementally updating remote mirrors of data instead of making full copies, which opens up the potential for real-time recovery from problems. The final piece of agility is contingency planning with multi-site capabilities and protection from unforeseen situations.

Quantium is Australia's leading data-driven insights firm, offering services in customer analytics, technology and decision support, media and market activation to many of Australia's largest companies, including Woolworths, National Australia Bank and Foxtel. Quantium selected the MapR Distribution including Hadoop to improve its ability to perform analysis on large volumes of data without having to spend time restructuring the data. "Quantium needs to deliver the utmost value to its customers," said Gerard Paulke, Quantium enterprise architect. "We determined that we would have significantly more agility by implementing the Zeta Architecture with MapR, which will allow us to better serve customers in a more real-time and cost-effective way." Quantium is currently working through its implementation of the Zeta Architecture and is already seeing the benefits from simplifying its business processes.
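The difference between incrementally updating a remote mirror and making a full copy, mentioned in the business continuity discussion above, can be illustrated with a small Python sketch. This is a generic block-comparison technique under stated assumptions, not a description of how MapR mirroring is implemented: the function names, the SHA-256 block digests and the deliberately tiny `BLOCK_SIZE` are all choices made for this example.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # unrealistically small, chosen so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_digests(data: bytes) -> List[str]:
    """Digest of each block; this is what the mirror site would report back."""
    return [hashlib.sha256(b).hexdigest() for b in split_blocks(data)]


def changed_blocks(source: bytes, mirror_digests: List[str]) -> Dict[int, bytes]:
    """Select only the source blocks that are new or differ from the mirror."""
    delta: Dict[int, bytes] = {}
    for i, block in enumerate(split_blocks(source)):
        is_new = i >= len(mirror_digests)
        if is_new or hashlib.sha256(block).hexdigest() != mirror_digests[i]:
            delta[i] = block
    return delta


def apply_delta(mirror: bytes, delta: Dict[int, bytes], source_len: int) -> bytes:
    """Patch the mirror with the changed blocks, then trim to the source length."""
    blocks = split_blocks(mirror)
    for i, block in sorted(delta.items()):
        if i < len(blocks):
            blocks[i] = block      # overwrite a block that changed
        else:
            blocks.append(block)   # the source grew past the end of the mirror
    return b"".join(blocks)[:source_len]


if __name__ == "__main__":
    source = b"abcdefghijkl"
    mirror = b"abcdXXXXijkl"  # one block is stale at the remote site
    delta = changed_blocks(source, block_digests(mirror))
    updated = apply_delta(mirror, delta, len(source))
    print(len(delta), "block(s) sent;", "in sync" if updated == source else "out of sync")
```

In this sketch only the stale block crosses the wire, which is why incremental mirroring can keep a remote site close to current at a fraction of the cost of repeated full copies. A production system would also need to handle consistency during the transfer, which is out of scope here.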

Bottom Line

All of the benefits outlined in this paper also lead to financial benefits, such as the reduced effort needed to support multiple disparate enterprise architectures across the disciplines of data center monitoring, backup and recovery management, and contingency / disaster recovery planning. A common environment setup leads to simplified deployment management processes, which in turn makes it possible to get new software deployed more rapidly, shortening the data-to-action cycle for the business. All of these benefits also lead to faster time-to-market and enable more strategic investments in solving new business problems faster.

"Good enough" won't be once your business users start experiencing the value of real-time big data. They will only want more, and faster. No matter what use case you start with, your journey will require real-time and enterprise-grade capabilities. It is paramount to remember that an as-it-happens business is as much about simplifying business processes for agility as it is about the technology that runs the business. Keep in mind that this doesn't mean replacing all the tools the business depends on. It means rethinking how to operate and augmenting tools where necessary to become an as-it-happens business.

The complete technical details of this enterprise architecture are documented in the Zeta Architecture White Paper.

MapR delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses. MapR brings unprecedented dependability, ease-of-use and world-record speed to Hadoop, NoSQL, database and streaming applications in one unified distribution for Hadoop. MapR is used by more than 700 customers across financial services, government, healthcare, manufacturing, media, retail and telecommunications, as well as by leading Global 2000 and Web 2.0 companies. Investors include Google Capital, Lightspeed Venture Partners, Mayfield Fund, NEA, Qualcomm Ventures and Redpoint Ventures. MapR is based in San Jose, CA.