SAS ANALYTICS AND OPEN SOURCE

GUIDEBOOK
April 2014

© 2014 Nucleus Research, Inc. Reproduction in whole or in part without written permission is prohibited.

THE BOTTOM LINE

Many organizations balance open source solutions with commercial software to meet the requirements for statistical analysis, both within their organizations and externally with regulatory bodies. While open source analytic tools offer a robust online community and an extensive array of algorithms, packaged analytics software companies, most notably SAS, offer the performance, scalability, governance, and support organizations require for production and operational analytics. Nucleus found that in most of these organizations, open source and SAS are quickly becoming complementary partners: new hires' expertise and the ability to be nimble in conducting research and analysis bring benefits and new approaches that can become part of an enterprise analytics implementation.

THE SITUATION

In the early 1990s, when Linux first appeared on the scene, many did not foresee the impact that a free, open source operating system would have on the market. It was developed for a very small niche market and had little awareness outside of that group. That perception changed dramatically over the next 10 years, and its place in the technology and software space is now clearly entrenched. Linux has significantly disrupted the operating system market. As more hardware and software vendors began to develop and release their own versions and integrate it into solutions, market awareness and acceptance of Linux developed quickly. The Linux kernel is now as much a part of the operating system market as any of the others.

The same dynamic is being felt in the analytics space, with the growing acceptance of open source programming languages such as R and Python. These languages also started with small online communities and were seen to fit a small niche market, but that is quickly changing. As vendors such as Revolution Analytics start to develop and sell packages using open source, the move of open source into commercial software will continue to progress. Nucleus has found that in many organizations, the work and output from open source development is being used within commercial software implementations. With the expertise of the analysts, as well as the algorithms and analysis being developed, organizations are quickly realizing benefits and seeing results in their enterprise analytics implementations. One such commercial software vendor, SAS, has thousands of users around the world, and many of those organizations are using open source to extend their SAS implementations.

Nucleus Research Inc. 100 State Street Boston, MA 02109 Phone: +1 617.720.2000

SAS is used by organizations to perform deep statistical analysis in numerous industries, including financial services, healthcare, manufacturing, hospitality, and others that need to analyze large volumes of data. In its analysis of SAS, Nucleus found that key benefits of the solution include improved decision making, increased analyst productivity, improved profitability, operational efficiency, and the ability to identify opportunities for growth.

ANALYTICAL CONSIDERATIONS FOR OPEN SOURCE AND SAS

Source of data
  Open source: External data. Non-production data environment. Unstructured and structured data.
  SAS: Corporate data warehouse. Transactional systems. External data. Production and non-production data environment. Unstructured and structured data.

Volume of data
  Open source: Small data sets. Spreadsheets, small data files.
  SAS: Small and large high-volume data sets, from ERP systems and corporate systems of record.

Sensitivity of data
  Open source: Non-corporate data. Low security, open access.
  SAS: Non-enterprise or enterprise data. Access can be tightly controlled. High corporate sensitivity.

Governance
  Open source: Not consistently available. Open access. Multiple algorithms available for the same techniques. Not validated or reliable processes.
  SAS: Extensive capabilities. Validated, proven and reliable processes.

Technical support
  Open source: Available via online communities.
  SAS: Available directly from SAS technical support, telephone and online communities, list servers, online self help.

Data management
  Open source: Not available from one source. Not core to analysis capabilities.
  SAS: Extensive capabilities.

While the use of open source analytics is growing, organizations continue to use SAS for strategic and operational analysis of corporate data in production environments. In these organizations, SAS is the system of record, and the analytics and information derived are a trusted part of the decision-making process. Meanwhile, open source analytical assets are rising in popularity for specific types of analytics tasks for a number of reasons, including the amount of open source training in the educational environment, a shrinking pool of SAS experts due to attrition, and the perceived lower cost of open source.

Many see open source as a low-investment option for standalone, non-production research. Open source allows analysts to conduct analysis on data not yet part of the enterprise or production data environment, and to uncover new approaches that could be incorporated into the production environment as appropriate.

Comparing the licensing differences between open source and traditional software can initially be misleading, but with further investigation the merits of each can easily be understood. Open source software is perceived as a low-cost approach that provides the code to the analyst to do with as necessary. Packaged software from a vendor takes the code approach to the next level: customers get access not just to the code, but to integrated software capabilities, training, roadmaps, customer support, and other legal and operational benefits. Many SAS customers stated that the ability to leverage integrated capabilities such as data management, extraction, security, and governance, as well as support and training, met their corporate standards and IT requirements, and was the approach best suited to operational and production analysis.

To better understand the evolving analytics landscape and the dynamics between SAS and open source analytics, Nucleus analyzed the experiences of several SAS customers to understand the business needs associated with their analytics solutions, their experiences with open source solutions, and the benefits they've gained from both technology strategies. These customers ranged in size from 17,000 employees and $3 billion in revenue to more than 300,000 employees and $109 billion in revenue.

WHY COMMERCIAL SOFTWARE

Nucleus found there were several reasons companies choose to use commercial software packages, such as SAS, for operational and production analysis instead of open source: scalability, performance, governance, security, and user support.

SCALABILITY AND PERFORMANCE

The organizations Nucleus analyzed stated they required a solution that was able to process, analyze, and manage large amounts of data. For all of them, SAS has a proven ability to scale and handle the volumes of production data they process for their analyses. Open source solutions did not yet have a demonstrable ability to scale and perform to the level that many of these organizations required for production analysis. Customers said:

- "We use very large data sets, big data. Anything we use must efficiently process those data sets."
- "In the future we'll be working with big data appliances and large volumes of unstructured data. We have no concerns regarding SAS being able to handle these future requirements, and know we can pull in experts as required."
- "SAS has much better performance against large data sets."
- "SAS has better scalability than R."

CUSTOMER PROFILE: NORTH AMERICAN TELECOMMUNICATIONS SERVICE PROVIDER

This US-based organization uses SAS and open source for analysis in several different departments, and in most cases SAS and open source are used together. The usage pattern is based on the skill set of the personnel within the department, as well as the business need and the resources available. Both are used because:

- SAS has the proven scalability, reliability and performance that departments analyzing high-volume, structured data require for their analyses.
- In departments that leverage external and internal data sources for their analysis, a mix of open source and SAS is used. The security requirements of the data, the size of the data set, the research and analysis performed, and the skill set of the individuals performing the work are all drivers in the distribution and usage of the tools.
- Open source and SAS are currently used throughout the company. Open source is generally used for data sets not controlled by the corporate data security rules and regulations.

Open source gives analysts the flexibility to do ad-hoc analysis outside of standard IT policies, while SAS provides the predictability and security that corporate governance requires.

"The proven capability, security and the continuity of SAS is its strength within our organization. Open source definitely has its place, and will continue to work in tandem with our SAS implementation." (Principal Analyst, Business Systems)

DATA MANAGEMENT

The customers Nucleus analyzed stated that the ability to manipulate, manage, and integrate many diverse data sources was important to their analytics and business requirements. Open source solutions did not yet have the data manipulation and management capabilities that many of these organizations required. Customers said:

- "R is not designed to acquire, manage or manipulate data. Open source is all about developing that analysis, where SAS is all about the data."
- "We use SAS for analytics, data extraction, and data management. Open source cannot do this."
- "We can easily integrate new functions and manipulate large volumes of data with SAS. We can't do this with open source."
- "We use SAS for the combination of data manipulation and connectivity to multiple data sources it supports. This data connectivity, married with the stored procedures, provides us the ability to perform advanced analytics."

CUSTOMER PROFILE: MULTINATIONAL FOOD PROCESSING COMPANY

This multinational company has been using SAS globally for many years. It continues to use SAS Analytics for production and operational analysis because:

- Its Information Systems / Information Technology (ISIT) team has stringent corporate security, compliance, and administrative requirements for all software used at a global and production level.
- SAS has better scalability and performance for production analysis of very large volumes of data.
- SAS is used for data management and data manipulation, which are not currently available in open source.

While open source software does not meet these requirements, it is used in many divisions for non-production work in the investigation of new markets. It is also used in smaller projects, as the data for this type of analysis usually comes from outside sources rather than the operational systems and, as a result, is not strictly controlled by the ISIT team.

"Every year we take a critical look at our implementation, and while for non-commercial work, R provides user flexibility for us, only SAS meets our strict compliance and data security requirements." (Demand Planning Specialist & Statistician)

GOVERNANCE

Security and governance ranked very high for these organizations. The ability to control and view who is accessing the data, who is running analysis, the validity and accuracy of the algorithms, and who is executing them was very important from a corporate security perspective. Open source was unable, at this point, to provide that level of information or to meet the stringent regulatory, legal, and security requirements of external regulatory bodies and internal corporate management teams. Customers said:

- "Open source is based on fragments. There is no control or governance on those fragments, and they can be changed, altered or even taken away. With SAS, that never happens. You can have the same confidence that something you wrote 10 years ago will still run, just like the code you wrote last week."
- "While we do have R in house, only standalone work is done in R. We would be concerned with respect to data security if using R on our production systems."
- "We are a multi-national corporation, and our Information Systems / Information Technology (ISIT) team is quite strict and has a very high demand for control. We must align with strict rules regarding compliance, governance, access, security, and administration. We can match these rules in SAS. We can't do that with open source."

For all of the organizations Nucleus analyzed, the fact that SAS, as a company, could be held accountable was an important factor behind the decision to maintain and use SAS for their analytics. Open source solutions met their internal needs for a tool that would be used for research and testing analysis on non-production data sets. These large enterprise organizations could not afford, from a legal, operational or regulatory perspective, to use software for strategic and operational decision making that did not have a vendor behind it to provide support, maintenance, and product roadmaps. Many expressed concerns about the lack of a legal entity behind open source, and the inability to have confidence in a partnership with a vendor. Customers said:

- "SAS is more established, and there are already legal, business, and support processes in place to rely on."
- "Security is a part of the risk for considering using open source in production. If something were to happen there really is no one that can be held accountable with open source."
- "If there is a breach of data we would require a reliable company to work with and hold responsible. We can rely on SAS as a partner."

TRAINING, DOCUMENTATION AND SUPPORT

Trusted and expert customer support was very important. Being able to confirm, validate, and trust the expertise was key for 100% of the organizations. They all stated that the ability to work with a true customer support organization, and to contact and speak with an expert who understood the models and algorithms, provided the confidence, trust, security, and reliability they required. Customers said:

- "Open source doesn't offer the training that SAS does. Training courses and user groups are important. SAS is well established in the market, and it would be very hard to replace the level of training."
- "If anything went wrong, and you couldn't find the solution to code problems in R you had to go to the forums to hopefully find the solution. Not so in SAS. You are able to get reliable support and solutions for problems."

The reliability and predictability of the SAS algorithms was of the greatest importance to the organizations surveyed for strategic, production and operational analysis. They knew the algorithms were proven, validated, and well documented; they could trust the quality of the results; and, most importantly, they knew an expert support organization could be contacted to provide any assistance. As a result, the organizations knew their analysts would lose minimal productivity to investigating or troubleshooting algorithms for production use. Customers said:

- "I don't know who is writing the algorithms in R. SAS algorithms are proven and fully documented."
- "R provides lots of choice, but in many cases too much choice. Code isn't always well thought out, and there isn't continuity in the streams. SAS is predictable, reliable, and proven."
- "With SAS, you have that core base product that everything is spun off. This provides a level of extensibility, core knowledge, and scalability that open source can't give."

CUSTOMER PROFILE: GLOBAL HOSPITALITY COMPANY

This global organization uses SAS for production and operational analysis, as well as data management. It continues to use SAS Analytics for production and operational analysis because:

- Data security is important, and open source does not meet its Information Systems / Information Technology team's requirements.
- Open source is constrained with respect to the amount of data it can process. The volume of data required for analysis requires the scalability and performance capabilities found with SAS.
- SAS solutions are used across the organization for data extraction and manipulation, data management, and analytics. This functionality is not available in open source and would require a significant investment to replace with other tools and solutions.

Open source is currently used for ad-hoc work only, and the resulting analysis development is potentially leveraged in SAS. R gives analysts the ability to perform independent work with algorithms and analysis they've developed in R before transferring that work into SAS.

"Every year we re-evaluate, but stay with SAS because of the superior customer support, data manipulation, scalability, and performance for large data volumes that we require." (Director, Strategy & Analytics)

WHY OPEN SOURCE

Many organizations are adopting open source analytical tools such as R and Python in some situations because the perceived low cost and ease of adoption make them valuable tools for analyzing data. The organizations surveyed for this report showed a similar trend. This is particularly true if the organization is targeting transactional data already addressed by the SAS footprint or data that is not necessarily meant for the enterprise data warehouse. Open source was leveraged by many as an important tool for its ability to perform ad-hoc research in a standalone environment. Users of open source cited increases in computing power, the ability to rapidly deploy and analyze data, and the low initial cost of adoption as the main reasons for open source adoption. Customers said:

- "Open source offers an attractive initial pricing model. The on-line user communities have highly skilled and knowledgeable people, and for the right sized company with the right problems, open source is a good choice."
- "R is used by individuals in my company for their specialized projects. They can easily install it, and conduct research on things that may or may not become part of our production or operational systems."

All the customers agreed that open source offers many algorithms and flexible approaches that can be used in a variety of ways. In addition, the open source community offers a very strong source of knowledge and assistance. Customers stated that in some cases, open source was a good fit:

- "Some things are easier to do in open source. You can be more creative because of the diversity of algorithms."
- "R allows our analysts to experiment and try out new analysis on a smaller scale on non-production data sets."
- "While we do have R in house, only standalone work is done in R. We would be concerned with respect to data security if using R on our production systems."
- "At this point, we use R for research work. Analysts can run tests, and research on their own machines without impacting the production systems or having to worry about security and governance issues."

Nucleus has found that there is a place for both open source and SAS in many enterprise environments where SAS has been successfully used, sometimes for decades, to analyze data.

THE COST OF SWITCHING

Nucleus found that many organizations that had already made a significant investment in resources and skills within their SAS environment believed that while "free" did appear cheaper, there would be significant switching costs associated with moving their current analytics footprint to open source. The main areas where companies said switching to open source would create certain disruption and, they believed, unnecessary expense were the cost of converting their SAS analysis to open source code, the personnel costs of such a project, the lost employee productivity, and the business impact of time spent away from current analytics efforts. Customers said:

- "It would cost us between $0.5-1M in salary alone to make the transition from SAS to open source. We would need a couple of people, an additional $300-400K, and no one would be working on creating new models, analysis or algorithms."
- "We do everything with SAS: analytics, data extraction, and data management. Across the organization we would have to make a significant investment in other tools to move from SAS."
- "We can't hire an army of people to build a new environment. We have many different ways to make the SAS tools work, and it is fully integrated to other systems. The effort required to move away from SAS is our biggest concern."
- "Our industry is very specific as to how analytics are done, and the algorithms that are used. Rules and legislation are very well defined from a maintainability perspective as there are a lot of standards and structure. The use of SAS is a requirement."

CONCLUSION

Many organizations balance open source solutions with SAS to meet the growing need for statistical analysis, both within their organizations and externally with regulatory bodies. Open source analytic tools offer hundreds of ways to perform an analysis, while SAS offers the performance, scalability, security, and governance many organizations require for production and operational analytics. Organizations choose SAS for the customer support that enterprise-sized organizations require, and for its ability to provide high-caliber training and documentation. Nucleus found that in most of these organizations, the use of open source and SAS is not an either-or situation, but one where the two environments are able to augment each other and drive greater benefit for the business. In choosing the best analytics approach for a particular task, considering the source, volume, and sensitivity of data will help organizations ensure they maximize returns from both analytics approaches while making the most of their existing SAS investment.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. 107100_S125006.0514