Table of Contents

FOREWORD
EXECUTIVE SUMMARY
EVOLUTION OF ETL
ARCHITECTURE
DEVELOPER PRODUCTIVITY
PERFORMANCE & SCALABILITY
  PATENTED ALGORITHMS
  DIRECT I/O FOR FASTER DATA TRANSFERS
  HIGH-PERFORMANCE COMPRESSION
  DYNAMIC SELF-TUNING ENGINE
  HYBRID MULTI-THREADING + MULTI-PROCESS ENGINE
CONCLUSION
CUSTOMER CASE STUDIES
  COMSCORE: DRIVING BUSINESS GROWTH WITH BIG DATA INTEGRATION
  NATIONAL EDUCATION ASSOCIATION: OFFLOADING ORACLE TO OPTIMIZE PERFORMANCE, REDUCE COSTS

Foreword

BY PHILIP HOWARD, BLOOR RESEARCH

I've been taking a look under the covers to see what is different about DMX [1], which is the company's data integration product, and how come it holds the ETL record (by a big margin) for loading data into a warehouse. There are a number of important points.

The first is that the technology was originally developed to run on mainframes back in the 1970s. At that time you needed to be parsimonious with your use of resources, and you had to squeeze every bit of performance out of whatever little memory you could access. That frugality has been carried through into today's product (which runs on all leading platforms). For example, by default, DMX uses 15% of available memory on whatever platform it is running. Other tools in this space typically take all available memory. Similarly, the product connects directly to the disk drives on the source and target rather than going through the operating system, thus cutting out any overhead associated with that process.

Secondly, and the biggest thing that makes Syncsort unique in the data integration space, the product is built around an optimiser in much the same way that databases have an optimiser. Of course, this only makes sense if you have lots of different ways of achieving the same results. Most ETL and data integration platforms don't have more than a few different algorithms for performing joins and sorting, for example, so it is arguable that they wouldn't get much better performance if they did have an optimiser, because their choices are so limited. Syncsort, on the other hand, has some 30 different sort algorithms and a similarly large number of join and other algorithms. The optimiser then creates a transformation plan in the same way that a database optimiser creates a query plan. Moreover, this optimiser is dynamic: it monitors data movement as it is happening and, if it finds that the algorithms currently in use are not optimal, it can dynamically change the transformation plan.

I could go on, but suffice it to say that DMX is extremely efficient and, for bulk loading at least, probably the fastest product available on the market.

[1] Previously known as DMExpress

EXECUTIVE SUMMARY

Today, the term Big Data dominates the headlines, and talk of the Three V's (Volume, Velocity and Variety) is ubiquitous. Meanwhile, the vast majority of organizations still struggle to integrate and transform terabytes, or even just gigabytes, of data. In fact, complex data transformations can introduce sizable hurdles even on small data volumes. This highlights a key problem: conventional data integration solutions are significantly increasing the cost and complexity of blending and leveraging diverse data sources. Organizations need a smarter approach.

This paper presents a technical overview of Syncsort DMX: fast, secure, enterprise-grade ETL software used by data-driven organizations in more than 85 countries. Over the years, DMX has become the standard ETL solution for organizations that seek optimal levels of performance, efficiency and productivity while reducing costs. In addition, DMX is used by leading global enterprises to:

- Offload data and workloads from expensive data warehouses and mainframes
- Optimize ETL and data warehouse processing
- Accelerate sort and batch data processing
- Migrate data associated with mainframe application modernization projects
- Optimize Cloud data integration

All the architectural aspects of DMX and the concepts explained throughout this paper are based upon four key principles:

- Deliver maximum performance at scale. DMX can deliver sustainable performance with up to 10x faster elapsed processing times versus conventional DI tools, and up to 25x faster processing over hand coding, with linear scalability and no tuning required.
- Optimize resource efficiency. Several optimizations discussed throughout this paper result in up to 75% less CPU and memory utilization, and up to 90% less storage.
- Accelerate time to insight and increase user productivity. A single engine, small footprint, easy-to-use graphical interface, Use Case Accelerators, and dynamic ETL Optimizer combine to enable users to install in minutes, not hours, and deliver results quickly.
- Reduce the cost structure of data integration. Independent research has shown that DMX typically delivers up to 65% lower data integration TCO and 200% return on investment (ROI), with payback in nine months.

The goal is simple: enable organizations to collect, process and distribute more data in less time, with fewer resources and at a lower cost. Ultimately, the viability of most businesses today depends on their ability to quickly and effectively transform data, Big or otherwise, into competitive insights; and that's exactly the value of data.

EVOLUTION OF ETL

When ETL and Data Integration tools were first introduced in the late 1990s as an alternative to hand coding, one of the key benefits cited was that they would enable less-skilled users to build and deploy highly scalable data integration flows. This was supposed to be enabled by the tools' high-performance engines and metadata-driven design, which would encourage re-use.

However, most ETL platforms didn't grow organically. A lot of time and resources were spent trying to integrate disparate technologies acquired over the years, adding a vast array of functionality but neglecting the performance and scalability aspects of ETL. As a result, these tools have fallen far short of delivering the promised developer productivity, and are increasingly inadequate for Big Data. Despite the high cost of ownership of most of these ETL tools, companies have been forced to supplement them with extensive hand coding and by pushing processing into the database, an approach commonly known as ELT. The resulting architectures, illustrated below, are very complex, yet still don't deliver the performance or cost structure that customers need.

[Figure: The data warehouse vision of the 1990s versus the data warehouse reality today. In the vision, sources (Oracle, files/XML, ERP, mainframe, Hadoop) feed a data warehouse and data marts through ETL alone. In today's reality, the same sources reach the warehouse and data marts through a tangle of ETL, ELT and real-time flows.]

By contrast, DMX, which draws on Syncsort's long heritage in high-performance data integration, was built from the ground up on four guiding principles: high data processing performance, resource efficiency, development productivity and lower cost. As a result, it can dramatically reduce the cost and complexity of DI initiatives as well as offload heavy ELT workloads to free up data warehouse capacity and resources.

ARCHITECTURE

The DMX client-server architecture enables customers to cost-effectively solve enterprise-class data integration problems, irrespective of data volume, complexity or velocity. The key to this framework, which is optimized for a wide variety of data integration requirements, is a single processing engine that has continually evolved since its inception.

It is important to note that DMX has a very small-footprint architecture with no dependency on third-party applications such as a relational database, compiler, or application server for design or runtime. This means installing DMX takes only a few minutes. DMX can be deployed virtually anywhere: on premises on Linux, Unix and Windows, within a Hadoop cluster, or over virtualized environments in the Cloud, and it can easily co-exist with virtually any other application.

There are two major components of the DMX client-server platform:

- Client: a graphical user interface that allows users to design, execute and control data integration jobs.
- Server: a combination of repository and engine.
  - File-Based Metadata Repository: uses the standard file system, enabling seamless design-time and runtime version control integration with source code control systems. It also provides High Availability simply by inheriting the characteristics of the underlying file system between nodes.
  - Engine: a high-performance, linearly scalable, small-footprint engine that includes a unique dynamic ETL Optimizer, which ensures maximum throughput at all times using minimal resources.

[Figure: The DMX client-server architecture. Windows-based GUI clients (DMX Workstations) connect to the DMX Server Engine and a flat-file-based metadata repository with check-in/check-out integration to version control. The server runs on AIX, HP-UX, Oracle Solaris, Linux and Windows, and connects to sources and targets including relational databases (DB2, Informix, MySQL, Oracle, SQL Server, Sybase, Teradata, via native, ODBC and DataDirect connectivity), appliances (Greenplum, Netezza, Vertica), CRM/ERP and SAP, files/XML over FTP/SFTP, mainframe, Hadoop/HDFS, real-time sources (MQ, SOAP), and Cloud services (Amazon Web Services, Salesforce.com).]

DMX provides native support for a wide variety of data sources/targets, including:

FILE-BASED SOURCES: Flat files, Mainframe, HDFS, Legacy sources
RDBMS: Oracle, DB2, SQL Server, Teradata, Sybase
APPLIANCES: IBM Netezza, EMC Greenplum, HP Vertica
OTHER: XML, MQ, Salesforce.com, JSON

DEVELOPER PRODUCTIVITY

DELIVER RESULTS QUICKLY

Developers can get productive quickly thanks to a library of Use Case Accelerators for common ETL applications such as aggregating web logs, identifying changes between two versions of a given dataset (CDC), lookups and more. The DMX graphical, template-based design paradigm allows for the rapid development and deployment of sophisticated data flows. Once deployed, jobs are easy to maintain and govern. The designer can concentrate on functional requirements while the DMX Optimizer automatically tunes the jobs for optimum performance.

Re-usable tasks are the building blocks of a DMX job. Each task is completely self-contained and unit testable. The tasks are assembled to create jobs that include process sequencing. By default, each task is designed to use files for intermediate results. This allows for easy debugging, testing and re-starting. At runtime/deployment, with a single mouse click, the designer can replace these files with high-speed, in-memory transfers, as illustrated in the sketch below.
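To make the file-versus-memory trade-off concrete, here is a minimal, hypothetical C sketch (illustrative only, not DMX code; the stage names, the `stage1.out` file name and the `--in-memory` flag are invented) in which two self-contained tasks exchange intermediate results either through a temporary file, which survives for debugging and restarts, or through an in-memory pipe, which never touches disk:

```c
/* Two pipeline tasks with a switchable intermediate hand-off:
 * a temp file (debuggable, restartable) or a pipe (in-memory, faster).
 * Illustrative sketch only -- not DMX source code.
 */
#include <ctype.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Task 1: "transform" -- upper-case each record and emit it. */
static void transform_task(int out_fd) {
    const char *records[] = { "alpha\n", "beta\n", "gamma\n" };
    for (size_t i = 0; i < sizeof records / sizeof records[0]; i++) {
        char buf[64];
        strcpy(buf, records[i]);
        for (char *p = buf; *p; p++)
            *p = (char)toupper((unsigned char)*p);
        write(out_fd, buf, strlen(buf));
    }
    close(out_fd);
}

/* Task 2: "load" -- consume whatever the intermediate channel holds. */
static void load_task(int in_fd) {
    char c;
    while (read(in_fd, &c, 1) == 1)
        putchar(c);
    close(in_fd);
}

int main(int argc, char **argv) {
    int in_memory = (argc > 1 && strcmp(argv[1], "--in-memory") == 0);

    if (in_memory) {
        int fds[2];
        if (pipe(fds) != 0) return 1;
        if (fork() == 0) {              /* child runs the producer task  */
            close(fds[0]);
            transform_task(fds[1]);
            _exit(0);
        }
        close(fds[1]);                  /* parent consumes from the pipe */
        load_task(fds[0]);
        wait(NULL);
    } else {
        int fd = open("stage1.out", O_CREAT | O_TRUNC | O_WRONLY, 0644);
        if (fd < 0) return 1;
        transform_task(fd);             /* intermediate lands on disk    */
        fd = open("stage1.out", O_RDONLY);
        load_task(fd);                  /* restartable from the file     */
    }
    return 0;
}
```

Note that the task code itself does not change between the two modes; only the channel does, which is the point of the single-click switch described above.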

ENABLE ON- AND OFF-SHORE DEVELOPMENT TEAMS

This approach also makes hybrid development environments with dispersed on- and off-shore teams much more effective. First, metadata can be easily propagated to all environments at both design time and runtime. In addition, DMX allows each self-contained unit to be assembled and tested separately, then combined and run optimally in a production environment.

FOCUS ON BUSINESS REQUIREMENTS, NOT PERFORMANCE

With traditional ETL tools, a majority of the large library of components is devoted to manually tuning performance and scalability. This forces the user to make design decisions that can dramatically impact overall throughput. Moreover, it means that performance is heavily dependent on an individual developer's knowledge of the tool. In essence, the developer must not only code to meet the functional requirements, but also design for performance. Very few developers have this knowledge, and it is gained only after many years of experience. Often, organizations will not have such expert resources in house.

DMX is different because the dynamic Optimizer handles the performance aspects of any job or task. The designer only has to learn a core set of five stages/transforms (copy, sort, merge, join and aggregate). These simple tasks are combined to meet all functional requirements. This is what makes DMX so unique: designers don't need to worry about performance because the Optimizer automatically delivers it to every job and task, regardless of the environment. As a result, jobs have far fewer components and are easier to maintain and govern. With DMX, users design for functionality, and they simply inherit performance.

PERFORMANCE & SCALABILITY

The performance of any software system, especially ETL, rests on the performance triangle: efficiency and speed require balancing CPU, memory and I/O. The triangle matters because of the close dependency among these resources; overuse of one has an immediate impact on the others (e.g., executing a join that exceeds physical memory will require additional disk and CPU time). Most traditional ETL tools are CPU- and memory-bound but, ultimately, all ETL is disk dependent. As a result, to increase performance you must minimize the impact on every aspect of the triangle. DMX addresses the performance triangle with a combination of five key technologies:

- A library of patented algorithms for all key data transformations
- Direct I/O access for the fastest data transfers
- High-performance compression to minimize I/O
- A dynamic ETL Optimizer to ensure maximum performance at runtime with minimum resource utilization
- A hybrid multi-threading + multi-process engine

PATENTED ALGORITHMS

As much as 80 percent of all ETL processing is spent sorting records. Joins, aggregations, rankings, database loads and more all depend on sorting to complete their processing. Joining two heterogeneous sources (e.g., a file and a table), grouping records to create subtotals, and creating rankings are all common examples where record sorting is required. Even the final step of loading data into a target database can be more efficient, using less elapsed and CPU time, if the data is sorted first. The benefits extend beyond loading to post-load index creation, and querying a table containing sorted data is faster too.

However, sorting records with traditional tools is typically the most inefficient step in the ETL process. Not so with DMX, which leverages hundreds of time-proven optimizations that deliver the highest levels of throughput while minimizing resource utilization. This unique capability is based on a series of algorithms first patented in 1971 to streamline mainframe sorting by minimizing resource utilization, constantly adapting to specific environmental variables, and scaling to meet the demands of growing data volumes. In the 40 years since the first Syncsort patent was issued, additional algorithms and unique technology covering joins, merges, aggregations, transformations, copies, memory management and compression have been added.

[Figure: Patented optimizations across the ETL pipeline. Sort (6 patents + 3 in progress), join (3 patents + 3 in progress) and aggregate/copy (3 patents + 3 in progress) operations, combined with direct, block-level read I/O, speed up each stage: source extract & FTP up to 90% smaller, data partitioning up to 40% faster, merging & transformation up to 50% faster, joining records up to 60% faster, aggregation up to 70% faster, and database loads up to 40% faster. Total savings: up to 10x faster, using 75% fewer resources.]
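To see why sorting underpins so much of ETL processing, consider aggregation. Once records arrive sorted on the grouping key, subtotals fall out of a single sequential pass with constant memory and no hash table; the same ordering property is what makes sort-merge joins and pre-sorted database loads cheap. A minimal illustrative sketch follows (the record layout and keys are invented for the example; this is not DMX code):

```c
/* Single-pass aggregation over sorted input -- illustrative only.
 * Because records arrive grouped by key, each subtotal can be emitted
 * as soon as the key changes; memory use is constant regardless of
 * input size. Unsorted input would need a hash table or a sort first.
 */
#include <stdio.h>
#include <string.h>

struct record { const char *key; double amount; };

int main(void) {
    /* Input already sorted on `key` (e.g., by an upstream sort stage). */
    struct record sorted[] = {
        { "east", 10.0 }, { "east", 5.0 },
        { "north", 7.5 },
        { "west", 1.0 }, { "west", 2.0 }, { "west", 3.0 },
    };
    size_t n = sizeof sorted / sizeof sorted[0];

    const char *cur = sorted[0].key;
    double subtotal = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(sorted[i].key, cur) != 0) {   /* key changed: flush */
            printf("%-6s %8.2f\n", cur, subtotal);
            cur = sorted[i].key;
            subtotal = 0.0;
        }
        subtotal += sorted[i].amount;
    }
    printf("%-6s %8.2f\n", cur, subtotal);       /* flush last group  */
    return 0;
}
```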

DIRECT I/O FOR FASTER DATA TRANSFERS

Every ETL process is ultimately disk read/write bound, especially at the end points of a job: extracting the data from the source and loading it into the target. The transformation phase also quickly becomes disk bound when carrying out an operation that exceeds physical memory. Since disk is the slowest resource, its misuse can have the most dramatic impact on performance.

DMX incorporates Direct I/O capabilities for many file systems and storage systems (e.g., storage area networks). Direct I/O bypasses the OS buffer cache, enabling a more efficient transfer of larger blocks of data. As a result, DMX avoids an extra memory copy, thereby utilizing less CPU. This optimization is automatic and handled at runtime by DMX.

DMX enables Direct I/O for the majority of file systems. When the I/O sizes are properly aligned and of sufficient size, Direct I/O occurs automatically. Some file systems do this via Discovered Direct I/O; DMX works with virtually all systems that support it. For sources greater than 12 GB, there are also automatic sort optimizations for striped file systems, such as storage area networks. Larger block sizes are allocated automatically, improving overall performance by as much as 30%.

DMX also incorporates direct read optimizations. For example, DMX can read directly from Oracle data files and bypass Oracle's OCI client interface, freeing Oracle resources typically used to process OCI client calls. Performance improvements gained by reading Oracle tables directly are typically around 30%. In addition, DMX optimizes loads by using direct path loads whenever possible.
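On Linux, for example, an application opts into Direct I/O by opening a file with the O_DIRECT flag and issuing large reads into suitably aligned buffers; the kernel then moves data straight between the device and the application buffer, skipping the page cache. The following is a minimal sketch of the general technique, not of DMX internals (the 4 KiB alignment is a common requirement, but the exact rules vary by file system and device):

```c
/* Minimal Direct I/O read on Linux. O_DIRECT transfers bypass the OS
 * buffer cache, avoiding an extra kernel-to-user memory copy, but
 * require the buffer, offset and length to be properly aligned.
 */
#define _GNU_SOURCE            /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK (1 << 20)        /* 1 MiB requests: fewer, larger transfers */
#define ALIGN 4096             /* typical logical block size requirement  */

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, ALIGN, BLOCK) != 0) return 1;  /* aligned buffer */

    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, BLOCK)) > 0)   /* data bypasses the page cache */
        total += n;
    if (n < 0) perror("read");

    printf("read %lld bytes via Direct I/O\n", total);
    free(buf);
    close(fd);
    return 0;
}
```

The trade-off is that the application gives up the kernel's read-ahead and caching, which is exactly why an engine doing large, sequential, one-pass scans benefits while random small readers generally would not.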

HIGH-PERFORMANCE COMPRESSION

Large data volumes can have a negative impact on performance, increasing not only disk read/write access but also network I/O. This is especially important given the increasing diversity of data sources, including those residing in the Cloud and HDFS. Moreover, Big Data can easily drive storage costs to unsustainable levels. Compression technology can help solve both of these problems, prompting the leading database and appliance vendors to make considerable investments in these technologies. For data integration, compression can be used not only to minimize storage requirements but also to accelerate overall elapsed time by decreasing the amount of I/O.

DMX is the only tool to incorporate enhanced compression algorithms, selected at runtime based on:

- I/O read/write speed
- Data volumes
- Data types

In addition to offering compression for reading and writing data files, DMX incorporates unique technology capable of compressing temporary work space, enabling significant storage savings for large data volumes. Depending on data compression ratios and system specifications (i.e., number and speed of CPUs, I/O rate, etc.), the high-performance compression capabilities of DMX can deliver over 2x faster elapsed time and storage savings of up to 90%. The end result is a dramatic impact on the use of scratch or temporary disk, saving terabytes of storage for even simple tasks.

DYNAMIC SELF-TUNING ENGINE

A library of algorithms and optimizations is not enough to deliver fast, efficient, simple data integration. This requires a constant balancing of the resource triangle of memory, CPU and disk. These requirements change millisecond by millisecond as a variety of applications (ETL, RDBMS, etc.) compete for priority. Moreover, the technology needs to effectively abstract the user from the complexity and manual effort involved in tuning an application.

DMX combines the scalability and ease-of-use of a single engine with the efficiency and versatility of a dynamic Optimizer. The Optimizer enables DMX to dynamically select the most efficient algorithms based on the data structures and system attributes encountered at runtime. It automatically adapts and self-optimizes to the exact characteristics of a particular job and system, dynamically selecting and switching algorithms midstream. This combination delivers maximum performance at scale while ensuring the most efficient use of resources.
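The general pattern behind such an optimizer can be sketched as a dispatch table: several implementations of the same logical operation behind one signature, plus a selector that inspects the data and environment at runtime. The toy sketch below illustrates only the pattern; the thresholds and run-counting heuristic are invented for the example, and DMX's actual rules and algorithms are proprietary:

```c
/* Runtime algorithm selection -- a toy illustration of the dispatch
 * pattern, not DMX's optimizer. Several implementations of one logical
 * "sort" operation share a signature; a selector picks one based on
 * properties observed at runtime: data volume, the memory budget, and
 * how ordered the input already is.
 */
#include <stdio.h>
#include <stdlib.h>

typedef void (*sort_fn)(int *data, size_t n);

static void insertion_sort(int *a, size_t n) {   /* wins on nearly sorted data */
    for (size_t i = 1; i < n; i++) {
        int v = a[i]; size_t j = i;
        while (j > 0 && a[j - 1] > v) { a[j] = a[j - 1]; j--; }
        a[j] = v;
    }
}

static int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}
/* Stand-ins: delegate to libc qsort so the demo stays runnable. */
static void general_sort(int *a, size_t n)  { qsort(a, n, sizeof *a, cmp_int); }
static void external_sort(int *a, size_t n) { qsort(a, n, sizeof *a, cmp_int); }

/* Selector: the thresholds here are invented for the demo. */
static sort_fn choose_sort(const int *a, size_t n, size_t mem_budget) {
    if (n * sizeof(int) > mem_budget)
        return external_sort;                /* won't fit: spill to disk */

    size_t runs = 1;                         /* count ascending runs     */
    for (size_t i = 1; i < n; i++)
        if (a[i] < a[i - 1]) runs++;
    if (runs <= n / 4)
        return insertion_sort;               /* input is nearly sorted   */

    return general_sort;                     /* default all-purpose path */
}

int main(void) {
    int data[] = { 1, 2, 3, 4, 6, 5, 7, 8, 9, 10 };
    size_t n = sizeof data / sizeof data[0];

    sort_fn f = choose_sort(data, n, 64u << 20);  /* 64 MiB budget */
    f(data, n);

    for (size_t i = 0; i < n; i++) printf("%d ", data[i]);
    putchar('\n');
    return 0;
}
```

A production engine would sample rather than scan, track many more signals (CPU load, I/O rates, key widths), and be able to switch algorithms mid-job, but the shape is the same: one logical operation, many implementations, chosen by observation rather than by the developer.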

HYBRID MULTI-THREADING + MULTI-PROCESS ENGINE

One of the classic arguments among traditional ETL vendors is the process-versus-thread architecture discussion. In short, both threads and processes are methods of parallelizing an application. Processes are self-contained execution units that interact with each other using inter-process communication; orchestration mechanisms use a master process to spawn sub-processes. Threads are contained within a process, and multiple threads can exist in each, sharing the same state and memory space and thus communicating directly. Threads can be dynamically spawned and killed to handle processing that does not have to be sequential.

Conventional ETL tools and hand coding typically require a deployment or compile step where hard-coded logic is eventually pushed out. This results in rigid, less dynamic ETL flows with limited flexibility to adapt to changing system conditions at runtime. Also, many ETL approaches have very poor thread and process management; often, the tools are constrained by overwhelming thread- and process-spawning requests that swamp the operating system.

DMX has evolved into a dynamic multi-process AND multi-threaded architecture. It provides the full benefits of a master orchestration process with threads that are dynamically spawned and killed based on demand and processing. This highly efficient processing method, automatically controlled by the dynamic ETL Optimizer, conserves resources by allocating them only to the steps where they are required as the data flows through the integration job.

In addition, DMX has an interpreted engine: the user interface simply creates a script that is passed behind the scenes to the engine and dynamically processed at runtime. This delivers faster start-up and runtime performance while allowing greater flexibility, especially when passing dynamic variables/parameters. In many cases, the DMX engine finishes processing the data before conventional tools have even finished compiling, making it ideal for micro-batch or near-real-time environments.
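The hybrid shape described above, a master process that forks worker processes while each worker spawns threads on demand, can be sketched in a few lines of POSIX C. This is a generic illustration of the pattern, not DMX's engine; the stage and thread counts are arbitrary:

```c
/* Hybrid multi-process + multi-thread skeleton (compile with -pthread).
 * A master process forks one worker process per pipeline stage; each
 * worker spawns threads on demand for the parallelizable parts of its
 * stage, then joins them when the work is done.
 */
#include <pthread.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

#define STAGES  2   /* worker processes (e.g., "transform", "load")  */
#define THREADS 3   /* threads spawned inside each worker on demand  */

static void *thread_work(void *arg) {
    long id = (long)arg;
    /* Threads share the worker's address space, so they can exchange
     * partitions of the data directly, with no inter-process copies. */
    printf("  worker %d, thread %ld: processing partition\n",
           (int)getpid(), id);
    return NULL;
}

static void run_worker(int stage) {
    pthread_t tid[THREADS];
    printf("worker %d: stage %d started\n", (int)getpid(), stage);
    for (long t = 0; t < THREADS; t++)       /* spawn threads on demand */
        pthread_create(&tid[t], NULL, thread_work, (void *)t);
    for (int t = 0; t < THREADS; t++)        /* release them when idle  */
        pthread_join(tid[t], NULL);
}

int main(void) {
    /* Master orchestration process: fork one isolated worker per stage. */
    for (int s = 0; s < STAGES; s++) {
        pid_t pid = fork();
        if (pid == 0) { run_worker(s); _exit(0); }
    }
    for (int s = 0; s < STAGES; s++)
        wait(NULL);                          /* reap the workers */
    puts("master: all stages complete");
    return 0;
}
```

Processes give fault isolation between stages; threads give cheap data sharing within a stage. The hybrid keeps both, which is why the spawn/kill decisions are worth delegating to an optimizer rather than to hard-coded deployment logic.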

[Figure: The DMX hybrid multi-threading + multi-process architecture dynamically allocates system resources for optimum performance. An ETL job reads from sources, joins, aggregates and writes to targets (EDW, data marts), with thread management and dynamic optimizations applied per task.]

CONCLUSION

Data Integration tools have historically focused on expanding functionality, neglecting two critical success factors: developer ease-of-use and core engine performance at scale. As IT organizations confront the accelerating volume, variety and velocity of data by applying analytics to make sense of it all, they have been forced to turn to costly and inefficient workarounds such as constant tuning, adding hardware, and pushing data transformations down to the database. Approaches like these result in unsustainable costs that can undermine the value of Big Data. Therefore, when selecting the right data integration architecture, organizations must carefully evaluate the following key requirements.

- Performance without compromise. A scalable architecture must be built on the foundation of both engine performance and user productivity. It's important to balance both of these requirements equally, without compromise. The ideal solution facilitates design by focusing on business logic and abstracting the complexity of performance-driven design from the user, with an automatic ETL Optimizer that dynamically adapts at runtime to ensure optimal throughput at all times.

- Seamless scalability through an interpreted engine. To deal with today's rapidly changing, data-intensive requirements, a DI engine must fully exploit both the thread- and process-based architectures of modern hardware, with the flexibility to allow rapid and dynamic changes using an interpreted engine.

- Optimum resource efficiency for on-premises and Cloud deployments. The architecture must have a very small footprint, allowing fast, easy deployment and avoiding competition for resources. Organizations should avoid tools that depend on third-party application servers, databases or compilers, as each of these components adds significant complexity and consumes resources that could otherwise be used for more efficient data processing.

- Direct I/O with high-performance compression. The data engine must optimally balance the performance triangle of disk, memory and CPU usage. Moreover, since disk I/O is the ultimate constraint on any data-related process, seamless native compression and bypassing the I/O buffer cache can deliver even higher performance and efficiency gains while minimizing storage requirements.

- Improved productivity and reusability. The ideal DI architecture must provide intuitive, business-driven design paradigms with a basic set of reusable building blocks. This allows users to design complex structures from simpler, highly reusable components. Additionally, the client-server architecture needs to enable the distributed development that is characteristic of modern IT teams.

- Ability to offload data and workloads from data warehouses and mainframes. As growing ELT workloads continue to drive up to 80% of database resource utilization, data integration solutions must be capable of identifying and offloading suitable data processing workloads, freeing up database capacity while reducing costs.

These requirements serve as the architectural foundation of DMX, the result of more than 40 years of experience leveraging patented algorithms and hundreds of time-proven optimizations. For this reason, 87 of the Fortune 100 companies are Syncsort customers, and Syncsort's products are used in more than 85 countries to offload expensive and inefficient legacy data workloads, speed data warehouse and mainframe processing, and optimize cloud data integration. The ability to go beyond just performance, using the powerful combination of the technologies explained in this white paper, enables Syncsort customers to consistently deliver results in less time, for fewer dollars, and with fewer resources.

- Deliver maximum performance at scale. DMX can deliver sustainable performance with up to 10x faster elapsed processing times versus conventional DI tools, and up to 25x faster processing over hand coding, with linear scalability and no tuning required.
- Optimize resource efficiency. Several optimizations discussed throughout this paper result in up to 75% less CPU and memory utilization, and up to 90% less storage.
- Accelerate time to insight and increase user productivity. A single engine, small footprint, ease of use, and dynamic ETL Optimizer combine to enable users to install in minutes, not hours, and deploy in weeks, not months.
- Reduce the cost structure of data integration. Independent research has shown that DMX typically delivers up to 65% lower data integration TCO and 200% return on investment (ROI), with a nine-month payback period.

Customer Case Study: DRIVING BUSINESS GROWTH WITH BIG DATA INTEGRATION

"The performance and ease of use of DMX technology positively impacts our bottom line. DMX technology is able to convert raw click-stream data into valuable granular information at lightning speed."
MIKE BROWN, CTO

COMPANY OVERVIEW:

comScore is a global leader in measuring the digital world and the preferred source of digital marketing intelligence. Through a powerful combination of behavioral and survey insights, comScore enables clients to better understand, leverage and profit from the rapidly evolving worldwide web and mobile arena. comScore helps leading organizations around the world implement more effective digital business strategies by delivering critical insights to over 1,200 organizations, spanning all continents and 35 countries.

BUSINESS CHALLENGE:

Data integration is a critical business process for comScore; its success as a company depends on its ability to monitor, collect, transform and analyze data from a panel of 2 million internet users around the world. comScore monitors all the user panel's internet and mobile activity, from browsing to the publications they read, the services they subscribe to, the things they buy, and so on. Information from the panel is collected 24x7 in the form of flat files, which then need to be sorted and aggregated.

Originally, comScore started on a homegrown grid processing stack. By the year 2000, when it switched to Syncsort, it was already processing about 10 billion records of data per day. "We were up and running in weeks," Brown says. "It literally made our software run 5-10 times faster. You're not just adding storage, but you're adding compute as well."

In 2009, in the midst of the recession, comScore decided to build and release its third-generation product, Unified Digital Measurement (UDM, or Hybrid). This initiative established a platform that blends panel data with census data, substantially raising the volume and complexity of data transformations. In addition, comScore had some specific goals, including:

- Ramp up data collection
- Deploy new methodologies for data processing and analysis
- Scale linearly to support exponential growth
- Cut data latency to a maximum of 24 hours
- Deploy the entire system in four months or less

comScore initially developed a custom solution. However, it soon faced some serious challenges:

- SLAs were not feasible to meet
- Exponential server and storage costs
- Long time to insight due to constant manual tuning and ongoing development/maintenance

SOLUTION: HIGH-PERFORMANCE SORT & ETL WITH DMX

comScore uses DMX on over 200 servers to enable efficient data processing. According to Mike Brown, comScore now has aggregation systems that can process over 50 GB of data with 357 million rows in 20 minutes on a Dell R710 2U server. More recently, given the exponential data growth that comScore is experiencing, it also decided to leverage Hadoop. According to Brown, Syncsort's software made the Hadoop migration significantly easier. "You don't have to change any code, except the push code," he says. "We use DMX in [more than] 30 different apps. It's our tool for any situation [where] we have to adjust the data."

Today, comScore uses DMX to sort, partition and compress its data before loading it into Hadoop. In addition, DMX allows comScore to optimize its Hadoop environment by:

- Reducing the amount of storage on the Hadoop cluster
- Accelerating the load process by a factor of 2x
- Improving overall Hadoop performance by splitting large files into smaller files that fit perfectly into Hadoop segments

BENEFITS

FAST
- Sort over 50 GB of data in 20 minutes with just one server
- Accelerate Hadoop data loads by up to 2x

EFFICIENT
- Storage savings: 75 TB of data per month
- Optimize Hadoop processing by partitioning and compressing the data prior to loading it into the Hadoop cluster

SIMPLE
- comScore went from processing 18 billion records per day to 32 billion in just 9 months, without tuning or re-architecting; the DMX engine scaled to support the increased data volumes
- Facilitated the migration to Hadoop

COST-EFFECTIVE
- At least $350K/year in deferred server costs
- Savings of up to 75 TB of storage

[Figure: 32B records/day flow through DMX into HDFS across Hadoop nodes: load files; sort, compress and partition; load to HDFS; post-processing & analysis.]

BUSINESS VALUE: DMX ENABLES COMSCORE TO:
- Support business growth
- Meet, and even exceed, customer SLAs
- Deliver new revenue-generating offerings
- Maintain competitive advantage while scaling cost-effectively to support increasing demands

Customer Case Study: OFFLOADING ORACLE TO OPTIMIZE PERFORMANCE, REDUCE COSTS

"DMX has saved our organization hundreds of thousands of dollars!"
JANET DADE, DIRECTOR, IT SERVICES

COMPANY OVERVIEW:

The National Education Association (NEA), the largest professional employee organization in the United States, is committed to advancing the cause of public education. NEA's 3 million members work at every level of education, from pre-school to university graduate programs. NEA has affiliate organizations in every state and in 14,000+ communities across the United States.

BUSINESS CHALLENGE:

Every day, NEA collects data from its 3.2 million members, integrates it, and moves it into multiple data warehouses for Business Intelligence analysis. The goal is to quickly deliver detailed analyses to the organization so it can make better decisions and meet its objectives.

All data integration processes at NEA were initially done with very complex PL/SQL scripts. Some of the scripts were over 15 pages long, with thousands of lines of PL/SQL and very inefficient joins, which consumed a lot of Oracle capacity. As a result, the batch window for all the backups, data transformations and data movement was taking over 15 hours to complete. Moreover, due to all the manual effort, the process was prone to errors, increasing user frustration and lowering IT staff productivity. At the same time, the scripts were extremely difficult to maintain, tune and extend, increasing overall risk and hindering the knowledge transfer needed to bring new staff on board.

SOLUTION: HIGH-PERFORMANCE ETL FOR PL/SQL OFFLOAD

NEA replaced its custom ELT processes in PL/SQL with DMX. The high-performance ETL solution with DMX allowed NEA to:

- Eliminate Oracle staging
- Implement a flexible, scalable DI environment
- Eliminate the need for constant manual coding and tuning

BENEFITS

FAST
- Reduced the overall batch window from 15 hours to 4 hours
- Accelerated key data integration processes by up to 25x

EFFICIENT
- Eliminated the need for expensive Oracle staging
- Freed up database capacity for more analytical queries

SIMPLE
- Deployed a key data integration job within 48 man-hours with DMX; the same job previously took 1 year to deploy with PL/SQL
- Eliminated the need for constant tuning
- Facilitated knowledge transfer

Before: PL/SQL Scripts (ELT)
- Data: 8-14M rows
- Flow: read files; convert files into staging tables; run over 900 lines of PL/SQL scripts; load data into the DWH for BI & analytics
- Effort: 1 year
- Est. cost to develop: $80K (at $60/hr)
- Total run time: 6 hrs
- Complex PL/SQL scripts with staging; manual coding; manual tuning; no reusability
- Total batch window: 15 hrs

After: DMX (ETL)
- Data: 8-14M rows
- Flow: read files; transform data with only 2 DMX jobs and load into the DWH; business intelligence & analytics
- Effort: 48 man-hours
- Est. cost to develop: $2.88K (at $60/hr)
- Total run time: 30 min
- No staging; no coding; no tuning; reusable objects
- Total batch window: 4 hrs

BUSINESS VALUE:
- Accelerated time-to-insight for over 3 million members and thousands of affiliates
- Minimized organizational risk from retiring staff
- Fast ROI, with $80K in estimated productivity savings for just one project

LEARN MORE

- eBook: A Practical Guide to Big Data Readiness
- eBook: 5 Tips to Break Through ELT Roadblocks
- eBook: The Ultimate Checklist for High Performance ETL

ABOUT US

Syncsort provides fast, secure, enterprise-grade software spanning Big Data solutions in Hadoop to Big Iron on mainframes. We help customers around the world collect, process and distribute more data in less time, with fewer resources and lower costs. 87 of the Fortune 100 companies are Syncsort customers, and Syncsort's products are used in more than 85 countries to offload expensive and inefficient legacy data workloads, speed data warehouse and mainframe processing, and optimize cloud data integration. Experience Syncsort at

Syncsort Incorporated. All rights reserved. Company and product names used herein may be the trademarks of their respective companies.


How Diverse Cloud Deployment Boosts the Value of Analytics and Future Proofs Your Business White Paper Analytics and Big Data How Diverse Cloud Deployment Boosts the Value of Analytics and Future Proofs Your Business In this white paper, you will learn about three major cloud deployment methods,

More information

Driving competitive advantage with real-time applications, enabled by SAP HANA on IBM Power Systems

Driving competitive advantage with real-time applications, enabled by SAP HANA on IBM Power Systems Business Challenge wanted to support its customers business process innovations with a robust, in-memory cloud solution, helping them to benefit from real-time insights and gain new competitive advantages.

More information

Data Integration for the Real-Time Enterprise

Data Integration for the Real-Time Enterprise Solutions Brief Data Integration for the Real-Time Enterprise Business Agility in a Constantly Changing World Executive Summary For companies to navigate turbulent business conditions and add value to

More information

#mstrworld. A Deep Dive Into Self-Service Data Discovery In MicroStrategy. Vijay Anand Gianthomas Tewksbury Volpe. #mstrworld

#mstrworld. A Deep Dive Into Self-Service Data Discovery In MicroStrategy. Vijay Anand Gianthomas Tewksbury Volpe. #mstrworld A Deep Dive Into Self-Service Data Discovery In MicroStrategy Vijay Anand Gianthomas Tewksbury Volpe Introducing MicroStrategy Analytics Agenda Introduction to MicroStrategy Analytics Platform Product

More information

Making Smarter Decisions for Data Discovery Solutions: Evaluating 3 Options

Making Smarter Decisions for Data Discovery Solutions: Evaluating 3 Options Making Smarter Decisions for Data Discovery Solutions: Evaluating 3 Options WHITE PAPER The business intelligence market is undergoing dramatic change. New technology, more demanding business requirements

More information

Cognitive Data Warehouse and Analytics

Cognitive Data Warehouse and Analytics Cognitive Data Warehouse and Analytics Hemant R. Suri, Sr. Offering Manager, Hybrid Data Warehouses, IBM (twitter @hemantrsuri or feel free to reach out to me via LinkedIN!) Over 90% of the world s data

More information

Architecting for Real- Time Big Data Analytics. Robert Winters

Architecting for Real- Time Big Data Analytics. Robert Winters Architecting for Real- Time Big Data Analytics Robert Winters About Me 2 ROBERT WINTERS Head of Business Intelligence, TravelBird Ten years experience in analytics, five years with Vertica and big data

More information

Cloud-Scale Data Platform

Cloud-Scale Data Platform Guide to Supporting On-Premise Spark Deployments with a Cloud-Scale Data Platform Apache Spark has become one of the most rapidly adopted open source platforms in history. Demand is predicted to grow at

More information

WELCOME TO. Cloud Data Services: The Art of the Possible

WELCOME TO. Cloud Data Services: The Art of the Possible WELCOME TO Cloud Data Services: The Art of the Possible Goals for Today Share the cloud-based data management and analytics technologies that are enabling rapid development of new mobile applications Discuss

More information

The Value of Oracle Database Appliance (ODA) for ISVs

The Value of Oracle Database Appliance (ODA) for ISVs Analysis from The Wikibon Project September 2014 The Value of Oracle Database Appliance (ODA) for ISVs David Floyer A Wikibon Reprint View the live research note on Wikibon. Executive Summary Wikibon has

More information

15-MINUTE GUIDE. Dell EMC Converged Infrastructure for SAP DELL EMC PERSPECTIVE

15-MINUTE GUIDE. Dell EMC Converged Infrastructure for SAP DELL EMC PERSPECTIVE 15-MINUTE GUIDE Dell EMC Converged Infrastructure for SAP DELL EMC PERSPECTIVE TABLE OF CONTENTS PREFACE...3 Transforming IT to run SAP in the digital business... 3 TRANSFORMATIONING THE DATA CENTER...4

More information

Four Elastic ipaas Requirements That Must Not Be Ignored A SNAPLOGIC WHITEPAPER

Four Elastic ipaas Requirements That Must Not Be Ignored A SNAPLOGIC WHITEPAPER Four Elastic ipaas Requirements That Must Not Be Ignored A SNAPLOGIC WHITEPAPER 2 Introduction 3 Table of Contents Elastic ipaas Requirement #1: Resiliency 4 Elastic ipaas Requirement #2: Fluidity in 5

More information

Building Your Big Data Team

Building Your Big Data Team Building Your Big Data Team With all the buzz around Big Data, many companies have decided they need some sort of Big Data initiative in place to stay current with modern data management requirements.

More information

Pentaho 8.0 Overview. Pedro Alves

Pentaho 8.0 Overview. Pedro Alves Pentaho 8.0 Overview Pedro Alves Safe Harbor Statement The forward-looking statements contained in this document represent an outline of our current intended product direction. It is provided for information

More information

Catalogic ECX : Data Center Modernization with In-Place Copy Data Management

Catalogic ECX : Data Center Modernization with In-Place Copy Data Management Catalogic ECX : Data Center Modernization with In-Place Copy Data Management Catalog. Automate. Transform. Catalogic ECX Highlights Automate the creation and use of copy data snapshots, clones, and replicas

More information

The Fastest, Easiest Way to Integrate Oracle Systems with Salesforce. Real-Time Integration, Not Data Duplication WHITEPAPER

The Fastest, Easiest Way to Integrate Oracle Systems with Salesforce. Real-Time Integration, Not Data Duplication WHITEPAPER The Fastest, Easiest Way to Integrate Oracle Systems with Salesforce Real-Time Integration, Not Data Duplication WHITEPAPER Salesforce may be called the Customer Success Platform, but success with this

More information

DATENBANK TRANSFORMATION WARUM, WOHIN UND WIE? 15. Mai 2018

DATENBANK TRANSFORMATION WARUM, WOHIN UND WIE? 15. Mai 2018 DATENBANK TRANSFORMATION WARUM, WOHIN UND WIE? 15. Mai 2018 PROVENTA AG Founded Branches Employees Turnover 1992 Telco, Banking, Insurance, Automotive, Energy >70 internal >200 freelancer pool 14 Mio.

More information

COST ADVANTAGES OF HADOOP ETL OFFLOAD WITH THE INTEL PROCESSOR- POWERED DELL CLOUDERA SYNCSORT SOLUTION

COST ADVANTAGES OF HADOOP ETL OFFLOAD WITH THE INTEL PROCESSOR- POWERED DELL CLOUDERA SYNCSORT SOLUTION link COST ADVANTAGES OF HADOOP ETL OFFLOAD WITH THE INTEL PROCESSOR- POWERED DELL CLOUDERA SYNCSORT SOLUTION Many companies are adopting Hadoop solutions to handle large amounts of data stored across clusters

More information

IBM PERFORMANCE Madrid Smarter Decisions. Better Results.

IBM PERFORMANCE Madrid Smarter Decisions. Better Results. IBM PERFORMANCE Madrid 2010 Smarter Decisions. Better Results. 1 IBM Business Analytics on SAP Solutions The Smarter Choice Session: Speaker: Advanced Analytics with IBM Cognos for SAP Customers, ERP Market

More information

Table of Contents. Are You Ready for Digital Transformation? page 04. Take Advantage of This Big Data Opportunity with Cisco and Hortonworks page 06

Table of Contents. Are You Ready for Digital Transformation? page 04. Take Advantage of This Big Data Opportunity with Cisco and Hortonworks page 06 Table of Contents 01 02 Are You Ready for Digital Transformation? page 04 Take Advantage of This Big Data Opportunity with Cisco and Hortonworks page 06 03 Get Open Access to Your Data and Help Ensure

More information

IBM PureData System for Analytics Overview

IBM PureData System for Analytics Overview IBM PureData System for Analytics Overview Chris Jackson Technical Sales Specialist chrisjackson@us.ibm.com Traditional Data Warehouses are just too complex They do NOT meet the demands of advanced analytics

More information

Enterprise PLM Solutions Advanced PLM Platform

Enterprise PLM Solutions Advanced PLM Platform Enterprise PLM Solutions Advanced PLM Platform The Aras Innovator Model-based SOA for Enterprise PLM Advantages of combining the Model-based Approach with a Service-Oriented Architecture Updated Edition

More information

Analytics in Action transforming the way we use and consume information

Analytics in Action transforming the way we use and consume information Analytics in Action transforming the way we use and consume information Big Data Ecosystem The Data Traditional Data BIG DATA Repositories MPP Appliances Internet Hadoop Data Streaming Big Data Ecosystem

More information

<Insert Picture Here> Oracle Exalogic Elastic Cloud: Revolutionizing the Datacenter

<Insert Picture Here> Oracle Exalogic Elastic Cloud: Revolutionizing the Datacenter Oracle Exalogic Elastic Cloud: Revolutionizing the Datacenter Mike Piech Senior Director, Product Marketing The following is intended to outline our general product direction. It

More information

Oracle BIEE Plus Complete Overview for Implementers. Naren Thota April, 2008

Oracle BIEE Plus Complete Overview for Implementers. Naren Thota April, 2008 Oracle BIEE Plus Complete Overview for Implementers Naren Thota April, 2008 Professional Background Oracle Application Development since 1996 Experienced in Implementations and Upgrades 10.6 SC thru 11.5.10.2

More information

NetApp Services Viewpoint. How to Design Storage Services Like a Service Provider

NetApp Services Viewpoint. How to Design Storage Services Like a Service Provider NetApp Services Viewpoint How to Design Storage Services Like a Service Provider The Challenge for IT Innovative technologies and IT delivery modes are creating a wave of data center transformation. To

More information

SAP BW/4HANA. Next Generation Data Warehouse. Simon Iglesias Analytics Solution Sales. Internal

SAP BW/4HANA. Next Generation Data Warehouse. Simon Iglesias Analytics Solution Sales. Internal SAP BW/4HANA Next Generation Data Warehouse Simon Iglesias Analytics Solution Sales Internal New Reality: A Data Tsunami Volume exponential data growth, insanely large amounts Velocity real-time, constant

More information

BUILT FOR THE SPEED OF BUSINESS

BUILT FOR THE SPEED OF BUSINESS BUILT FOR THE SPEED OF BUSINESS STEVE ILLINGWORTH Chief Technology Officer Pivotal, Asia Pacific Japan sillingworth@gopivotal.com 1 Digital Banking Business needs IT impacts Corporate culture Pivotal One

More information