Big Data and Real Time Analytics Streams and Hadoop
|
|
- Colleen Harper
- 6 years ago
- Views:
Transcription
1 Big Data and Real Time Analytics Streams and Hadoop Infrastructure Matters 2014 Briefing 2014 IBM Corporation
2 Big Data is more than just Hadoop What can you tell me about Big Data? I want to know all about Hadoop. Big Data is a lot more than Hadoop! And our competitors don t understand this they cannot deliver value on an entire set of Big Data use cases. Service Oriented Finance CMO IBM Big Data and Real Time Analytics 2
3 IBM Big Data Solutions All Data New/Enhanced Applications Information ingestion and operational information zone DB2 BLU Real-time analytics zone InfoSphere Streams Exploration, landing and archive zone BigInsights and Hadoop Information governance zone Enterprise warehouse data mart and analytic appliances zone DB2 BLU and PureData System for Analytics InfoSphere Server and DataStage Cognos Why did it happen? Reporting, analysis, content analytics InfoSphere Discovery What is happening? Discovery and exploration Cognitive Fabric SPSS What could happen? Predictive analytics and modeling What action should I take? Decision management Systems Security Storage On premise, Cloud, As a service Analyze all data, from any source, with the right technology IBM Big Data & Analytics Infrastructure Big Data and Real Time Analytics 3
4 There are two main types of Big Data Real-time Analytics Zone Stream Computing Landing and Analytics Zone Hadoop System Data in motion Data typically not stored Tremendous velocity Ultra low latency required Multiple data sources Huge volumes of unstructured data Data at rest Data stored on disk Huge volumes of unstructured data No pre-defined schemas Too large for traditional tools to process in a timely manner Our competitors do not address both of these! Big Data and Real Time Analytics 4
5 New programming models and low cost hardware solve Big Data problems Streaming Application Streaming Cluster Streaming and Apache Hadoop applications Proven frameworks to process large amounts of data Streaming for data in motion, Hadoop for data at rest Enable applications to transparently work with large clusters of nodes in parallel Hadoop Cluster Clusters of low cost POWER8 servers are ideal for Hadoop and streaming applications Big Data and Real Time Analytics 5
6 Service Oriented Finance wants to gain a competitive advantage from Big Data This application will give our market managers a real advantage! Service Oriented Finance wants to deploy a stock trading application with the following requirements Process millions of trades per second Application must scale Constant flow of input data Microsecond latency Unstructured trade data input Sophisticated analytics logic Service Oriented Finance Market Manager Big Data and Real Time Analytics 6
7 InfoSphere Streams is a platform for Real-time Analytics on Big Data in motion Just in time decisions InfoSphere Streams can meet these requirements! Powerful Analytics Millions of events per second Microsecond Latency Streams is a platform for real-time analytics on Big Data Our competitors do not have this capability Sensor, video, audio, text and relational data sources Big Data and Real Time Analytics 7
8 Streams Makes it easy for data in motion programming Developer Role Eclipse based tools Visual application monitoring Built in accelerators Administrator Role Visual application monitoring Stream data visualization Start/stop jobs Business User Role Visual application monitoring Stream data visualization InfoSphere Streams Console Big Data and Real Time Analytics 8
9 Streams programming is drag and drop simple Source Adapters Operator Repository Sink Adapters Application composition (Optimized Compilation) Big Data and Real Time Analytics 9
10 Streams Studio provides a rich set of Eclipse based tools Drag and drop simple Big Data and Real Time Analytics 10
11 Streams provides more throughput than Apache Storm in analysis benchmark Throughput ( s/s) Throughput - Four Nodes Streams 2.6x-12.3x More Throughput Storm x4 (100%) x8 (100%) x8 (200%) x8 (400%) Parallelism (dataset) Based on IBM internal tests comparing InfoSphere Streams against Apache Storm. Results may not be typical and will vary based on actual workload, configuration, applications and other variables in a production environment. Users of this document should verify the applicable data for their specific environment. Contact IBM and see what we can do for you. Big Data and Real Time Analytics 11
12 InfoSphere BigInsights includes Apache Hadoop to process data at rest Hadoop Cluster Processing Input Storage MapReduce Java Program Result Comprised of a cluster of inexpensive hardware Nodes have processors, memory and disks Special file system Hadoop Distributed File System (HDFS) Special programming model MapReduce Big Data and Real Time Analytics 12
13 The Hadoop Distributed File System (HDFS) distributes data across a Hadoop cluster inputfile.txt B1 B2 B3 B = block R = replica Hadoop Distributed File System B1 R3 B2 R1 R3 R2 B3 R2 R1 Node 1 Node 2 Node 3 Node n A distributed file system that spans all the nodes in a Hadoop cluster Files are split automatically at load time into blocks and spread among Data Nodes System assumes nodes will fail Achieves reliability by replicating data across multiple nodes Elastically scalable Big Data and Real Time Analytics 13
14 The MapReduce framework sends programs out to the data MapReduce Job Map and Reduce Tasks Map and Reduce Tasks Map and Reduce Tasks Map and Reduce Tasks HDFS HDFS HDFS HDFS Node 1 Node 2 Node 3 Node n MapReduce job is sent out to each node Map and Reduce tasks run in parallel across nodes Hadoop framework does a lot of the heavy lifting e.g., moving data between map and reduce tasks Big Data and Real Time Analytics 14
15 BigInsights makes it easy for all Big Data roles Developer Role Eclipse based tooling Read/write access to HDFS Extensive views of jobs and workflows in system Application staging, launch and scheduling center Many built in accelerators InfoSphere BigInsights Console Administrator Role Complete management of cluster Monitor/start/stop components Add/remove nodes Portal style dashboards Business User Role No Java required Spreadsheet tooling Visualization Big Data and Real Time Analytics 15
16 Service Oriented Finance wants to analyze customer complaints We need to know what our customers are complaining about. We can help you do that with sentiment analysis using BigInsights Service Oriented Finance CMO IBM Big Data and Real Time Analytics 16
17 Sentiment Analysis - A Big Data challenge but also a Big Data opportunity Trying to determine Product demand Feelings - Attitudes Emotions - Opinions Thoughts - Desires New product acceptance Competitive threats Threats to brand reputations Huge volumes of unstructured data Advertisement targets Finding sentiment from social media data Big Data and Real Time Analytics 17
18 DEMO: Using BigInsights to Analyze negative sentiment on Twitter I don t trust the web site for on-line banking The service reps are very nice and helpful I love the check guard feature! The ATM fees are ridiculous! Data Source Twitter Topic Service Oriented Finance Likes Love the check guard feature Like the on-line bill pay feature Like that the ATMs are located all over the city Like the service representatives Dislikes Don t trust the on-line banking feature Don t like to wait in line for a long time Don t like the ATM fees Hate the overdraft fees Big Data and Real Time Analytics 18
19 Architecture matters when you design a micro processor for emerging big workloads It s not about the number of transistors, it is what you do with them to handle Big Workloads POWER7 to POWER8 1.2 Billion Transistors 567 mm 45nm to 22nm 4.2 Billion Transistors 650 mm 2 Westmere EX to Ivy Bridge EX 2.6 Billion Transistors 513 mm 32nm to 22nm 4.3 Billion Transistors 541 mm 2 CAPI: Coherent Accelerator Processor Interface POWER8 vs. Ivy Bridge EX - 96 threads/socket vs. 30-4x Memory Bandwidth - 3x on-die Cache - Cache latency reduced by 50% - 5x I/O Bandwidth - 15 metal layers vs. 9 - edram vs SRAM POWER8 Unique Technology - CAPI Technology - Integrated PCIe - Transactional Memory - L4 Cache - Dynamic Overclocking Big Data and Real Time Analytics 19
20 Is Ivy Bridge a new breakthrough architecture? The Ivy Bridge CPU micro-architecture is a shrink from Sandy Bridge Ivy Bridge is a tick in the tick/tock Intel release cycle compared to Sandy Bridge A tick means same architecture as previous with some minor improvements The major improvement is the 22 nm Tri-gate transistor technology from the prior 32 nm technology More transistors More cores, sockets 20% more L3 cache Does the 22nm technology result in better performance? nm = nanometer which is a measure of the CMOS semiconductor device fabrication process. Smaller nm process allows for more transistors Source for Tick Tock Model: Big Data and Real Time Analytics 20
21 Intel s performance per Core is not increasing over previous generation RPE2 per Core Socket HP Servers Sandy Bridge EP 2.9 GHz 16 cores Ivy Bridge EP 2.7 GHz 24 cores Ivy Bridge EX 2.8 GHz 30 cores 0 The number shown is best in each category (sockets and number of cores) Source of RPE2: (Gartner) RPE2 numbers are derived from the following six benchmark inputs: SAP SD Two-Tier, TPC-C, TPC-H, SPECjbb2006 and two SPEC CPU2006 components The data in this tool is derived from RPE2 from Ideas International. Ideas International was acquired by Gartner, Inc. in Gartner, Inc. and/or its affiliates. All rights reserved. Big Data and Real Time Analytics 21
22 The new POWER8 scale-out servers innovation to put data to work POWER8 roll-out is leading with scale-out (1 and 2 Socket) systems Expanded Linux focus: Ubuntu, KVM, and OpenStack OpenPOWER Innovations 1 & 2 Socket Power Systems S812L S822L S822 S814 S824L S824 1-socket, 2U Up to 12 cores 512 GB Memory 6 PCIe Gen 3 Linux only PowerVM & PowerKVM August 29, socket, 2U Up to 24 cores 1 TB memory 9 PCIe Gen3 Linux only PowerVM & PowerKVM June 10, socket, 2U Up to 20 cores 1 TB memory 9 PCIe Gen 3 AIX & Linux PowerVM June 10, socket, 4U Up to 8 cores 512 GB memory 7 PCIe Gen 3 AIX, IBM i, Linux PowerVM June 10, socket, 4U Up to 24 cores Linux only NVIDIA GPU 2H socket, 4U Up to 24 cores 1 TB memory 11 PCIe Gen 3 AIX, IBM i, Linux PowerVM June 10, 2014 Scale-out Big Data and Real Time Analytics 22
23 Power S822L servers are priced competitively to Intel Ivy Bridge servers Comparable TCA Linux on Intel Ivy Bridge + KVM vs. Linux on POWER8 + KVM Server list price* -3-year warranty, on-site Dell PowerEdge R720 HP ProLiant DL380 G8 IBM Power S822L $21,300 $22,763 $22,382 $12,605 $14,068 $14,895 Virtualization - 2 sockets, 3 yr. 9x5 sub./supp. $2,998 KVM for Red Hat on x86 (RHEV) $ 2,998 KVM for Red Hat on x86 (RHEV) $2,998 KVM for Linux on Power (PowerKVM) Linux OS list price - RHEL, 2 sockets, unlimited guests, 9x5, 3 yr. sub./ supp. $5,697 Red Hat subscription and Red Hat support $5,697 Red Hat subscription and Red Hat support $4,489 Red Hat subscription and IBM support Total list price: (Total cost of acquisition) $21,300 $22,763 $22,382 Server model Dell R720 HP Proliant DL380p G8 IBM Power 822L Processor / cores Two 2.7 GHz, E5-2697, Ivy Bridge, 12-core processors Two 3.4 GHz POWER8, 10-core Configuration 64 GB memory, 2 x 300GB 15k HDD, 10 Gb two port Same memory, HDD, NIC * Based on US pricing for Power S822L announcing on April 28, 2014 matching configuration table above. Source: hp.com, dell.com, vmware.com Big Data and Real Time Analytics 23
24 POWER architecture is ideal for Hadoop and Streams workloads Mutli-Threading High I/O Performance Processor and Cache Java Optimization Twice the thread capability to use to run parallel Java workloads More threads enables faster BLU Acceleration Single- Instruction, Multiple Data (SIMD) vector processing More threads enable faster Hadoop workload processing High memory and I/O bandwidth enables faster Cognos ad hoc queries and better SPSS scoring throughput High I/O performance enables Cognos faster ad hoc queries and better SPSS scoring throughput Higher speed that significantly outperforms competitive x86 Hadoop configurations Large processor cache per core, large numbers of cores and high DRAM capacity on a single server benefits Cognos Cubes performance Extensive Soft-Error Recovery, self-healing for solid faults, and alternate processor recovery is ideal for Hadoop-based workloads IBM s Java Virtual Machine is specifically optimized for the POWER architecture to deliver optimal performance of big data and analytics solutions Big Data and Real Time Analytics 24
25 BigInsights on POWER beats the competition with TeraSort benchmark GB/core/min Normalized Performance GB / Core / Min 1.7X Competitor 0.73 Competitor 0.75 Competitor POWER7+ with BigInsights POWER8 with BigInsights 0 Xeon E5-2697v2 8 nodes 192 cores 1 TB Xeon E5-2630v2 24 nodes 288 cores 10 TB Xeon E nodes 256 cores 10 TB POWER7+ 7R2 10 nodes 120 cores 1 TB POWER8 S822L 8 nodes 192 cores 10 TB Big Data and Real Time Analytics 25
26 Big SQL V3.0: Bringing SQL on Hadoop to the next level Massively parallel SQL engine on Hadoop Architected from the ground up for low latency and high throughput Comprehensive SQL support The same SQL you use on your data warehouse should run with few or no modifications Full support for sub-queries All standard join operations Stored procedures / User defined functions Supports all modern file formats Big SQL Engine SQL MPP Run-time HDFS CSV Seq Parquet RC Avro ORC SQL Based Application JSON Custom Big Data and Real Time Analytics 26
27 Big SQL on POWER8 delivers Big Data queries faster than Hive on Ivy Bridge TCP-DS inspired workload queries BigInsights v3.0 on 8 S822L data nodes Time to complete queries in 7 concurrent streams 11.2x Faster S822L, 3.3 GHz 24 cores 256 GB Memory RHEL hours 4.7x 40 min Lower Cost CPO Master POWER Competitive Proof Points v2.9 27
28 Hadoop implementation pain points Building clusters is very complex, time consuming and can become costly Difficult to plan, architect and build out a cluster of nodes and network components Difficult to manage the cluster when deployed Inability to share resources results in cluster sprawl x86 deployments are not flexible enough Built with a fixed processor to disk ratio When you add more machines you get both processors and hard drives Some Hadoop workloads require more of one or the other but not always both at the same time HDFS NameNode is a single point of failure HDFS is not POSIX compliant and wastes storage Requires use of non standard commands and utilities Copies of data may need to be stored twice Once in Linux file system and once in HDFS Hadoop workload management is very basic Does not utilize cluster resources to their maximum Does not support multi-tenancy Can lead to silos of under utilized clusters Big Data and Real Time Analytics 28
29 Introducing the IBM Solution for Hadoop Power Systems Edition Best-in-class hardware IBM POWER Systems IBM InfoSphere BigInsights or Open-source Hadoop IBM DCS3700 Storage IBM Platform Symphony IBM Platform Cluster Manager Best-in-class software IBM BigInsights IBM Platform Computing IBM Platform Symphony IBM Platform Cluster Manager IBM Elastic Storage Formerly GPFS Distributed File System IBM Elastic Storage, HDFS Linux Operating Environment RHEL SUSE IBM Power Systems IBM Power 7+, Power8 Big Data and Real Time Analytics 29
30 Platform Symphony provides enterprise class Hadoop cluster management Replaces the open source MapReduce scheduler of Hadoop Existing Hadoop applications run unchanged Better performance and utilization in cluster overall and within individual jobs Scheduling is more efficient More efficient data transfer mechanisms for shuffle phase of MapReduce Uses generic slots rather than fixed map and reduce slots which waste resources Big Data and Real Time Analytics 30
31 Platform Symphony provides enterprise class Hadoop cluster management cont. Multi-tenancy is built in system is designed for concurrency Reduces costs through better resource sharing and utilization Provides dynamic priority based concurrency Interactive workloads like BigSheets can be given higher priority (.e.g., during the day) Priorities can be adjusted in real time Preemptive scheduling policies ensure service levels are met while also allowing the cluster to be fully utilized Jobs entering the system are immediately given the priority they are guaranteed Applications can decide if they allow other applications to use their unused capacity GUI provided to configure, manage and monitor policies and workload Allows running multiple versions of Hadoop Helps control costs and provides flexibility during upgrades Big Data and Real Time Analytics 31
32 Platform cluster manager removes the complexity of standing up clusters Provides comprehensive set of tools to install, provision and configure cluster management environment GUI, automated installation scripts and guidance Provides flexible provisioning options Bare-metal (i.e., no virtualization) KVM (virtualized cluster environment) Provides GUI for end users to provision, deploy, manage and monitor clusters Improves administrator and/or user productivity Deploy clusters easier and faster Saves on administration costs Lowers risk of deploying Hadoop clusters Big Data and Real Time Analytics 32
33 IBM Elastic Storage(GPFS) brings enterprise class file system capabilities to Hadoop No single point of failure like HDFS NameNode Metadata is distributed GPFS is POSIX compliant Any Linux command/utility can be run against data, not just Hadoop commands Easier to use and manage Makes archival and backup easier Other workload types can access data in cluster Saves storage space and administrator work Avoids moving data around to process inside and outside of HDFS Avoids having duplicate data inside and outside of HDFS Native OS Applications (POSIX) Allows variable block sizes to exist on same cluster to meet the needs of different applications Big Data and Real Time Analytics 33
34 IBM Solution for Hadoop Power Systems Edition key architecture components PowerLinux Data Node IBM PowerLinux 7R2 (P7+ based, P8 coming in 4Q2014) 2 sockets Power GHz CPU Data: 29 x 900Gb SAS HDDs, JBOD I/O Exp OS: 1 x 300Gb SAS HDD 128 GB DDR3 RDIMMs PowerLinux Management Node (JobTracker, NameNode, Console) IBM PowerLinux 7R2 (P7+ based, P8 coming in 4Q2014) 2 sockets Power GHz CPU OS: 6 x 600GB SAS HDD, mirrored 128GB DDR3 RDIMMs DCS3700 IBM DCS3700 Storage f/c c A maximum of 60 x 3.5" HDD, 4 TB each Standard controller (3000) validated 1GbE, 10GbE Switches 1GbE: IBM RackSwitch G GbE RJ45 ports, four 10 GbE SFP+ ports Low 130 W power rating and variable speed fans to reduce power consumption 10GbE: IBM RackSwitch G8264 Optimized for applications requiring high bandwidth and low latency Up to 64 1 Gb/10 Gb SFP+ ports, four 40 Gb QSFP+ports, 1.28 Tbps non-blocking throughput Big Data and Real Time Analytics 34
35 Learn more about Big Data Analytics Solutions on Power IBM Analytics Solutions for Power Systems The PowerLinux Community (DeveloperWorks) Big Data and Real Time Analytics 35
IBM Big Data Summit 2012
IBM Big Data Summit 2012 12.10.2012 InfoSphere BigInsights Introduction Wilfried Hoge Leading Technical Sales Professional hoge@de.ibm.com twitter.com/wilfriedhoge 12.10.1012 IBM Big Data Strategy: Move
More informationCognitive Solutions in the Context of IBM Systems Cognitive Analytics / Integration Scenarios / Use Cases
Cognitive Solutions in the Context of IBM Systems Cognitive Analytics / Integration Scenarios / Use Cases Mano Srinivasan Open Source Solutions Architect manojs@sg.ibm.com Topics and Questions to be addressed
More informationETL on Hadoop What is Required
ETL on Hadoop What is Required Keith Kohl Director, Product Management October 2012 Syncsort Copyright 2012, Syncsort Incorporated Agenda Who is Syncsort Extract, Transform, Load (ETL) Overview and conventional
More informationWhat s Happening to the Mainframe? Mobile? Social? Cloud? Big Data?
Glenn Anderson, IBM Lab Services and Training What s Happening to the Mainframe? Mobile? Social? Cloud? Big Data? Winter SHARE March 2014 Session 15126 Today s mainframe is a hybrid system InfoSphere Streams
More informationActualTests.C Q&A C Foundations of IBM Big Data & Analytics Architecture V1
ActualTests.C2030-136.40Q&A Number: C2030-136 Passing Score: 800 Time Limit: 120 min File Version: 4.8 http://www.gratisexam.com/ C2030-136 Foundations of IBM Big Data & Analytics Architecture V1 Hello,
More informationENABLING GLOBAL HADOOP WITH DELL EMC S ELASTIC CLOUD STORAGE (ECS)
ENABLING GLOBAL HADOOP WITH DELL EMC S ELASTIC CLOUD STORAGE (ECS) Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how Dell EMC Elastic Cloud Storage (ECS ) can be used to streamline
More informationBringing the Power of SAS to Hadoop Title
WHITE PAPER Bringing the Power of SAS to Hadoop Title Combine SAS World-Class Analytics With Hadoop s Low-Cost, Distributed Data Storage to Uncover Hidden Opportunities ii Contents Introduction... 1 What
More informationMicrosoft FastTrack For Azure Service Level Description
ef Microsoft FastTrack For Azure Service Level Description 2017 Microsoft. All rights reserved. 1 Contents Microsoft FastTrack for Azure... 3 Eligible Solutions... 3 FastTrack for Azure Process Overview...
More informationAccelerating Your Big Data Analytics. Jeff Healey, Director Product Marketing, HPE Vertica
Accelerating Your Big Data Analytics Jeff Healey, Director Product Marketing, HPE Vertica Recent Waves of Disruption IT Infrastructu re for Analytics Data Warehouse Modernization Big Data/ Hadoop Cloud
More informationFrom Information to Insight: The Big Value of Big Data. Faire Ann Co Marketing Manager, Information Management Software, ASEAN
From Information to Insight: The Big Value of Big Data Faire Ann Co Marketing Manager, Information Management Software, ASEAN The World is Changing and Becoming More INSTRUMENTED INTERCONNECTED INTELLIGENT
More informationECLIPSE 2012 Performance Benchmark and Profiling. August 2012
ECLIPSE 2012 Performance Benchmark and Profiling August 2012 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox Compute resource
More informationThe Intersection of Big Data and DB2
The Intersection of Big Data and DB2 May 20, 2014 Mike McCarthy, IBM Big Data Channels Development mmccart1@us.ibm.com Agenda What is Big Data? Concepts Characteristics What is Hadoop Relational vs Hadoop
More informationMapR Pentaho Business Solutions
MapR Pentaho Business Solutions The Benefits of a Converged Platform to Big Data Integration Tom Scurlock Director, WW Alliances and Partners, MapR Key Takeaways 1. We focus on business values and business
More informationBig Data The Big Story
Big Data The Big Story Jean-Pierre Dijcks Big Data Product Mangement 1 Agenda What is Big Data? Architecting Big Data Building Big Data Solutions Oracle Big Data Appliance and Big Data Connectors Customer
More informationMicrosoft Azure Essentials
Microsoft Azure Essentials Azure Essentials Track Summary Data Analytics Explore the Data Analytics services in Azure to help you analyze both structured and unstructured data. Azure can help with large,
More informationGoing beyond today: Extending the platform for cloud, mobile and analytics
Going beyond today: Extending the platform for cloud, mobile and analytics 2 Analytics System ETL ETL Data Mart Analytics System Data Mart Data Mart One move leads to another and another and another ETL
More informationKnowledgeENTERPRISE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK. Advanced Analytics on Spark BROCHURE
FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK Are you drowning in Big Data? Do you lack access to your data? Are you having a hard time managing Big Data processing requirements?
More informationOracle Big Data Cloud Service
Oracle Big Data Cloud Service Delivering Hadoop, Spark and Data Science with Oracle Security and Cloud Simplicity Oracle Big Data Cloud Service is an automated service that provides a highpowered environment
More informationBig Data Live selbst analysieren
Big Data Live selbst analysieren Hands on Workshop zu IBM InfoSphere Big Insights Harald Gröger Wilfried Hoge Gerhard Wenzel IBM 2013 IBM Corporation Agenda 15:00-15:10 Einführung IBM Big Data Plattform
More informationMapR: Converged Data Pla3orm and Quick Start Solu;ons. Robin Fong Regional Director South East Asia
MapR: Converged Data Pla3orm and Quick Start Solu;ons Robin Fong Regional Director South East Asia Who is MapR? MapR is the creator of the top ranked Hadoop NoSQL SQL-on-Hadoop Real Database time streaming
More informationHP SummerSchool TechTalks Kenneth Donau Presale Technical Consulting, HP SW
HP SummerSchool TechTalks 2013 Kenneth Donau Presale Technical Consulting, HP SW Copyright Copyright 2013 2013 Hewlett-Packard Development Development Company, Company, L.P. The L.P. information The information
More informationDatametica DAMA. The Modern Data Platform Enterprise Data Hub Implementations. What is happening with Hadoop Why is workload moving to Cloud
DAMA Datametica The Modern Data Platform Enterprise Data Hub Implementations What is happening with Hadoop Why is workload moving to Cloud 1 The Modern Data Platform The Enterprise Data Hub What do we
More informationInfoSphere Warehouse. Flexible. Reliable. Simple. IBM Software Group
IBM Software Group Flexible Reliable InfoSphere Warehouse Simple Ser Yean Tan Regional Technical Sales Manager Information Management Software IBM Software Group ASEAN 2007 IBM Corporation Business Intelligence
More informationInfoSphere Warehousing 9.5
IBM Software Group Optimised InfoSphere Warehousing 9.5 Flexible Simple Phil Downey InfoSphere Warehouse Technical Marketing 2007 IBM Corporation Information On Demand End-to-End Capabilities Optimization
More informationScalability and High Performance with MicroStrategy 10
Scalability and High Performance with MicroStrategy 10 Enterprise Analytics and Mobility at Scale. Copyright Information All Contents Copyright 2017 MicroStrategy Incorporated. All Rights Reserved. Trademark
More informationIBM xseries 430. Versatile, scalable workload management. Provides unmatched flexibility with an Intel architecture and open systems foundation
Versatile, scalable workload management IBM xseries 430 With Intel technology at its core and support for multiple applications across multiple operating systems, the xseries 430 enables customers to run
More informationORACLE BIG DATA APPLIANCE
ORACLE BIG DATA APPLIANCE BIG DATA FOR THE ENTERPRISE KEY FEATURES Massively scalable infrastructure to store and manage big data Big Data Connectors delivers unprecedented load rates between Big Data
More informationBusiness is being transformed by three trends
Business is being transformed by three trends Big Cloud Intelligence Stay ahead of the curve with Cortana Intelligence Suite Business apps People Custom apps Apps Sensors and devices Cortana Intelligence
More informationAchieving Agility and Flexibility in Big Data Analytics with the Urika -GX Agile Analytics Platform
Achieving Agility and Flexibility in Big Data Analytics with the Urika -GX Agile Analytics Platform Analytics R&D and Product Management Document Version 1 WP-Urika-GX-Big-Data-Analytics-0217 www.cray.com
More informationResearch Report. The Major Difference Between IBM s LinuxONE and x86 Linux Servers
Research Report The Major Difference Between IBM s LinuxONE and x86 Linux Servers Executive Summary The most important point in this Research Report is this: mainframes process certain Linux workloads
More informationExploring the Benefits of the Modernized Data Warehouse Philip Russom
Exploring the Benefits of the Modernized Data Warehouse Philip Russom TDWI Research Director for Data Management October 8, 2014 Sponsor 2 Speakers Philip Russom TDWI Research Director, Data Management
More informationAurélie Pericchi SSP APS Laurent Marzouk Data Insight & Cloud Architect
Aurélie Pericchi SSP APS Laurent Marzouk Data Insight & Cloud Architect 2005 Concert de Coldplay 2014 Concert de Coldplay 90% of the world s data has been created over the last two years alone 1 1. Source
More information2017 IBM Elastic Storage Server - Update
IBM Elastic Storage Server - Update Falk Steinbrueck Program Manager, IBM Systems Storage Disclaimer IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without
More informationOracle's Cloud Strategie für den Geschäftserfolg Alles Neue von der OOW
Oracle's Cloud Strategie für den Geschäftserfolg Alles Neue von der OOW Matthias Weiss Direktor Mittelstand Technologie Oracle Deutschland B.V. & Co. KG Agenda 1 2 3 4 5 6 Digital Transformation and CEO
More informationOracle Big Data Discovery The Visual Face of Big Data
Oracle Big Data Discovery The Visual Face of Big Data Today's Big Data challenge is not how to store it, but how to make sense of it. Oracle Big Data Discovery is a fundamentally new approach to making
More informationSr. Sergio Rodríguez de Guzmán CTO PUE
PRODUCT LATEST NEWS Sr. Sergio Rodríguez de Guzmán CTO PUE www.pue.es Hadoop & Why Cloudera Sergio Rodríguez Systems Engineer sergio@pue.es 3 Industry-Leading Consulting and Training PUE is the first Spanish
More informationDavid Taylor
Sept 10, 2013 What s New! IBM Cognos Business Intelligence 10.2.1.1 (released Sept 10, 2013) Analytic Catalyst TM1 10.2 Cognos Insight David Taylor david.taylor@us.ibm.com Agenda Overview of innovations
More informationDELL EMC POWEREDGE 14G SERVER PORTFOLIO
DELL EMC POWEREDGE 14G SERVER PORTFOLIO Transformation without compromise Seize your share of a huge market opportunity and accelerate your business by combining sales of the exciting new Dell EMC PowerEdge
More informationAdobe Deploys Hadoop as a Service on VMware vsphere
Adobe Deploys Hadoop as a Service A TECHNICAL CASE STUDY APRIL 2015 Table of Contents A Technical Case Study.... 3 Background... 3 Why Virtualize Hadoop on vsphere?.... 3 The Adobe Marketing Cloud and
More informationIntroduction to IBM Technical Computing Clouds IBM Redbooks Solution Guide
Introduction to IBM Technical Computing Clouds IBM Redbooks Solution Guide The IT Industry has tried to maintain a balance between more demands from business to deliver services against cost considerations
More informationThe IBM Reference Architecture for Healthcare and Life Sciences
The IBM Reference Architecture for Healthcare and Life Sciences Janis Landry-Lane IBM Systems Group World Wide Program Director janisll@us.ibm.com Doing It Right SYMPOSIUM March 23-24, 2017 Big Data Symposium
More informationE-guide Hadoop Big Data Platforms Buyer s Guide part 3
Big Data Platforms Buyer s Guide part 3 Your expert guide to big platforms enterprise MapReduce cloud-based Abie Reifer, DecisionWorx The Amazon Elastic MapReduce Web service offers a managed framework
More informationOracle Autonomous Data Warehouse Cloud
Oracle Autonomous Data Warehouse Cloud 1 Lower Cost, Increase Reliability and Performance to Extract More Value from Your Data With Oracle Autonomous Database Cloud Service for Data Warehouse Today s leading-edge
More informationTop 5 Challenges for Hadoop MapReduce in the Enterprise. Whitepaper - May /9/11
Top 5 Challenges for Hadoop MapReduce in the Enterprise Whitepaper - May 2011 http://platform.com/mapreduce 2 5/9/11 Table of Contents Introduction... 2 Current Market Conditions and Drivers. Customer
More informationLeveraging Oracle Big Data Discovery to Master CERN s Data. Manuel Martín Márquez Oracle Business Analytics Innovation 12 October- Stockholm, Sweden
Leveraging Oracle Big Data Discovery to Master CERN s Data Manuel Martín Márquez Oracle Business Analytics Innovation 12 October- Stockholm, Sweden Manuel Martin Marquez Intel IoT Ignition Lab Cloud and
More informationMike Strickland, Director, Data Center Solution Architect Intel Programmable Solutions Group July 2017
Mike Strickland, Director, Data Center Solution Architect Intel Programmable Solutions Group July 2017 Accelerate Big Data Analytics with Intel Frameworks and Libraries with FPGA s 1. Intel Big Data Analytics
More informationDatasheet FUJITSU Integrated System PRIMEFLEX for Hadoop
Datasheet FUJITSU Integrated System PRIMEFLEX for Hadoop FUJITSU Integrated System PRIMEFLEX for Hadoop is a powerful and scalable platform analyzing big data volumes at high velocity FUJITSU Integrated
More informationHadoop and Analytics at CERN IT CERN IT-DB
Hadoop and Analytics at CERN IT CERN IT-DB 1 Hadoop Use cases Parallel processing of large amounts of data Perform analytics on a large scale Dealing with complex data: structured, semi-structured, unstructured
More informationManaging Data Warehouse Growth in the New Era of Big Data
Managing Data Warehouse Growth in the New Era of Big Data Colin White President, BI Research December 5, 2012 Sponsor 2 Speakers Colin White President, BI Research Vineet Goel Product Manager, IBM InfoSphere
More informationDisclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme
VIRT1400BU Real-World Customer Architecture for Big Data on VMware vsphere Joe Bruneau, General Mills Justin Murray, Technical Marketing, VMware #VMworld #VIRT1400BU Disclaimer This presentation may contain
More informationAnalytics in the Cloud, Cross Functional Teams, and Apache Hadoop is not a Thing Ryan Packer, Bank of New Zealand
Paper 2698-2018 Analytics in the Cloud, Cross Functional Teams, and Apache Hadoop is not a Thing Ryan Packer, Bank of New Zealand ABSTRACT Digital analytics is no longer just about tracking the number
More informationWelcome to. enterprise-class big data and financial a. Putting big data and advanced analytics to work in financial services.
Welcome to enterprise-class big data and financial a Putting big data and advanced analytics to work in financial services. MapR-FSI Martin Darling We reinvented the data platform for next-gen intelligent
More informationData Analytics and CERN IT Hadoop Service. CERN openlab Technical Workshop CERN, December 2016 Luca Canali, IT-DB
Data Analytics and CERN IT Hadoop Service CERN openlab Technical Workshop CERN, December 2016 Luca Canali, IT-DB 1 Data Analytics at Scale The Challenge When you cannot fit your workload in a desktop Data
More informationOracle Autonomous Data Warehouse Cloud
Oracle Autonomous Data Warehouse Cloud 1 Lower Cost, Increase Reliability and Performance to Extract More Value from Your Data With Oracle Autonomous Data Warehouse Cloud Today s leading-edge organizations
More informationIntegrating MATLAB Analytics into Enterprise Applications
Integrating MATLAB Analytics into Enterprise Applications David Willingham 2015 The MathWorks, Inc. 1 Run this link. http://bit.ly/matlabapp 2 Key Takeaways 1. What is Enterprise Integration 2. What is
More informationE-guide Hadoop Big Data Platforms Buyer s Guide part 1
Hadoop Big Data Platforms Buyer s Guide part 1 Your expert guide to Hadoop big data platforms for managing big data David Loshin, Knowledge Integrity Inc. Companies of all sizes can use Hadoop, as vendors
More informationORACLE DATA INTEGRATOR ENTERPRISE EDITION
ORACLE DATA INTEGRATOR ENTERPRISE EDITION Oracle Data Integrator Enterprise Edition delivers high-performance data movement and transformation among enterprise platforms with its open and integrated E-LT
More informationSAS & HADOOP ANALYTICS ON BIG DATA
SAS & HADOOP ANALYTICS ON BIG DATA WHY HADOOP? OPEN SOURCE MASSIVE SCALE FAST PROCESSING COMMODITY COMPUTING DATA REDUNDANCY DISTRIBUTED WHY HADOOP? Hadoop will soon become a replacement complement to:
More informationHadoop Integration Deep Dive
Hadoop Integration Deep Dive Piyush Chaudhary Spectrum Scale BD&A Architect 1 Agenda Analytics Market overview Spectrum Scale Analytics strategy Spectrum Scale Hadoop Integration A tale of two connectors
More informationRODOD Performance Test on Exalogic and Exadata Engineered Systems
An Oracle White Paper March 2014 RODOD Performance Test on Exalogic and Exadata Engineered Systems Introduction Oracle Communications Rapid Offer Design and Order Delivery (RODOD) is an innovative, fully
More informationBig Data at the Speed of Business IBM Innovations for a new era! Rob Thomas Vice President, Big Data Sales IBM Software Group, Information Management
Big Data at the Speed of Business IBM Innovations for a new era! Rob Thomas Vice President, Big Data Sales IBM Software Group, Information Management Agenda for today 1 IBM s viewpoint on on big big data
More informationBuilding a Multi-Tenant Infrastructure for Diverse Application Workloads
Building a Multi-Tenant Infrastructure for Diverse Application Workloads Rick Janowski Marketing Manager IBM Platform Computing 1 The Why and What of Multi-Tenancy 2 Parallelizable problems demand fresh
More informationHP Cloud Maps for rapid provisioning of infrastructure and applications
Technical white paper HP Cloud Maps for rapid provisioning of infrastructure and applications Table of contents Executive summary 2 Introduction 2 What is an HP Cloud Map? 3 HP Cloud Map components 3 Enabling
More informationData Center Operating System (DCOS) IBM Platform Solutions
April 2015 Data Center Operating System (DCOS) IBM Platform Solutions Agenda Market Context DCOS Definitions IBM Platform Overview DCOS Adoption in IBM Spark on EGO EGO-Mesos Integration 2 Market Context
More information2015 IBM Corporation
2015 IBM Corporation Marco Garibaldi IBM Pre-Sales Technical Support Prestazioni estreme, accelerazione applicativa,velocità ed efficienza per generare valore dai dati 2015 IBM Corporation Trend nelle
More informationIBM z13 Technical Innovation
IBM z13 Technical Innovation Introducing the IBM z13 The mainframe optimized for the digital era 10% Up to 40% Up to 10 TB 2 SODs Up to 141 IBM z13 (z13) Machine Type: 2964 Models: N30, N63, N96, NC9,
More informationGabriel India Ltd builds a single platform for group-wide ERP with IBM and SAP
Gabriel India Ltd builds a single platform for group-wide ERP with IBM and SAP Gabriel India Ltd is the flagship company of the Anand Group, which is among India s leading automotive component companies.
More informationInsights to HDInsight
Insights to HDInsight Why Hadoop in the Cloud? No hardware costs Unlimited Scale Pay for What You Need Deployed in minutes Azure HDInsight Big Data made easy Enterprise Ready Easier and more productive
More informationSizing SAP Central Process Scheduling 8.0 by Redwood
Sizing SAP Central Process Scheduling 8.0 by Redwood Released for SAP Customers and Partners January 2012 Copyright 2012 SAP AG. All rights reserved. No part of this publication may be reproduced or transmitted
More informationFlashStack For Oracle RAC
FlashStack For Oracle RAC May 2, 2017 Mayur Dewaikar, Product Management, Pure Storage John McAbel, Product Management, Cisco What s Top of Mind for Oracle Teams? 1 2 3 4 Consolidation Scale & Speed Simplification
More informationIntegrated Social and Enterprise Data = Enhanced Analytics
ORACLE WHITE PAPER, DECEMBER 2013 THE VALUE OF SOCIAL DATA Integrated Social and Enterprise Data = Enhanced Analytics #SocData CONTENTS Executive Summary 3 The Value of Enterprise-Specific Social Data
More informationApplication Integrator Automate Any Application
Application Integrator Automate Any Application BMC Control-M by applications BMC Control-M by platforms ERP Business Intelligence Data Integration / ETL OS Platform SAP Oracle ebusiness Suite PeopleSoft
More informationMicrosoft Big Data. Solution Brief
Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,
More informationOracle Platform as a Service and Infrastructure as a Service Public Cloud Service Descriptions-Metered & Non-Metered.
Oracle Platform as a Service and Infrastructure as a Service Public Cloud Service Descriptions-Metered & Non-Metered November 20, 2017 Contents GLOSSARY PUBLIC CLOUD SERVICES-NON-METERED... 9 API Call...
More informationOracle Exalytics X6-4
ORACLE DATA SHEET Oracle Exalytics X6-4 Oracle Exalytics In-Memory Machine X6-4 is the industry s first engineered system for in-memory analytics that delivers extreme performance for Business Intelligence
More informationHybrid Data Management
Kelly Schlamb Executive IT Specialist, Worldwide Analytics Platform Enablement and Technical Sales (kschlamb@ca.ibm.com, @KSchlamb) Hybrid Data Management IBM Analytics Summit 2017 November 8, 2017 5 Essential
More informationCompiere ERP Starter Kit. Prepared by Tenth Planet
Compiere ERP Starter Kit Prepared by Tenth Planet info@tenthplanet.in www.tenthplanet.in 1. Compiere ERP - an Overview...3 1. Core ERP Modules... 4 2. Available on Amazon Cloud... 4 3. Multi-server Support...
More informationREQUEST FOR PROPOSAL FOR
REQUEST FOR PROPOSAL FOR HIGH PERFORMANCE COMPUTING (HPC) SOLUTION Ref. No. PHY/ALK/43 (27/11/2012) by DEPARTMENT OF PHYSICS UNIVERSITY OF PUNE PUNE - 411 007 INDIA NOVEMBER 27, 2012 1 Purpose of this
More informationContinuityPatrol. An intelligent Service Availability Management (isam) Suite VISIBILITY I ACCOUNTABILITY I ORCHESTRATION I AUTOMATION
An intelligent Service Availability Management (isam) Suite VISIBILITY I ACCOUNTABILITY I ORCHESTRATION I AUTOMATION Overview Continuity Patrol enables Real-Time Enterprise Visibility for Intelligent Business
More informationCask Data Application Platform (CDAP)
Cask Data Application Platform (CDAP) CDAP is an open source, Apache 2.0 licensed, distributed, application framework for delivering Hadoop solutions. It integrates and abstracts the underlying Hadoop
More informationOPEN MODERN DATA ARCHITECTURE FOR FINANCIAL SERVICES RISK MANAGEMENT
WHITEPAPER OPEN MODERN DATA ARCHITECTURE FOR FINANCIAL SERVICES RISK MANAGEMENT A top-tier global bank s end-of-day risk analysis jobs didn t complete in time for the next start of trading day. To solve
More informationA Cloud Migration Checklist
A Cloud Migration Checklist WHITE PAPER A Cloud Migration Checklist» 2 Migrating Workloads to Public Cloud According to a recent JP Morgan survey of more than 200 CIOs of large enterprises, 16.2% of workloads
More informationCommon Customer Use Cases in FSI
Common Customer Use Cases in FSI 1 Marketing Optimization 2014 2014 MapR MapR Technologies Technologies 2 Fortune 100 Financial Services Company 104M CARD MEMBERS 3 Financial Services: Recommendation Engine
More informationORACLE EXALYTICS IN-MEMORY MACHINE T5-8 DATA SHEET
ORACLE EXALYTICS IN-MEMORY MACHINE T5-8 DATA SHEET SPEED OF THOUGHT ANALYTICS KEY FEATURES 128 CPU cores 4 TB of RAM 6.4 TB PCIe Flash 9.6 TB of raw disk capacity Four 40 Gbps InfiniBand ports Four 8 Gbps
More informationModernizing Data Integration
Modernizing Data Integration To Accommodate New Big Data and New Business Requirements Philip Russom Research Director for Data Management, TDWI December 16, 2015 Sponsor Speakers Philip Russom TDWI Research
More informationWhite paper A Reference Model for High Performance Data Analytics(HPDA) using an HPC infrastructure
White paper A Reference Model for High Performance Data Analytics(HPDA) using an HPC infrastructure Discover how to reshape an existing HPC infrastructure to run High Performance Data Analytics (HPDA)
More informationManaging Data in Motion with the Connected Data Architecture
Managing in Motion with the Connected Architecture Dmitry Baev Director, Solutions Engineering Doing It Right SYMPOSIUM March 23-24, 2017 1 Hortonworks Inc. 2011 2016. All Rights Reserved 4 th Big & Business
More informationTCA / TCO Advantages of using POWER8
IBM Power Systems IBM STG Lab Services Executive Consulting IT Optimization Consulting TCA / TCO Advantages of using POWER8 Competitive Analysis DB2 BLU Lightning May 20, 2014 Availability Optimization
More informationEnabling a Large Analytic Userbase on Hadoop & Co
Enabling a Large Analytic Userbase on Hadoop & Co Justin Coffey, j.coffey@criteo.com, @jqcoffey GOTO Amsterdam 2015 What we ll be taking (lots) about I ll be presenting an overview of our analytics stack
More informationIBM Tivoli Workload Automation View, Control and Automate Composite Workloads
Tivoli Workload Automation View, Control and Automate Composite Workloads Mark A. Edwards Market Manager Tivoli Workload Automation Corporation Tivoli Workload Automation is used by customers to deliver
More informationSurPaaS Analyzer. Cut your application assessment. Visualize Your Cloud Options. Time by a factor of 10x and Cost by 75% Unique Features
SurPaaS Analyzer Visualize Your Cloud Options DATASHEET Cut your application assessment Time by a factor of 10x and Cost by 75% SurPaaS Analyzer is a Cloud smart decision-enabling tool that analyzes your
More informationOptimizing resource efficiency in Microsoft Azure
Microsoft IT Showcase Optimizing resource efficiency in Microsoft Azure By July 2017, Core Services Engineering (CSE, formerly Microsoft IT) plans to have 90 percent of our computing resources hosted in
More informationIBM Tivoli Monitoring
Monitor and manage critical resources and metrics across disparate platforms from a single console IBM Tivoli Monitoring Highlights Proactively monitor critical components Help reduce total IT operational
More information1. Intoduction to Hadoop
1. Intoduction to Hadoop Hadoop is a rapidly evolving ecosystem of components for implementing the Google MapReduce algorithms in a scalable fashion on commodity hardware. Hadoop enables users to store
More informationNew Ways to Leverage Open Source
#techsummitch New Ways to Leverage Open Source NEW DEMANDS NEW OPPORTUNITIES Open Source Projects Have Exploded + SUSE: What it Means to be Open Open Source Community Customers & Partners Committed to
More informationPentaho 8.0 and Beyond. Matt Howard Pentaho Sr. Director of Product Management, Hitachi Vantara
Pentaho 8.0 and Beyond Matt Howard Pentaho Sr. Director of Product Management, Hitachi Vantara Safe Harbor Statement The forward-looking statements contained in this document represent an outline of our
More informationIntegrated Service Management
Integrated Service Management for Power servers As the world gets smarter, demands on the infrastructure will grow Smart traffic systems Smart Intelligent food oil field technologies systems Smart water
More informationThe Journey to Cognos Analytics. Paul Rivera, Eric Smith IBM Analytics Lab Services
The Journey to Cognos Analytics Paul Rivera, Eric Smith IBM Analytics Lab Services Agenda What s Changed What Cognos Analytics Requires Lifecycle of a Typical Migration Upgrade to Cloud What about How
More informationBuilding a Scalable Architecture for Big Data
Building a Scalable Architecture for Big Data Presenter: Adrian D cruz Senior Enterprise Architect, Financial Services Industry Presales Consulting Organization- Malaysia Objectives 2 Part A HP s View
More informationIBM Platform LSF & PCM-AE Dynamische Anpassung von HPC Clustern für Sondernutzung und Forschungskollaborationen
IBM Platform LSF & PCM-AE Dynamische Anpassung von HPC Clustern für Sondernutzung und Forschungskollaborationen ZKI Meeting 2012 - Düsseldorf Heiko Lehmann Account Manager Heiko.lehmann@de.ibm.com Bernhard
More information