A REVIEW ON HADOOP ARCHITECTURE FOR BIG DATA


International Journal of Research in Engineering, Technology and Science, Volume VI, Special Issue, July ISSN

Shaik Aleem Ur Rehaman (1), Raman Preet Kaur (2), Tanveer Baig Z (1), Saqib Rashid (1), Zahid Nazir Moon (1)
(1) Dept. of Electronics and Communication, HKBK College of Engineering, Bangalore, India
(2) Dept. of Computer Science, HKBK College of Engineering, Bangalore, India

ABSTRACT: This paper describes big data, its objectives, and how it is processed. It discusses the benefits of the Hadoop architecture, which serves as a core platform for structuring the big data that results from massive data creation across all possible sources. Hadoop uses a distributed computing system in which multiple servers built from relatively cheap hardware store large volumes of data. The paper also discusses how value can be created from big data.

Keywords: Big Data, Processors, Huge Data Storage, Hadoop

[1] INTRODUCTION

According to McKinsey, Big Data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze [11]. There is no explicit definition of how big a dataset should be in order to be considered Big Data. New technology has to be in place to manage this Big Data phenomenon. IDC defines Big Data technologies as a new generation of technologies and architectures designed to extract value economically from very large volumes of a wide variety of data by enabling high-velocity capture, discovery and analysis. According to O'Reilly, Big Data is data that exceeds the processing capacity of conventional database systems: the data is too big, moves too fast, or does not fit the structures of existing database architectures [1]. To gain value from this data, there must be an alternative way to process it.
Data volume is also growing exponentially due to the explosion of machine-generated data (data records, web-log files, sensor data) and growing human engagement with social networks. Analysis of data sets can find new correlations to spot business trends, prevent diseases, combat crime and so on [9].

Figure 1: A decade of digital universe growth: storage in exabytes

[2] OBJECTIVES OF BIG DATA

Like many new information technologies, big data can bring about dramatic cost reductions, substantial improvements in the time required to perform a computing task, or new product and service offerings. Like traditional analytics, it can also support internal business decisions. The technologies and concepts behind big data allow organizations to achieve a variety of objectives, but most of the organizations we interviewed were focused on one or two. The chosen objectives have implications not only for the outcome and financial benefits of big data, but also for the process: who leads the initiative, where it fits within the organization, and how the project is managed.

A) Cost Reduction from Big Data Technologies

Some organizations pursuing big data believe strongly that MIPS and terabyte storage for structured data are now most cheaply delivered through big data technologies such as Hadoop clusters. One company's cost comparison, for example, estimated that the cost of storing one terabyte for a year was $37,000 for a traditional relational database, $5,000 for a database appliance, and only $2,000 for a Hadoop cluster. Of course, these figures are not directly comparable, in that the more traditional technologies may be somewhat more reliable and more easily managed. Data security approaches, for example, are not yet fully developed in the Hadoop cluster environment. Organizations that were focused on cost reduction made the decision to adopt big data tools primarily within the IT organization, on largely technical and economic criteria. IT groups may want to involve some of their users and sponsors in debating the data management advantages and disadvantages of this kind of storage, but that is probably the limit of the discussion needed [3].

B) Time Reduction from Big Data

The second common objective of big data technologies and solutions is time reduction.
Macy's merchandise pricing optimization application provides a classic example of reducing the cycle time for complex, large-scale analytical calculations from hours or even days to minutes or seconds. The department store chain has been able to reduce the time needed to optimize pricing of its 73 million items for sale from over 27 hours to just over 1 hour. Described by some as big data analytics, this capability makes it possible for Macy's to re-price items much more frequently to adapt to changing conditions in the retail marketplace. This big data analytics application takes data out of a Hadoop cluster and puts it into other parallel computing and in-memory software architectures. Macy's also says it achieved 70% hardware cost reductions. Kerem Tomak, VP of Analytics at Macys.com, is using similar approaches to time reduction for marketing offers to Macy's customers (see the "Big Data at Macys.com" case study). He notes that the company can run a lot more models with these time savings.

[3] BIG DATA PROCESSING

Big-data projects have a number of different layers of abstraction, from abstraction of the data through to running analytics against the abstracted data. The following figure shows the basic elements of analytical big data and their interrelationships. The higher-level components help

make big data projects easier and more dynamic. Hadoop is often at the center of big-data projects, but it is not a precondition.

Fig 2: Analysis of Big Data Components

The components of analytical big data are given below:

- Hadoop packaging and support organizations such as Cloudera, whose distributions include MapReduce, essentially the compute layer of big data.
- A file system such as the Hadoop Distributed File System (HDFS), which manages the storage and retrieval of the data and metadata required for computation. Databases such as HBase can also be used.
- A higher-level language such as Pig (part of the Hadoop ecosystem), which can be used instead of Java to simplify the writing of computations.
- Hive, a data warehouse layer built on top of Hadoop.
- Cascading, a thin Java library that sits on top of Hadoop and allows suites of MapReduce jobs to be run and managed as a unit. It is widely used as a specialized tool.
- CR-X, a semi-automated modeling tool that allows models to be developed interactively at great speed and can help set up the database that will run the analytics.
- Greenplum or Netezza, specialized scale-out analytic databases that allow very fast loading and reloading of the data for the analytic models.
- ISV big data analytical packages such as ClickFox and Merced, which run against the database to help address business issues.

[4] HADOOP ARCHITECTURE

A) Apache Hadoop

Apache Hadoop is an open-source software framework for storage and large-scale processing of data sets on clusters of commodity hardware. Hadoop is an Apache top-level project being built and used by a global community of contributors and users. It is licensed under the Apache License 2.0. The Apache Hadoop framework is composed of the following modules:

- Hadoop Common: contains libraries and utilities needed by other Hadoop modules.
- Hadoop Distributed File System (HDFS): a distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster.
- Hadoop YARN: a resource-management platform responsible for managing compute resources in clusters and using them to schedule users' applications.
- Hadoop MapReduce: a programming model for large-scale data processing.

All the modules in Hadoop are designed with the fundamental assumption that hardware failures (of individual machines, or of racks of machines) are common and should therefore be handled automatically in software by the framework. Apache Hadoop's MapReduce and HDFS components were originally derived from Google's MapReduce and Google File System (GFS) papers, respectively.

For end users, although MapReduce Java code is common, any programming language can be used with "Hadoop Streaming" to implement the "map" and "reduce" parts of the user's program. Apache Pig and Apache Hive, among other related projects, expose higher-level user interfaces: Pig Latin and a SQL variant, respectively. The Hadoop framework itself is mostly written in the Java programming language, with some native code in C and command-line utilities written as shell scripts. Apache Hadoop is a registered trademark of the Apache Software Foundation.

B) Architecture of Hadoop

Hadoop consists of the Hadoop Common package, which provides file-system and OS-level abstractions, a MapReduce engine, and the Hadoop Distributed File System (HDFS). The Hadoop Common package contains the necessary Java Archive (JAR) files and scripts needed to start Hadoop.
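The "map" and "reduce" parts mentioned above for Hadoop Streaming can be illustrated with a small word-count sketch. This is only a local simulation under stated assumptions: the function names `mapper`, `reducer` and `run_local` are hypothetical, and the sort inside `run_local` stands in for the shuffle that the framework would perform between the two phases on a real cluster.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit one (word, 1) pair per word, as a streaming mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(sorted_pairs):
    """Reduce step: sum the counts per word; the input must be sorted by key."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Stand-in for the framework: map, sort by key (the shuffle), then reduce."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    sample = ["big data needs big storage", "hadoop stores big data"]
    for word, total in sorted(run_local(sample).items()):
        print(f"{word}\t{total}")
```

Because the mapper and reducer only consume and produce key/value pairs, the same two functions could in principle be wrapped in scripts that read standard input and write tab-separated lines, which is the contract a streaming job relies on.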
The Hadoop Common package also provides source code, documentation and a contribution section that includes projects from the Hadoop community.

For effective scheduling of work, every Hadoop-compatible file system should provide location awareness: the name of the rack (more precisely, of the network switch) where a worker node is located. Hadoop applications can use this information to run work on the node where the data is and, failing that, on the same rack or switch, which reduces backbone traffic. HDFS uses this information when replicating data to try to keep different copies of the data on different racks. The goal is to reduce the impact of a rack power outage or switch failure, so that even if these events occur, the data may still be readable.
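The locality preference described above (the node holding the data first, then the same rack, then anywhere) can be sketched as a small selection function. This is an illustrative model only, not Hadoop's scheduler: the `Node` type, the `pick_worker` function and the locality labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    rack: str


def pick_worker(replica_nodes, free_workers):
    """Prefer a free worker that holds the data, then one on the same rack, then any."""
    replica_names = {node.name for node in replica_nodes}
    replica_racks = {node.rack for node in replica_nodes}
    for worker in free_workers:
        if worker.name in replica_names:
            return worker, "node-local"
    for worker in free_workers:
        if worker.rack in replica_racks:
            return worker, "rack-local"
    if free_workers:
        # Data must cross the backbone: the case that rack awareness tries to avoid.
        return free_workers[0], "off-rack"
    return None, "no free worker"
```

For example, with copies of a block on `n1` and `n2` in `rackA`, a free worker `n3` in `rackA` would be chosen as rack-local ahead of a free worker in another rack, which keeps the read on a single switch.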

Fig 3: A multi-node Hadoop cluster

A small Hadoop cluster includes a single master and multiple worker nodes. The master node consists of a Job Tracker, Task Tracker, Name Node and Data Node. A slave or worker node acts as both a Data Node and a Task Tracker, though it is possible to have data-only worker nodes and compute-only worker nodes. These are normally used only in nonstandard applications. Hadoop requires Java Runtime Environment (JRE) 1.6 or higher. The standard start-up and shutdown scripts require Secure Shell (SSH) to be set up between the nodes in the cluster.

In a larger cluster, HDFS is managed through a dedicated Name Node server that hosts the file system index, and a secondary Name Node that can generate snapshots of the Name Node's memory structures, which helps prevent file-system corruption and reduces loss of data. Similarly, a standalone Job Tracker server can manage job scheduling. In clusters where the Hadoop MapReduce engine is deployed against an alternate file system, the Name Node, secondary Name Node and Data Node architecture of HDFS is replaced by the file-system-specific equivalent.

C) File System

Hadoop Distributed File System (HDFS)

HDFS is a distributed, scalable, and portable file system written in Java for the Hadoop framework. A Hadoop cluster typically has a single name node plus a cluster of data nodes, which together form the HDFS cluster; not every node in the cluster needs to run a data node. Each data node serves up blocks of data over the network using a block protocol specific to HDFS. The file system uses TCP/IP for communication. Clients use remote procedure calls (RPC) to communicate with each other.
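The division of labour described above, in which the name node hosts the file system index and the data nodes serve the blocks, can be sketched with a toy in-memory index. Everything here is a simplifying assumption made for illustration: the `ToyNameNode` class, the round-robin placement, the block identifiers and the tiny block size are hypothetical, and replication is deliberately left out.

```python
import math


class ToyNameNode:
    """Hypothetical in-memory index: file path -> block ids -> data node name."""

    def __init__(self, data_nodes, block_size):
        self.data_nodes = list(data_nodes)
        self.block_size = block_size
        self.files = {}      # path -> ordered list of block ids
        self.locations = {}  # block id -> data node holding that block
        self._cursor = 0     # round-robin position over the data nodes

    def add_file(self, path, size_bytes):
        """Split a file into fixed-size blocks and assign each block to a data node."""
        n_blocks = math.ceil(size_bytes / self.block_size)
        block_ids = []
        for i in range(n_blocks):
            block_id = f"{path}#blk{i}"
            node = self.data_nodes[self._cursor % len(self.data_nodes)]
            self.locations[block_id] = node
            self._cursor += 1
            block_ids.append(block_id)
        self.files[path] = block_ids
        return block_ids

    def locate(self, path):
        """Answer a client lookup: which data node serves each block of the file."""
        return [(block_id, self.locations[block_id]) for block_id in self.files[path]]
```

In this model a client would first call `locate` on the name node and then fetch each block directly from the data node it names, so the bulk data never passes through the index server.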

Fig 4: A simple big data technology environment

HDFS stores large files (typically in the range of gigabytes to terabytes) across multiple machines. It achieves reliability by replicating the data across multiple hosts, and hence theoretically does not require RAID storage on hosts (although some RAID configurations are still useful for increasing I/O performance). With the default replication value of 3, data is stored on three nodes: two on the same rack and one on a different rack. Data nodes can talk to each other to rebalance data, to move copies around, and to keep the replication of data high. HDFS is not fully POSIX-compliant, because the requirements for a POSIX file system differ from the target goals of a Hadoop application. The trade-off of not having a fully POSIX-compliant file system is increased data throughput and support for non-POSIX operations such as append.

HDFS includes a so-called secondary name node, a misleading name that can be misread as meaning that it takes over when the primary name node goes offline. In fact, the secondary name node regularly connects to the primary name node and builds snapshots of the primary name node's directory information, which the system then saves to local or remote directories. These checkpointed images can be used to restart a failed primary name node without having to replay the entire journal of file-system actions and then edit the log to create an up-to-date directory structure.

Because the name node is the single point for storage and management of metadata, it can become a bottleneck when supporting a huge number of files, especially a large number of small files. HDFS Federation, a newer addition, aims to tackle this problem to a certain extent by allowing multiple namespaces served by separate name nodes.
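The checkpointing role of the secondary name node described above can be sketched as follows. This is a minimal model under stated assumptions, not HDFS code: the namespace is a plain dictionary from path to size, the journal is a list of tuples, and the names `apply_edit`, `checkpoint` and `restart` are hypothetical.

```python
def apply_edit(namespace, edit):
    """Apply one journal entry to a dict-based namespace (path -> size in bytes)."""
    op, path, *args = edit
    if op == "create":
        namespace[path] = args[0]
    elif op == "delete":
        namespace.pop(path, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(image, journal):
    """Fold a run of journal entries into a new image without modifying the old one."""
    new_image = dict(image)
    for edit in journal:
        apply_edit(new_image, edit)
    return new_image


def restart(image, journal_since_checkpoint):
    """Rebuild the namespace from the last image plus only the edits made after it."""
    return checkpoint(image, journal_since_checkpoint)
```

With a four-entry journal and a checkpoint taken after the third entry, `restart` needs to replay only the one remaining edit and still arrives at the same namespace as a full replay from an empty image, which is the saving that the checkpointed images provide.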
[5] CREATING VALUE THROUGH BIG DATA

The McKinsey Global Institute conducted research on big data and pointed out five key areas where big data can create value: creating transparency, improving employee performance, segmenting populations to customize actions, improving decision making, and innovating new products, services and business models.

A) Creating transparency: If a company makes data available to authorized people in a timely manner, it creates transparency within the company. It is also important to make data available for inter-departmental use within the organization.

B) Employee performance improvement: Continuous improvement of employee performance is very important in an organization, and big data can be an important resource for it. Data centers record each employee's detailed work history, so if an employee is not doing a task

properly, it is easy to analyze the work history and find a solution that will improve performance.

C) Segmenting populations to customize actions: Customer segmentation is very important in marketing because it lets a company adopt the right business strategy for each customer. Big data enables a firm to collect detailed information about customers and their buying patterns. If, through analysis, a company offers precisely targeted products and services, customers will be happier.

D) Improving decision making: Big data enables a firm to collect detailed information about customers and competitors. By analyzing the whole data set, a firm can make better decisions than one that analyzes only sample information.

E) Innovating new products, services and business models: By using big data, a firm can offer new products and services to existing and new customers, because existing customers can provide excellent suggestions for new products and services. In addition, detailed customer history can be a good source of new business models.

[6] CONCLUSION AND FUTURE SCOPE

Data volume is growing exponentially due to the explosion of machine-generated data (data records, web-log files, sensor data) and growing human engagement with social networks. As data volumes grow exponentially, so does the concern over data preservation, access, dissemination, and usability. Many agencies have taken initiatives to research areas such as automated analysis techniques, data mining, machine learning, privacy, and database interoperability, and these will help to identify how big data can enable science in new ways and at new levels.
The growth of data constitutes the Big Data phenomenon, a technological phenomenon brought about by the rapid rate of data growth and by parallel advancements in technology that have given rise to an ecosystem of software and hardware products that enable users to analyze this data to produce new and more granular levels of insight.

REFERENCES

[1] Murnane, L. G. (April 9, 2012). Big Data: The Future Is Now.
[2] Data, Data Everywhere. The Economist. 25 February
[3] Big Data initiative to optimize geospatial intelligence. (10 April 2012).
[4] Denne, S. (April 6, 2012). Big Data Success Stories: Opera Solutions. The Wall Street Journal.
[5] Improving Pharmaceutical Research with Netezza Powered Analytics. (March 15, 2012).
[6] Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011). Big data: the next
[7] Putting real-time data to work and providing a platform for technology development. (December 15, 2010).
[8] Smith, D. (2011). 5 real-world uses of big data.
[9] Snijders, C., Matzat, U., Reips, U.-D. "Big Data": Big Gaps of Knowledge in the Field of Internet Science. International Journal of Internet Science 7:1-5.
[10] Big data brings big value. (February 29, 2012). IT Web Data Management. Retrieved April 13, 2012.

AUTHORS' BRIEF INTRODUCTION:

1. Shaik Aleem Ur Rehaman is currently pursuing his BE in electronics and communication engineering at HKBK College of Engineering, Bangalore. He has presented over 30 papers at various national and international conferences. His areas of interest include VLSI, automation and robotics.
2. Raman Preet Kaur is currently pursuing her BE in computer science engineering at HKBK College of Engineering, Bangalore. Her areas of interest are computer networking and Java programming.
3. Tanveer Baig Z is currently working as an assistant professor in the Dept. of ECE at HKBK College of Engineering, Bangalore. His area of specialization is telecommunication.
4. Saqib Rashid is currently pursuing his BE in electronics and communication engineering at HKBK College of Engineering, Bangalore. His areas of interest are embedded systems and robotics.
5. Zahid Nazir Moon is currently pursuing his BE in electronics and communication engineering at HKBK College of Engineering, Bangalore. His areas of interest are embedded systems and robotics.

5th Annual. Cloudera, Inc. All rights reserved.

5th Annual. Cloudera, Inc. All rights reserved. 5th Annual 1 The Essentials of Apache Hadoop The What, Why and How to Meet Agency Objectives Sarah Sproehnle, Vice President, Customer Success 2 Introduction 3 What is Apache Hadoop? Hadoop is a software

More information

ENABLING GLOBAL HADOOP WITH DELL EMC S ELASTIC CLOUD STORAGE (ECS)

ENABLING GLOBAL HADOOP WITH DELL EMC S ELASTIC CLOUD STORAGE (ECS) ENABLING GLOBAL HADOOP WITH DELL EMC S ELASTIC CLOUD STORAGE (ECS) Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how Dell EMC Elastic Cloud Storage (ECS ) can be used to streamline

More information

BIG DATA AND HADOOP DEVELOPER

BIG DATA AND HADOOP DEVELOPER BIG DATA AND HADOOP DEVELOPER Approximate Duration - 60 Hrs Classes + 30 hrs Lab work + 20 hrs Assessment = 110 Hrs + 50 hrs Project Total duration of course = 160 hrs Lesson 00 - Course Introduction 0.1

More information

Bringing the Power of SAS to Hadoop Title

Bringing the Power of SAS to Hadoop Title WHITE PAPER Bringing the Power of SAS to Hadoop Title Combine SAS World-Class Analytics With Hadoop s Low-Cost, Distributed Data Storage to Uncover Hidden Opportunities ii Contents Introduction... 1 What

More information

E-guide Hadoop Big Data Platforms Buyer s Guide part 1

E-guide Hadoop Big Data Platforms Buyer s Guide part 1 Hadoop Big Data Platforms Buyer s Guide part 1 Your expert guide to Hadoop big data platforms for managing big data David Loshin, Knowledge Integrity Inc. Companies of all sizes can use Hadoop, as vendors

More information

SAS and Hadoop Technology: Overview

SAS and Hadoop Technology: Overview SAS and Hadoop Technology: Overview SAS Documentation September 19, 2017 The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2015. SAS and Hadoop Technology: Overview.

More information

Spark, Hadoop, and Friends

Spark, Hadoop, and Friends Spark, Hadoop, and Friends (and the Zeppelin Notebook) Douglas Eadline Jan 4, 2017 NJIT Presenter Douglas Eadline deadline@basement-supercomputing.com @thedeadline HPC/Hadoop Consultant/Writer http://www.basement-supercomputing.com

More information

HADOOP ADMINISTRATION

HADOOP ADMINISTRATION HADOOP ADMINISTRATION PROSPECTUS HADOOP ADMINISTRATION UNIVERSITY OF SKILLS ABOUT ISM UNIV UNIVERSITY OF SKILLS ISM UNIV is established in 1994, past 21 years this premier institution has trained over

More information

BIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW

BIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW BIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW TOPICS COVERED 1 2 Fundamentals of Big Data Platforms Major Big Data Tools Scaling Up vs. Out SCALE UP (SMP) SCALE OUT (MPP) + (n) Upgrade

More information

LEVERAGING DATA ANALYTICS TO GAIN COMPETITIVE ADVANTAGE IN YOUR INDUSTRY

LEVERAGING DATA ANALYTICS TO GAIN COMPETITIVE ADVANTAGE IN YOUR INDUSTRY LEVERAGING DATA ANALYTICS TO GAIN COMPETITIVE ADVANTAGE IN YOUR INDUSTRY Unlock the value of your data with analytics solutions from Dell EMC ABSTRACT To unlock the value of their data, organizations around

More information

Analytics in the Cloud, Cross Functional Teams, and Apache Hadoop is not a Thing Ryan Packer, Bank of New Zealand

Analytics in the Cloud, Cross Functional Teams, and Apache Hadoop is not a Thing Ryan Packer, Bank of New Zealand Paper 2698-2018 Analytics in the Cloud, Cross Functional Teams, and Apache Hadoop is not a Thing Ryan Packer, Bank of New Zealand ABSTRACT Digital analytics is no longer just about tracking the number

More information

Redefine Big Data: EMC Data Lake in Action. Andrea Prosperi Systems Engineer

Redefine Big Data: EMC Data Lake in Action. Andrea Prosperi Systems Engineer Redefine Big Data: EMC Data Lake in Action Andrea Prosperi Systems Engineer 1 Agenda Data Analytics Today Big data Hadoop & HDFS Different types of analytics Data lakes EMC Solutions for Data Lakes 2 The

More information

Business is being transformed by three trends

Business is being transformed by three trends Business is being transformed by three trends Big Cloud Intelligence Stay ahead of the curve with Cortana Intelligence Suite Business apps People Custom apps Apps Sensors and devices Cortana Intelligence

More information

Top 5 Challenges for Hadoop MapReduce in the Enterprise. Whitepaper - May /9/11

Top 5 Challenges for Hadoop MapReduce in the Enterprise. Whitepaper - May /9/11 Top 5 Challenges for Hadoop MapReduce in the Enterprise Whitepaper - May 2011 http://platform.com/mapreduce 2 5/9/11 Table of Contents Introduction... 2 Current Market Conditions and Drivers. Customer

More information

Accelerating Your Big Data Analytics. Jeff Healey, Director Product Marketing, HPE Vertica

Accelerating Your Big Data Analytics. Jeff Healey, Director Product Marketing, HPE Vertica Accelerating Your Big Data Analytics Jeff Healey, Director Product Marketing, HPE Vertica Recent Waves of Disruption IT Infrastructu re for Analytics Data Warehouse Modernization Big Data/ Hadoop Cloud

More information

From Information to Insight: The Big Value of Big Data. Faire Ann Co Marketing Manager, Information Management Software, ASEAN

From Information to Insight: The Big Value of Big Data. Faire Ann Co Marketing Manager, Information Management Software, ASEAN From Information to Insight: The Big Value of Big Data Faire Ann Co Marketing Manager, Information Management Software, ASEAN The World is Changing and Becoming More INSTRUMENTED INTERCONNECTED INTELLIGENT

More information

Sr. Sergio Rodríguez de Guzmán CTO PUE

Sr. Sergio Rodríguez de Guzmán CTO PUE PRODUCT LATEST NEWS Sr. Sergio Rodríguez de Guzmán CTO PUE www.pue.es Hadoop & Why Cloudera Sergio Rodríguez Systems Engineer sergio@pue.es 3 Industry-Leading Consulting and Training PUE is the first Spanish

More information

1. Intoduction to Hadoop

1. Intoduction to Hadoop 1. Intoduction to Hadoop Hadoop is a rapidly evolving ecosystem of components for implementing the Google MapReduce algorithms in a scalable fashion on commodity hardware. Hadoop enables users to store

More information

KnowledgeENTERPRISE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK. Advanced Analytics on Spark BROCHURE

KnowledgeENTERPRISE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK. Advanced Analytics on Spark BROCHURE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK Are you drowning in Big Data? Do you lack access to your data? Are you having a hard time managing Big Data processing requirements?

More information

Big Data. By Michael Covert. April 2012

Big Data. By Michael Covert. April 2012 Big By Michael Covert April 2012 April 18, 2012 Proprietary and Confidential 2 What is Big why are we discussing it? A brief history of High Performance Computing Parallel processing Algorithms The No

More information

Hadoop Integration Deep Dive

Hadoop Integration Deep Dive Hadoop Integration Deep Dive Piyush Chaudhary Spectrum Scale BD&A Architect 1 Agenda Analytics Market overview Spectrum Scale Analytics strategy Spectrum Scale Hadoop Integration A tale of two connectors

More information

SAS & HADOOP ANALYTICS ON BIG DATA

SAS & HADOOP ANALYTICS ON BIG DATA SAS & HADOOP ANALYTICS ON BIG DATA WHY HADOOP? OPEN SOURCE MASSIVE SCALE FAST PROCESSING COMMODITY COMPUTING DATA REDUNDANCY DISTRIBUTED WHY HADOOP? Hadoop will soon become a replacement complement to:

More information

Why Big Data Matters? Speaker: Paras Doshi

Why Big Data Matters? Speaker: Paras Doshi Why Big Data Matters? Speaker: Paras Doshi If you re wondering about what is Big Data and why does it matter to you and your organization, then come to this talk and get introduced to Big Data and learn

More information

Nouvelle Génération de l infrastructure Data Warehouse et d Analyses

Nouvelle Génération de l infrastructure Data Warehouse et d Analyses Nouvelle Génération de l infrastructure Data Warehouse et d Analyses November 2011 André Münger andre.muenger@emc.com +41 79 708 85 99 1 Agenda BIG Data Challenges Greenplum Overview Use Cases Summary

More information

Aurélie Pericchi SSP APS Laurent Marzouk Data Insight & Cloud Architect

Aurélie Pericchi SSP APS Laurent Marzouk Data Insight & Cloud Architect Aurélie Pericchi SSP APS Laurent Marzouk Data Insight & Cloud Architect 2005 Concert de Coldplay 2014 Concert de Coldplay 90% of the world s data has been created over the last two years alone 1 1. Source

More information

Achieving Agility and Flexibility in Big Data Analytics with the Urika -GX Agile Analytics Platform

Achieving Agility and Flexibility in Big Data Analytics with the Urika -GX Agile Analytics Platform Achieving Agility and Flexibility in Big Data Analytics with the Urika -GX Agile Analytics Platform Analytics R&D and Product Management Document Version 1 WP-Urika-GX-Big-Data-Analytics-0217 www.cray.com

More information

Big Data The Big Story

Big Data The Big Story Big Data The Big Story Jean-Pierre Dijcks Big Data Product Mangement 1 Agenda What is Big Data? Architecting Big Data Building Big Data Solutions Oracle Big Data Appliance and Big Data Connectors Customer

More information

KnowledgeSTUDIO. Advanced Modeling for Better Decisions. Data Preparation, Data Profiling and Exploration

KnowledgeSTUDIO. Advanced Modeling for Better Decisions. Data Preparation, Data Profiling and Exploration KnowledgeSTUDIO Advanced Modeling for Better Decisions Companies that compete with analytics are looking for advanced analytical technologies that accelerate decision making and identify opportunities

More information

Big Data: Essential Elements to a Successful Modernization Strategy

Big Data: Essential Elements to a Successful Modernization Strategy Big Data: Essential Elements to a Successful Modernization Strategy Ashish Verma Director, Deloitte Consulting Technology Information Management Deloitte Consulting Presented by #pbls14 #pbls14 Presented

More information

Five Questions to Ask Before Choosing a Hadoop Distribution

Five Questions to Ask Before Choosing a Hadoop Distribution Five Questions to Ask Before Choosing a Hadoop Distribution SPONSORED BY CONTENTS Introduction 1 1. What does it take to make Hadoop enterprise-ready? 1 2. Does the distribution offer scalability, reliability,

More information

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

ORACLE DATA INTEGRATOR ENTERPRISE EDITION ORACLE DATA INTEGRATOR ENTERPRISE EDITION Oracle Data Integrator Enterprise Edition delivers high-performance data movement and transformation among enterprise platforms with its open and integrated E-LT

More information

The Intersection of Big Data and DB2

The Intersection of Big Data and DB2 The Intersection of Big Data and DB2 May 20, 2014 Mike McCarthy, IBM Big Data Channels Development mmccart1@us.ibm.com Agenda What is Big Data? Concepts Characteristics What is Hadoop Relational vs Hadoop

More information

Big Data in Cloud. 堵俊平 Apache Hadoop Committer Staff Engineer, VMware

Big Data in Cloud. 堵俊平 Apache Hadoop Committer Staff Engineer, VMware Big Data in Cloud 堵俊平 Apache Hadoop Committer Staff Engineer, VMware Bio 堵俊平 (Junping Du) - Join VMware in 2008 for cloud product first - Initiate earliest effort on big data within VMware since 2010 -

More information
