Research on the Framework and Data Fusion of an Energy Big-data Platform


1 Paper Number: 17PESGM2652. Panel: Big data for Integrated Energy Systems. Research on the Framework and Data Fusion of an Energy Big-data Platform. Gengfeng Li, Zhaohong Bie, Jiang Wu, Cheng Li, Xi'an Jiaotong University. 21 July 2017

2 Content: 01 Integrated Energy System; 02 Framework of an Energy Big-data Platform; 03 Multi-source Heterogeneous Data Fusion; 04 Conclusions and Future Work

3 Integrated Energy System (Energy Internet)

4 01 Integrated Energy System
New challenges: sustainable development, energy crisis, environmental degradation, global climate change.
Revolution of energy production and consumption:
1st Industrial Revolution (end of the 18th century): appearance of the steam engine
2nd Industrial Revolution (start of the 20th century): wide use of electricity
3rd Industrial Revolution (1970s): nuclear power, computers
4th Industrial Revolution (now): cyber-physical systems
Integrated Energy System: this new concept attracts a lot of attention. It refers to highly integrated and interdependent energy and cyber systems. There is no consensus on an exact definition.

5 01 Integrated Energy System
Integrated Energy System - seen from the power system field: a new generation of energy system with electricity at its core.
Power system at the core; maximum integration of renewable energy; centralized and distributed resources; integrated energy system considering demand response.

6 01 Integrated Energy System
Integrated Energy System - seen from the cyber system field: to innovate the current energy system based on the benefits of the cyber system, including openness, free flow of energy, and peer access.

7 01 Integrated Energy System
Integrated Energy System - seen from the industry field: Global Integrated Energy System. State Grid aims to build a Global Integrated Energy System, based on an ultra-high-voltage electric transmission system, for sharing and utilizing renewable energy.

8 01 Integrated Energy System
Integrated Energy System - seen from the finance field: business model of the Integrated Energy System (cloud platform). Based on new technology and development trends; creates revenue for customers and service providers; innovation in energy supply.

9 01 Integrated Energy System
The Integrated Energy System at the center of five fields: electricity, automation, industry, finance, and cyber.

10 01 Integrated Energy System
2016 National Key Research and Development Program of China: Research on the basic theory of planning, operation and trading for Integrated Energy Systems (2016YFB ) (Principal Investigator)

11 01 Integrated Energy System
[Figure: traditional power system. Generation (power plant, wind farm, PV power station), transmission (towers, transmission line, substations), distribution, and utilization (resident users, factory, residential building, business building).]

12 01 Integrated Energy System: Architecture
[Figure: architecture of the integrated energy system. It couples the electric power network (wind farm, PV power station, power plant, gas power plants, substations, towers, transmission lines) with hydrogen production and storage, natural gas pipelines with pressurizers, a heating station with heat storage, an electrified traffic system with EVs and charging piles, and information exchange through the Internet and cloud computing equipment. Highlighted part: Source.]
1. A variety of primary energy sources, including wind, solar, gas and coal; 2. Spatial and temporal distribution analysis of different resources; 3. Intermittency of renewable energy.

13 01 Integrated Energy System: Architecture
[Figure: the same architecture diagram as on slide 12. Highlighted parts: Source, Network.]
1. Coupled natural gas and electric power networks; 2. Large-scale flows on different time scales: electricity and gas flow; 3. Big data from both natural gas and electric power systems.

14 01 Integrated Energy System: Architecture
[Figure: the same architecture diagram as on slide 12. Highlighted parts: Source, Network, Demand.]
1. Distributed sources: CHP, DG, etc.; 2. Coupling with transportation systems via electric vehicles; 3. Multi-energy demand and conversion: cooling, heating, gas, electricity.

15 01 Integrated Energy System
[Figure: research topics for Integrated Energy Systems.]
Modeling of Integrated Energy Systems: mixed dynamic system modeling; unified interface system modeling; stochastic modeling of compatibility; stochastic system simulation and evaluation.
Planning and trading mechanism: coordinated planning; integrated market; distributed game theory analysis.
Optimal operation and dispatch: stochastic optimization; integrated risk analysis; distributed optimization algorithm.
Comprehensive demonstration platform of Integrated Energy Systems: planning platform; trading simulation platform; operation simulation platform; big data center.

16 Framework of an Energy Big-data Platform

17 02 Framework of an Energy Big-data Platform
Data for demonstration: Jiuquan wind power base; hydropower base of the Yellow River; electric power network of Shaanxi; high energy-consuming enterprises; electric power network of Northwest China; energy base of northern Shaanxi; distributed multiple energy system; energy system of Xi'an.

18 02 Framework of an Energy Big-data Platform
Features of the Energy Big-data Platform: collection of multi-type energy data; fast query of multi-type energy data; data processing and computing. The Big-data Platform serves the Planning Platform, the Trading Platform, and the Operation Platform.

19 02 Framework of an Energy Big-data Platform
(1) Collection of multi-type energy data
1) Real-time data collection: real-time data flow collection based on Kafka. Data from sources such as wind farms, PV stations, and H2 production plants pass through a Kafka cluster to the Hadoop cluster and the real-time monitoring database.

20 02 Framework of an Energy Big-data Platform
(1) Collection of multi-type energy data
2) Storage. Challenge: big data volume (TB level). Solution: Hadoop Distributed File System (HDFS).
3) Data fusion. Challenge: various data sources with heterogeneous characteristics. Solution: multi-source heterogeneous data fusion. Raw data pass through semantic heterogeneous data fusion and system heterogeneous data fusion before entering the Big-data Platform.

21 02 Framework of an Energy Big-data Platform
(2) Fast query of multi-type energy data
Challenge: rapid indexing of big data. Solution: a distributed, scalable big data store, Apache HBase (the Hadoop database).
HBase is supported by HDFS (basic storage), MapReduce (computing power), ZooKeeper (coordination services and failover), and Hive (high-level language support).
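Fast lookups in a store such as HBase depend heavily on row-key design, because rows are kept sorted by key and adjacent keys can be scanned together. The sketch below imitates that idea in plain Python with a sorted list and `bisect`. It does not use the HBase API; the key layout (`station|timestamp`), the station names, and the values are hypothetical.

```python
import bisect


class SortedKeyStore:
    """Toy in-memory stand-in for a store whose rows are sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> value

    def put(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan(self, start, stop):
        """Return (key, value) pairs with start <= key < stop."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


def row_key(station, timestamp):
    # Hypothetical layout: station id first, so one station's readings are adjacent.
    return f"{station}|{timestamp}"


store = SortedKeyStore()
store.put(row_key("wind_farm_01", "2017-07-21T10:00"), 41.5)
store.put(row_key("pv_station_07", "2017-07-21T10:00"), 12.3)
store.put(row_key("wind_farm_01", "2017-07-21T10:15"), 39.8)
store.put(row_key("wind_farm_01", "2017-07-21T11:00"), 44.1)

# All wind_farm_01 readings from 10:00 up to (not including) 11:00.
hits = store.scan(row_key("wind_farm_01", "2017-07-21T10:00"),
                  row_key("wind_farm_01", "2017-07-21T11:00"))
```

Putting the station identifier first keeps one station's time series contiguous, so a time-range query touches only a narrow slice of the key space.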

22 02 Framework of an Energy Big-data Platform
(3) Data processing and computing
Mining: load forecasting, state assessment, power quality monitoring.
Optimization: planning optimization (Planning Platform), operation optimization (Operation Platform), market optimization (Trading Platform).
Visualization: report form, summary graph, user interface.

23 02 Framework of an Energy Big-data Platform
Framework of the Big-data Platform: theoretical architecture; technology architecture; physical architecture; core technology and function.

24 02 Framework of an Energy Big-data Platform
(1) Theoretical Architecture
Data sources: power system data, natural gas system data, heating system data, meteorological data, financial market data. They connect to the Big-data Platform over a communication network.
Big-data Platform: data storage, rapid index, offline computing, online computing, optimization, resource management.
Through an interactive interface the platform connects to the Planning Platform, the Operation Platform, and the Trading Platform.

25 02 Framework of an Energy Big-data Platform
(2) Technology Architecture
User Application Layer (application): surveillance system, offline computing, online computing, optimization.
User Gateway Layer (gateway): Spark client, Hive client, Hadoop client, Storm client, Universal Optimization Platform client.
Base Platform Layer (base platform): Spark, Hive, Kafka, Storm, MapReduce, YARN, HDFS, Standalone, Universal Optimization Platform; resource management; distributed collaborative framework (ZooKeeper).
Infrastructure Layer (infrastructure): hardware resources (Server #1 to Server #n); operating environment (Red Hat Linux 6.2).
Data simulation: wind power, PV, building users, factory users.

26 02 Framework of an Energy Big-data Platform
(3) Physical Architecture
The layers of the Big-data Platform are deployed on physical nodes, which are connected through a LAN and provide physical support for the applications.

27 02 Framework of an Energy Big-data Platform
(4) Core Technology and Function
Data processing and computing: surveillance system; offline computing (random characteristic analysis; Spark MLlib, MapReduce); real-time computing (real-time analysis; Spark Streaming, Storm); optimization (distributed optimization; Universal Optimization Platform).
Resource and job management: YARN. Data storage: HDFS, Hive. Data collection: Kafka. Other labels: Standalone. Distributed collaborative framework: ZooKeeper.
Data simulation: wind power, PV, building users, factory users.
Hardware resources: Server #1, Server #2, ..., Server #n.

28 Multi-source Heterogeneous Data Fusion

29 03 Multi-source Heterogeneous Data Fusion
Energy big data: the four Vs.
Volume: the large amount of data is a big challenge.
Velocity: refers to the speed requirement for processing data.
Variety: the increasing complexity of data types.
Value: data itself is meaningless unless valuable knowledge is extracted from it.

30 03 Multi-source Heterogeneous Data Fusion
Heterogeneous data:
System heterogeneous: data are stored in different systems or databases.
Semantic heterogeneous: two records of the same entity are expressed differently.
Structured heterogeneous: the data include not only structured data but also semi-structured and unstructured data.
Grammar heterogeneous: data have different formats, such as different units.
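Grammar heterogeneity, such as readings reported in different units, is usually the simplest kind to resolve: every value is converted to one common unit before records are compared. A minimal sketch, assuming power readings arrive tagged with a unit label; the labels, field names, and sample values are illustrative and not taken from the platform.

```python
# Conversion factors to kilowatts; the unit labels are illustrative assumptions.
TO_KW = {"W": 0.001, "kW": 1.0, "MW": 1000.0}


def to_kilowatts(value, unit):
    """Convert a power reading to kW, rejecting units that are not in the table."""
    try:
        return value * TO_KW[unit]
    except KeyError:
        raise ValueError(f"unknown power unit: {unit!r}") from None


readings = [
    {"source": "meter_a", "power": 3500.0, "unit": "W"},
    {"source": "meter_b", "power": 3.5, "unit": "kW"},
    {"source": "plant_c", "power": 0.0035, "unit": "MW"},
]

# After normalization all three readings are expressed in the same unit.
normalized = [
    {"source": r["source"], "power_kw": to_kilowatts(r["power"], r["unit"])}
    for r in readings
]
```

Rejecting unknown units, rather than passing values through unchanged, keeps a mislabeled source from silently corrupting the fused data.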

31 03 Multi-source Heterogeneous Data Fusion
1. Semantic heterogeneous data fusion
(1) Duplicate database records. Columns: name, date of birth, terminal name, terminal address, power. Rows: K.X. Huang, Taoyuan; K.X. Huang, Taoyuan.
(2) Fusion methods: 1) Field Matching Method; 2) Sorted-neighborhood Method.

32 03 Multi-source Heterogeneous Data Fusion
1) Field Matching Method (FMM)
Weight: calculate the weight of each attribute of the records.
Similarity: calculate the similarity between records.
Detection: records matching and detection.

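33 03 Multi-source Heterogeneous Data Fusion
a) Weight
$Y = \{y_1, y_2, \dots, y_n\}$, where $y_i = \{y_{i1}, y_{i2}, \dots, y_{im}\}$, $i \in \{1, 2, \dots, n\}$. $Y$ means $n$ records in a database with $m$ attributes.
$T_i = \{T_{i1}, T_{i2}, \dots, T_{im}\}$, $i \in \{1, 2, \dots, N\}$, $T_{ik} \ge 1$.
$T = \{T_1, T_2, \dots, T_m\}$, where $T_k = \frac{1}{N} \sum_{i=1}^{N} T_{ik}$, $k \in \{1, 2, \dots, m\}$.
Convert $T_k$ to an integer, $T_k \to T_k'$, and let $S = \max\{T_1', T_2', \dots, T_m'\}$.
$W_k = \frac{1}{S} \sum_{i=T_k'}^{S} \frac{1}{i}$, $k \in \{1, 2, \dots, m\}$.
Normalize the weights: $W_j \leftarrow W_j \big/ \sum_{k=1}^{m} W_k$, $j \in \{1, 2, \dots, m\}$.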
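One reading of the weighting step is a rank-based scheme: each of N rankings gives every attribute an importance rank (1 is most important), the ranks are averaged and converted to integers, each attribute receives the weight W_k = (1/S) * sum_{i=T_k}^{S} 1/i with S the largest integer rank, and the weights are normalized to sum to 1. The sketch below implements that reading. The rank values are made up, and using Python's `round` for the integer conversion is an assumption.

```python
from fractions import Fraction


def attribute_weights(rankings):
    """rankings: N lists, each holding an integer rank >= 1 for each of m attributes."""
    n_rankings = len(rankings)
    m = len(rankings[0])
    # Mean rank T_k of each attribute over the N rankings.
    mean_ranks = [sum(r[k] for r in rankings) / n_rankings for k in range(m)]
    # Convert to integers (assumed: nearest integer, never below 1).
    int_ranks = [max(1, round(t)) for t in mean_ranks]
    s = max(int_ranks)
    # W_k = (1/S) * sum_{i=T_k}^{S} 1/i, kept exact with Fraction.
    raw = [Fraction(1, s) * sum(Fraction(1, i) for i in range(t, s + 1))
           for t in int_ranks]
    total = sum(raw)
    return [float(w / total) for w in raw]


attributes = ["name", "date of birth", "terminal name", "terminal address", "power"]
rankings = [          # hypothetical ranks from three rankings
    [1, 2, 3, 3, 4],
    [1, 3, 2, 4, 4],
    [1, 2, 3, 3, 5],
]
weights = attribute_weights(rankings)
weight_by_attribute = dict(zip(attributes, weights))
```

Attributes with a smaller (more important) rank collect more terms of the sum and therefore a larger weight.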

34 03 Multi-source Heterogeneous Data Fusion
b) Similarity
$y_{ik} = \{y_{ik1}, y_{ik2}, \dots, y_{ikp}\}$: the $k$-th attribute of the record $y_i$ has $p$ strings.
$y_{jk} = \{y_{jk1}, y_{jk2}, \dots, y_{jkq}\}$: the $k$-th attribute of the record $y_j$ has $q$ strings.
$\mathrm{sim}(y_{ik}, y_{jk}) = \frac{1}{p} \sum_{a=1}^{p} \max_{b = 1, \dots, q} \mathrm{score}(y_{ika}, y_{jkb})$
$\mathrm{sim}(y_i, y_j) = \sum_{k=1}^{m} W_k \, \mathrm{sim}(y_{ik}, y_{jk})$
c) Detection
completeness = correct numbers / detected numbers
precision = correct numbers / true numbers
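The field-matching similarity can be sketched as follows: each attribute value is split into strings, every string of one record is paired with its best-scoring counterpart in the other record, the best scores are averaged, and the per-attribute results are combined with the attribute weights. The slides do not define `score`, so `difflib.SequenceMatcher` is used here as a stand-in; the weights, the sample records, and the 0.9 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def score(a, b):
    """String similarity in [0, 1]; SequenceMatcher is a stand-in for the slide's score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def field_similarity(field_i, field_j):
    """Average, over the strings of field_i, of the best match among the strings of field_j."""
    strings_i, strings_j = field_i.split(), field_j.split()
    if not strings_i or not strings_j:
        return 0.0
    return sum(max(score(a, b) for b in strings_j) for a in strings_i) / len(strings_i)


def record_similarity(rec_i, rec_j, weights):
    """Weighted sum of field similarities; the weights are assumed to sum to 1."""
    return sum(w * field_similarity(x, y) for w, x, y in zip(weights, rec_i, rec_j))


weights = [0.4, 0.2, 0.2, 0.2]  # hypothetical attribute weights
rec_a = ["K.X. Huang", "Taoyuan", "terminal 12", "building 3 unit 2"]
rec_b = ["Huang K.X.", "Taoyuan", "terminal 12", "unit 2 building 3"]
rec_c = ["D.J. Zhang", "Taoyuan", "terminal 40", "building 9 unit 1"]

sim_ab = record_similarity(rec_a, rec_b, weights)
sim_ac = record_similarity(rec_a, rec_c, weights)
is_duplicate = sim_ab >= 0.9    # the threshold is an assumed tuning parameter
```

Because each string is matched against its best counterpart, reordered tokens such as "Huang K.X." and "K.X. Huang" still compare as identical.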

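35 03 Multi-source Heterogeneous Data Fusion
2) Sorted-neighborhood Method (SNM)
Sort: order the records by a sorting key.
Window: slide a fixed-size window over the sorted records, moving from the current window to the next window.
Detection: compare only the records that fall inside the current window.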
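A minimal sketch of the sorted-neighborhood idea: the records are sorted by a key so that likely duplicates end up close together, and each record is then compared only with the few records that follow it inside a sliding window. The sorting key, the matching rule (`difflib.SequenceMatcher` with a 0.9 threshold), and the sample records are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.9):
    """Assumed matching rule: similarity of the joined fields reaches a threshold."""
    return SequenceMatcher(None, " ".join(a), " ".join(b)).ratio() >= threshold


def sorted_neighborhood(records, key, window=3):
    """Sort by key, then compare each record only with the next window-1 records."""
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = []
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            if similar(records[i], records[j]):
                pairs.append(tuple(sorted((i, j))))
    return pairs


def name_key(record):
    # Hypothetical sort key: lower-cased name with the dots removed.
    return record[0].lower().replace(".", "")


records = [
    ("K.X. Huang", "Taoyuan", "terminal 12"),
    ("D.J. Zhang", "Taoyuan", "terminal 40"),
    ("K.X Huang", "Taoyuan", "terminal 12"),
    ("D.J. Zhang", "Taoyuan", "terminal 40"),
]
duplicates = sorted_neighborhood(records, name_key, window=2)
```

The window size trades accuracy for speed: a larger window catches duplicates that sort further apart but costs more comparisons, while the number of comparisons stays linear in the number of records for a fixed window.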

36 03 Multi-source Heterogeneous Data Fusion
(3) Example: power data of the Taoyuan residential community
Columns: name, date of birth, terminal name, terminal address, power. Rows: K.X. Huang, Taoyuan; K.X. Huang, Taoyuan; D.J. Zhang, Taoyuan; D.J. Zhang, Taoyuan.
Results of the two methods (FMM and SNM), reported as: detected number, correct number, completeness, precision, time (s).

37 03 Multi-source Heterogeneous Data Fusion
2. System heterogeneous data fusion
(1) Data are stored in different databases.
(2) Fusion method: Open Database Connectivity (ODBC).
(3) Example: Excel, text, and SQL sources are connected through ODBC to an Oracle database.
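ODBC gives applications a single SQL interface to sources such as Excel sheets, text files, and SQL databases. The sketch below shows the same consolidation pattern using only the Python standard library: a CSV text source and an SQLite table stand in for the heterogeneous sources, and a second in-memory SQLite database stands in for the target. It does not use an ODBC driver, and the table names, columns, and values are hypothetical.

```python
import csv
import io
import sqlite3

# Source 1: a text (CSV) export, held in memory so the example is self-contained.
text_source = io.StringIO(
    "name,terminal,power_kw\nK.X. Huang,T-12,3.5\nD.J. Zhang,T-40,2.1\n"
)

# Source 2: a relational database (SQLite standing in for any SQL source).
sql_source = sqlite3.connect(":memory:")
sql_source.execute("CREATE TABLE meter (name TEXT, terminal TEXT, power_kw REAL)")
sql_source.execute("INSERT INTO meter VALUES ('L. Wang', 'T-77', 4.2)")

# Target: one table with a common schema plus a column recording the origin.
target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE fused (name TEXT, terminal TEXT, power_kw REAL, origin TEXT)"
)

rows = [
    (r["name"], r["terminal"], float(r["power_kw"]), "text")
    for r in csv.DictReader(text_source)
]
rows += [
    (*r, "sql")
    for r in sql_source.execute("SELECT name, terminal, power_kw FROM meter")
]
target.executemany("INSERT INTO fused VALUES (?, ?, ?, ?)", rows)
target.commit()

fused_count = target.execute("SELECT COUNT(*) FROM fused").fetchone()[0]
total_power = target.execute("SELECT SUM(power_kw) FROM fused").fetchone()[0]
```

Keeping an `origin` column makes it possible to trace each fused row back to the system it came from.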

38 Conclusions and Future Work

39 04 Conclusions and Future Work
Conclusions
1. Integrated energy systems have drawn wide attention around the world. Research from various fields greatly promotes their development.
2. The Energy Big-data Platform is the foundation of an Integrated Energy Platform and is a significant research field.
3. Methods for multi-source heterogeneous data fusion are introduced; furthermore, the establishment of an Energy Big-data Platform framework is ongoing.

40 04 Conclusions and Future Work
Future Work: an Integrated Energy Platform based on the Energy Big-data Platform.
The Integrated Energy Platform builds on the Energy Big-data Platform and comprises the Integrated Energy Planning Platform, the Integrated Energy Operation Platform, and the Integrated Energy Trading Platform.

41 Thanks for Your Attention!


More information
