DevSci: Better Software Through Data #KCDC2018

1 DevSci: Better Software Through Data #KCDC2018

3 What is data science? Why is it important? How do I get started?

8 Job Postings for Data Scientists

9 Top-paying Tech Skills (table: Skill, 2016, Change) Source: Dice Salary Survey 2017

13 About Me Data Science Consultant Education B.S. in Computer Science (ISU) B.A. in Philosophy (ISU) Community Keynote speaker Pluralsight author DataCamp author Microsoft MVP AI ASPInsider

16 What is data science?

17 Computer Science Data Science Math and Statistics Domain Knowledge

18 Data Knowledge Decision Action

19 What Is a Data Scientist? Performs data science More than a scientist More than an analyst More than a developer

20 What skills are necessary?

21 Data Science Skills Programming Working with data Descriptive statistics Data visualization

22 Data Science Skills Programming Working with data Descriptive statistics Data visualization Statistical modeling Handling Big Data Machine learning Deploying to production

23 What tools are used?

24 Data Science Tools (bar chart: share of respondents, 0% to 70%, by tool: language, platform, analytics). Tools shown: SQL, Excel, Python, R, MySQL, Python tools, ggplot, SQL Server, Tableau, JavaScript, Matplotlib, Java, PostgreSQL, Oracle, D3, Homegrown, Hive, Spark, Cloudera, Visual Basic, MongoDB, Hadoop, SAS, C++, PowerPivot, Scala, SQLite, C, Pig, RedShift, Weka, Hbase (EMR), Perl, SPSS, Teradata. Source: O'Reilly 2015 Data Science Salary Survey

29 How is data science performed?

30 The Data Science Process (diagram label: Data)

31 The Data Science Process (Data): Find a question

32 The Data Science Process (Data): Find a question, Collect the data

33 The Data Science Process (Data): Find a question, Collect the data, Prepare the data

34 The Data Science Process (Data): Find a question, Collect the data, Prepare the data, Create a model

35 The Data Science Process (Data): Find a question, Collect the data, Prepare the data, Create a model, Evaluate the model

36 The Data Science Process (Data): Find a question, Collect the data, Prepare the data, Create a model, Evaluate the model, Deploy the model

38 The Data Science Process (Data): Find a question, Explore the data, Prepare the data, Create a model, Evaluate the model, Deploy the model. Iterative process

39 The Data Science Process (Data): Find a question, Explore the data, Prepare the data, Create a model, Evaluate the model, Deploy the model. Iterative process, Non-sequential

40 The Data Science Process (Data): Find a question, Explore the data, Prepare the data, Create a model, Evaluate the model, Deploy the model. Iterative process, Non-sequential, Early termination
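
The process steps on these slides can be illustrated with a small sketch. This is not the speaker's material: the data is synthetic, the question ("how does y depend on x?") is made up, and every function name below is hypothetical. It walks through collect, prepare, create a model, and evaluate for a one-variable least-squares line, using only the Python standard library.

```python
import random
import statistics


def collect_data(n=200, seed=0):
    """Collect: simulate noisy (x, y) observations (assumed stand-in for real data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = 3.0 * x + 2.0 + rng.gauss(0, 1)
        rows.append((x, y))
    rows.append((None, 5.0))  # one deliberately incomplete record
    return rows


def prepare_data(rows, train_share=0.8):
    """Prepare: drop incomplete records, then split into train and test sets."""
    clean = [(x, y) for x, y in rows if x is not None and y is not None]
    cut = int(len(clean) * train_share)
    return clean[:cut], clean[cut:]


def fit_line(train):
    """Create a model: ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def evaluate(model, test):
    """Evaluate: root mean squared error on the held-out records."""
    slope, intercept = model
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in test]
    return statistics.fmean(squared) ** 0.5


rows = collect_data()
train, test = prepare_data(rows)
model = fit_line(train)
rmse = evaluate(model, test)
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} rmse={rmse:.2f}")
```

In line with the "iterative" and "early termination" notes on the slides, a poor evaluation score would normally send the work back to an earlier step, or end it, rather than forward to deployment.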

41 Why is data science important?

42 Two Main Approaches Build intelligent software Improve development practices

43 Two Main Approaches Build intelligent software

45 Internet Sales. Query: "Show me sales by gender and marital status." Result: Displaying sum of sales by gender and marital status (bar chart of Male and Female by Marital Status: Married, Single; axis $0k to $15k)

46 Machine Learning Human Cat Dog Car

51 Anticipatory Design Collect Data Create Algorithm Anticipate Choices

55 Two Main Approaches Improve development practices

56 Data-Driven Decision Making Build Learn Measure

57 Hypothesis-Driven Development Hypothesis Analysis Experiment

58 Hypothesis-Driven Development. Hypothesis: Users will prefer feature A over feature B

59 Hypothesis-Driven Development. Hypothesis: Users will prefer feature A over feature B. Experiment: Survey 100 users and ask for their preference

60 Hypothesis-Driven Development. Hypothesis: Users will prefer feature A over feature B. Experiment: Survey 100 users and ask for their preference. Analysis: 80% of users prefer feature A
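
One way to judge whether "80% of 100 users" is strong evidence is a simple binomial calculation. The sketch below is an illustration only and is not from the talk. It assumes the 100 users were sampled at random and answered independently, and it takes "no real preference" (a 50/50 split) as the null hypothesis. The function names are hypothetical.

```python
from math import comb, sqrt


def binomial_tail(successes, n, p=0.5):
    """P(X >= successes) for X ~ Binomial(n, p): a one-sided exact p-value."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(successes, n + 1)
    )


def proportion_interval(successes, n, z=1.96):
    """Normal-approximation interval for a proportion (about 95% for z=1.96)."""
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Survey result from the slide: 80 of 100 users prefer feature A.
p_value = binomial_tail(80, 100)
low, high = proportion_interval(80, 100)
print(f"one-sided p-value: {p_value:.2e}")
print(f"approx. 95% interval for the preference share: {low:.3f} to {high:.3f}")
```

Under these assumptions the p-value is far below common thresholds, so a 50/50 split is an unlikely explanation. The interval is a reminder that the true share is still only known to within several percentage points.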

61 Hypothesis-Driven Development. Hypothesis: Pair programming will increase our long-term velocity

62 Hypothesis-Driven Development. Hypothesis: Pair programming will increase our long-term velocity. Experiment: Pair for 4 sprints and track velocity

63 Hypothesis-Driven Development. Hypothesis: Pair programming will increase our long-term velocity. Experiment: Pair for 4 sprints and track velocity. Analysis: Velocity increased by 20% per sprint

64 Hypothesis Stories <Hypothesis> We assume that <hypothesis> Will result in <outcome> We will have succeeded when <measurable result>

65 Hypothesis Stories Pair Programming Hypothesis We assume that pair programming Will result in higher long-term velocity We will have succeeded when we have seen a 10% or greater increase in velocity after 4 sprints.
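
A hypothesis story becomes easier to act on when its success criterion is written as a check that can be run. The following is a minimal sketch, assuming velocity is tracked as story points per sprint. The `HypothesisStory` class and the velocity numbers are hypothetical and made up for illustration; only the 10% threshold and the 4-sprint window come from the slide.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class HypothesisStory:
    assumption: str           # "We assume that ..."
    outcome: str              # "Will result in ..."
    min_relative_gain: float  # "We will have succeeded when ..." as a fraction

    def evaluate(self, baseline, experiment):
        """Return (relative_change, succeeded) comparing mean velocities."""
        before = fmean(baseline)
        after = fmean(experiment)
        change = (after - before) / before
        return change, change >= self.min_relative_gain


story = HypothesisStory(
    assumption="pair programming",
    outcome="higher long-term velocity",
    min_relative_gain=0.10,  # 10% or greater increase, per the slide
)

# Made-up story points per sprint: 4 sprints before and 4 sprints while pairing.
baseline_velocity = [30, 32, 31, 29]
pairing_velocity = [34, 36, 35, 37]

change, succeeded = story.evaluate(baseline_velocity, pairing_velocity)
print(f"Velocity change: {change:.1%}, succeeded: {succeeded}")
```

Four sprints is a small sample, so a result close to the threshold would be worth treating with caution before drawing a conclusion.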

66 A/B Testing

68 Feature Toggles New Feature Feature Toggles User Groups
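
A common way to connect feature toggles to user groups is to assign each user to a stable bucket and enable the feature for a share of the buckets. The sketch below shows that idea under stated assumptions. It is not the speaker's implementation and not the API of any toggle library. The names `bucket`, `is_enabled`, and the feature key `new-checkout` are hypothetical.

```python
import hashlib


def bucket(user_id: str, feature: str, buckets: int = 100) -> int:
    """Map a user to a stable bucket in [0, buckets) for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def is_enabled(user_id: str, feature: str, rollout_percent: int) -> bool:
    """Enable the feature for roughly rollout_percent of users."""
    return bucket(user_id, feature) < rollout_percent


# Split 1000 hypothetical users into two groups for a 50/50 experiment.
users = [f"user-{i}" for i in range(1000)]
group_new = [u for u in users if is_enabled(u, "new-checkout", 50)]
group_old = [u for u in users if not is_enabled(u, "new-checkout", 50)]
print(f"new feature: {len(group_new)} users, existing behavior: {len(group_old)} users")
```

Hashing the feature name together with the user ID keeps a user's assignment consistent across sessions, and it gives different features independent splits. Both properties matter when the two groups are later compared in an A/B test.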

70 DevOps Pipeline Code Source Control Build Q/A Deploy Prod

72 Code Quality Metrics Source: NDepend

73 Source Control Metrics

74 Build Metrics Source: Visual Studio Team Services

75 Q/A Metrics

76 Deployment Metrics Source: Octopus Deploy

77 Software Telemetry

79 How do I get started?

80 What are the ingredients of a data-driven enterprise?

81 Strategy Culture People Technology Data

82 Strategy

83 People

84 Data

85 Technology

86 Culture

87 What is the process of becoming a data-driven enterprise?

88 AI Predict Analyze Organize Measure

89 1. Measure Transactions Instrumentation Logging Surveys Digitization External data Measure

90 2. Organize Transform Clean Store Data ETL Data Warehouse Data Lake Organize Measure

91 3. Analyze Reports Dashboards KPI monitors Decision support Descriptive analytics Diagnostic analytics Analyze Organize Measure

92 4. Predict Predict Predictive analytics Prescriptive analytics Machine learning Hypothesis testing Experimentation Analyze Organize Measure

93 5. Automate AI Predict Artificial intelligence Expert systems Deep learning Analyze Organize Measure

95 Advice for Success Get buy-in from leadership Focus on low-hanging fruit Don't silo data science teams Democratize your data

96 Advice for Success Embrace smart failure Focus on feedback Embed data collection Avoid the Observer Effect

97 Where to Go Next?

98 Where to Go Next Data Camp: Pluralsight: Coursera:

99 Pluralsight Courses Data Science: The Big Picture Data Science with R Exploratory Data Analysis with R Data Visualization with R (3-part) Deep Learning: The Big Picture

101 Feedback Very important to me! What did you like? What could I improve?

102 Conclusion

103 What data science is Why it is important How to get started

105 Are you prepared? Is your organization? Is our world prepared?

107 Thank You! Matthew Renze Data Science Consultant Renze Consulting Website:

Data Science: The Big #SQLServerUserGroupDubai

Data Science: The Big #SQLServerUserGroupDubai Data Science: The Big Picture @MatthewRenze #SQLServerUserGroupDubai Job Postings for Data Scientists Top-paying Tech Skills Skill 2016 Change Skill 2016 Change Source: Dice Salary Survey 2017 What

More information

SQLStarter Intro to Data Science. Dave

SQLStarter Intro to Data Science. Dave SQLStarter Dave Leininger @DaveLeininger SQLStarter Dave Leininger WHO IS FUSION ALLIANCE? SQLStarter: What is Data Science? Why would I want to be a Data Scientist? What are the tools and technologies?

More information

Big data is hard. Top 3 Challenges To Adopting Big Data

Big data is hard. Top 3 Challenges To Adopting Big Data Big data is hard Top 3 Challenges To Adopting Big Data Traditionally, analytics have been over pre-defined structures Data characteristics: Sales Questions answered with BI and visualizations: Customer

More information

The Hunt for the Data Scientist GIEWEE HAMMOND MSCAN, MSCAS LEAD DATA SCIENTIST, ARAMCO SERVICES COMPANY

The Hunt for the Data Scientist GIEWEE HAMMOND MSCAN, MSCAS LEAD DATA SCIENTIST, ARAMCO SERVICES COMPANY The Hunt for the Data Scientist GIEWEE HAMMOND MSCAN, MSCAS LEAD DATA SCIENTIST, ARAMCO SERVICES COMPANY Overview Highlight that data science has domain specific specialties To provide clarity on what

More information

The Importance of good data management and Power BI

The Importance of good data management and Power BI The Importance of good data management and Power BI The BI Iceberg Visualising Data is only the tip of the iceberg Data Preparation and provisioning is a complex process Streamlining this process is key

More information

Digital Transformation 2.0

Digital Transformation 2.0 Digital Transformation 2.0 Job roles and skills that every IT Services company must know We have been hearing for quite some time, that the world is going through digital transformation & HR department

More information

Sunnie Chung. Cleveland State University

Sunnie Chung. Cleveland State University Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:

More information

Azure ML Data Camp. Ivan Kosyakov MTC Architect, Ph.D. Microsoft Technology Centers Microsoft Technology Centers. Experience the Microsoft Cloud

Azure ML Data Camp. Ivan Kosyakov MTC Architect, Ph.D. Microsoft Technology Centers Microsoft Technology Centers. Experience the Microsoft Cloud Microsoft Technology Centers Microsoft Technology Centers Experience the Microsoft Cloud Experience the Microsoft Cloud ML Data Camp Ivan Kosyakov MTC Architect, Ph.D. Top Manager IT Analyst Big Data Strategic

More information

Databricks Cloud. A Primer

Databricks Cloud. A Primer Databricks Cloud A Primer Who is Databricks? Databricks was founded by the team behind Apache Spark, the most active open source project in the big data ecosystem today. Our mission at Databricks is to

More information

INTRODUCTION TO R FOR DATA SCIENCE WITH R FOR DATA SCIENCE DATA SCIENCE ESSENTIALS INTRODUCTION TO PYTHON FOR DATA SCIENCE. Azure Machine Learning

INTRODUCTION TO R FOR DATA SCIENCE WITH R FOR DATA SCIENCE DATA SCIENCE ESSENTIALS INTRODUCTION TO PYTHON FOR DATA SCIENCE. Azure Machine Learning Data Science Track WITH EXCEL INTRODUCTION TO R FOR DATA SCIENCE PROGRAMMING WITH R FOR DATA SCIENCE APPLIED MACHINE LEARNING SCENARIOS HDInsight Certificate of DATA SCIENCE ORIENTATION QUERYING DATA WITH

More information

KnowledgeENTERPRISE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK. Advanced Analytics on Spark BROCHURE

KnowledgeENTERPRISE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK. Advanced Analytics on Spark BROCHURE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK Are you drowning in Big Data? Do you lack access to your data? Are you having a hard time managing Big Data processing requirements?

More information

Big Data Introduction

Big Data Introduction Big Data Introduction Who we are Experts At Your Service Over 50 specialists in IT infrastructure Certified, experienced, passionate Based In Switzerland 100% self-financed Swiss company Over CHF8 mio.

More information

Applications and Big Data Converge

Applications and Big Data Converge Applications and Big Data Converge ME: Nevin Pick Big Data Practice Lead MOBIA: Founded 1986 220+ employees Dartmouth (HQ), Montreal, Toronto Windsor, Calgary, Cincinnati Big Data Defined What Big Data

More information

Building Your Big Data Team

Building Your Big Data Team Building Your Big Data Team With all the buzz around Big Data, many companies have decided they need some sort of Big Data initiative in place to stay current with modern data management requirements.

More information

Data IBM. Education for our Data Scientists. Emily Plachy, Distinguished Engineer, IBM Global Chief Data Office May 1, 2017

Data IBM. Education for our Data Scientists. Emily Plachy, Distinguished Engineer, IBM Global Chief Data Office May 1, 2017 Data Science @ IBM Education for our Data Scientists Emily Plachy, Distinguished Engineer, IBM Global May 1, 2017 Global What is a Data Scientist? Data Scientists are Pioneers Work with business leaders

More information

Big Data Job Descriptions. Software Engineer - Algorithms

Big Data Job Descriptions. Software Engineer - Algorithms Big Data Job Descriptions Software Engineer - Algorithms This position is responsible for meeting the big data needs of our various products and businesses. Specifically, this position is responsible for

More information

EXAMPLE SOLUTIONS Hadoop in Azure HBase as a columnar NoSQL transactional database running on Azure Blobs Storm as a streaming service for near real time processing Hadoop 2.4 support for 100x query gains

More information

Architecting an Open Data Lake for the Enterprise

Architecting an Open Data Lake for the Enterprise Architecting an Open Data Lake for the Enterprise 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Today s Presenters Daniel Geske, Solutions Architect, Amazon Web Services Armin

More information

Simplifying the Process of Uploading and Extracting Data from Apache Hadoop

Simplifying the Process of Uploading and Extracting Data from Apache Hadoop Simplifying the Process of Uploading and Extracting Data from Apache Hadoop Rohit Bakhshi, Solution Architect, Hortonworks Jim Walker, Director Product Marketing, Talend Page 1 About Us Rohit Bakhshi Solution

More information

Deloitte School of Analytics. Demystifying Data Science: Leveraging this phenomenon to drive your organisation forward

Deloitte School of Analytics. Demystifying Data Science: Leveraging this phenomenon to drive your organisation forward Deloitte School of Analytics Demystifying Data Science: Leveraging this phenomenon to drive your organisation forward February 2018 Agenda 7 February 2018 8 February 2018 9 February 2018 8:00 9:00 Networking

More information

Application Integrator Automate Any Application

Application Integrator Automate Any Application Application Integrator Automate Any Application BMC Control-M by applications BMC Control-M by platforms ERP Business Intelligence Data Integration / ETL OS Platform SAP Oracle ebusiness Suite PeopleSoft

More information

Microsoft Azure Essentials

Microsoft Azure Essentials Microsoft Azure Essentials Azure Essentials Track Summary Data Analytics Explore the Data Analytics services in Azure to help you analyze both structured and unstructured data. Azure can help with large,

More information

Agile Software Requirements. Matthew Renze Iowa State University COMS 409 Software Requirements

Agile Software Requirements. Matthew Renze Iowa State University COMS 409 Software Requirements Agile Software Requirements Matthew Renze Iowa State University COMS 409 Software Requirements Purpose Introduce you to Agile software development Discuss Agile software requirements Overview What is Agile?

More information

Modern Analytics Architecture

Modern Analytics Architecture Modern Analytics Architecture So what is a. Modern analytics architecture? Machine Learning AI Open source Big Data DevOps Cloud In-memory IoT Trends supporting Next-Generation analytics Source: Next-Generation

More information

Transforming Analytics with Cloudera Data Science WorkBench

Transforming Analytics with Cloudera Data Science WorkBench Transforming Analytics with Cloudera Data Science WorkBench Process data, develop and serve predictive models. 1 Age of Machine Learning Data volume NO Machine Learning Machine Learning 1950s 1960s 1970s

More information

Jason Virtue Business Intelligence Technical Professional

Jason Virtue Business Intelligence Technical Professional Jason Virtue Business Intelligence Technical Professional jvirtue@microsoft.com Agenda Microsoft Azure Data Services Azure Cloud Services Azure Machine Learning Azure Service Bus Azure Stream Analytics

More information

How to Build Your Data Ecosystem with Tableau on AWS

How to Build Your Data Ecosystem with Tableau on AWS How to Build Your Data Ecosystem with Tableau on AWS Moving Your BI to the Cloud Your BI is working, and it s probably working well. But, continuing to empower your colleagues with data is going to be

More information

Business is being transformed by three trends

Business is being transformed by three trends Business is being transformed by three trends Big Cloud Intelligence Stay ahead of the curve with Cortana Intelligence Suite Business apps People Custom apps Apps Sensors and devices Cortana Intelligence

More information

Azure Data Analytics & Machine Learning Seminar. Daire Cunningham: BI Practice Area Manager

Azure Data Analytics & Machine Learning Seminar. Daire Cunningham: BI Practice Area Manager Azure Data Analytics & Machine Learning Seminar Daire Cunningham: BI Practice Area Manager AGENDA 09:00 AM 09:30 AM Registration & Refreshments 09.30AM 10:00 AM 10:00 AM 10:30 AM Welcome & Keynote, Ger

More information

Designing Business Intelligence Solutions with Microsoft SQL Server 2014

Designing Business Intelligence Solutions with Microsoft SQL Server 2014 Designing Business Intelligence Solutions with Microsoft SQL Server 2014 20467D; 5 Days, Instructor-led Course Description This five-day instructor-led course teaches students how to implement self-service

More information

How Data Science is Changing the Way Companies Do Business Colin White

How Data Science is Changing the Way Companies Do Business Colin White How Data Science is Changing the Way Companies Do Business Colin White BI Research July 17, 2014 Sponsor 2 Speakers Colin White President, BI Research Bill Franks Chief Analytics Officer, Teradata 3 How

More information

Designing Business Intelligence Solutions with Microsoft SQL Server 2014 Course Code: 20467D

Designing Business Intelligence Solutions with Microsoft SQL Server 2014 Course Code: 20467D Designing Business Intelligence Solutions with Microsoft SQL Server 2014 Course Code: 20467D Duration: 5 Days Overview About this course This five-day instructor-led course teaches students how to implement

More information

Bringing the Power of SAS to Hadoop Title

Bringing the Power of SAS to Hadoop Title WHITE PAPER Bringing the Power of SAS to Hadoop Title Combine SAS World-Class Analytics With Hadoop s Low-Cost, Distributed Data Storage to Uncover Hidden Opportunities ii Contents Introduction... 1 What

More information

30 Minutes Overview of Data Science for Business

30 Minutes Overview of Data Science for Business Break Down of 90 Minutes 30 Minutes Overview of Data Science for Business 30 Minutes Team Discussion Use Cases 25 Minutes Sharing 1 IBM Analytics Academy Data Science is Reinventing Business Carlo Appugliese

More information

Course Content. The main purpose of the course is to give students the ability plan and implement big data workflows on HDInsight.

Course Content. The main purpose of the course is to give students the ability plan and implement big data workflows on HDInsight. Course Content Course Description: The main purpose of the course is to give students the ability plan and implement big data workflows on HDInsight. At Course Completion: After competing this course,

More information

Advanced Analytics in Azure

Advanced Analytics in Azure Explore What s Possible. Advanced Analytics in Azure Amie Mason, Practice Lead Data Science & Analytics amiem@attunix.com The Attunix Difference business technology Attunix delivers results at the intersection

More information

20775A: Performing Data Engineering on Microsoft HD Insight

20775A: Performing Data Engineering on Microsoft HD Insight 20775A: Performing Data Engineering on Microsoft HD Insight Duration: 5 days; Instructor-led Implement Spark Streaming Using the DStream API. Develop Big Data Real-Time Processing Solutions with Apache

More information

BIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW

BIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW BIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW TOPICS COVERED 1 2 Fundamentals of Big Data Platforms Major Big Data Tools Scaling Up vs. Out SCALE UP (SMP) SCALE OUT (MPP) + (n) Upgrade

More information

Evolution or Revolution: Top Ten Development Trends

Evolution or Revolution: Top Ten Development Trends Evolution or Revolution: Top Ten Development Trends Jim Lundy CEO and Lead Analyst IT Development Trends: Building a Fighter Jet Agenda What are the Top Ten Trends in Development? What are the Best Practices

More information

20775 Performing Data Engineering on Microsoft HD Insight

20775 Performing Data Engineering on Microsoft HD Insight Duración del curso: 5 Días Acerca de este curso The main purpose of the course is to give students the ability plan and implement big data workflows on HD. Perfil de público The primary audience for this

More information

Roles and Processes in Analytics Development

Roles and Processes in Analytics Development Roles and Processes in Analytics Development The rapid evolution of data analytics has been accelerated by advances in: large scale Internet connectivity data warehousing data analysis and mining algorithms

More information

Garanti Bank s Journey to Big Data Ayşen Büyükakın Business Intelligence & Analytics Unit Manager

Garanti Bank s Journey to Big Data Ayşen Büyükakın Business Intelligence & Analytics Unit Manager Garanti Bank s Journey to Big Data Ayşen Büyükakın Business Intelligence & Analytics Unit Manager ...through a planned change journey in BI & Analytics Strategic change projects. 1998 1999 2000 2001 2002

More information

Exploratory Data Analysis with #PrDC16

Exploratory Data Analysis with #PrDC16 Exploratory Data Analysis with R @MatthewRenze #PrDC16 Motivation The ability to take data to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it that

More information

20775: Performing Data Engineering on Microsoft HD Insight

20775: Performing Data Engineering on Microsoft HD Insight Let s Reach For Excellence! TAN DUC INFORMATION TECHNOLOGY SCHOOL JSC Address: 103 Pasteur, Dist.1, HCMC Tel: 08 38245819; 38239761 Email: traincert@tdt-tanduc.com Website: www.tdt-tanduc.com; www.tanducits.com

More information

Statistics & Optimization with Big Data

Statistics & Optimization with Big Data Statistics & Optimization with Big Data Technology and data driven decision science company focused on helping academics to solve big data and analytics problems of any kind, from any source, at massive

More information

Introduction to Agile and Scrum

Introduction to Agile and Scrum Introduction to Agile and Scrum Matthew Renze @matthewrenze COMS 309 - Software Development Practices Purpose Intro to Agile and Scrum Prepare you for the industry Questions and answers Overview Intro

More information

Data Visualization with #KCDC

Data Visualization with #KCDC Data Visualization with R @MatthewRenze #KCDC Overview Introduction to R Intro to Data Visualization Types of Data Visualizations Visualizing One Variable Visualizing Two Variables Visualizing

More information

BIG DATA and DATA SCIENCE

BIG DATA and DATA SCIENCE Integrated Program In BIG DATA and DATA SCIENCE CONTINUING STUDIES Table of Contents About the Course...03 Key Features of Integrated Program in Big Data and Data Science...04 Learning Path...05 Key Learning

More information

Joining the disruption in the Asset Management Industry How to evaluate new technologies and implement new ideas like a start up company

Joining the disruption in the Asset Management Industry How to evaluate new technologies and implement new ideas like a start up company Joining the disruption in the Asset Management Industry How to evaluate new technologies and implement new ideas like a start up company Kyle Kung, PhD GX Innovation Lab State Street Global Exchange September

More information

Cloudera Data Science and Machine Learning. Robin Harrison, Account Executive David Kemp, Systems Engineer. Cloudera, Inc. All rights reserved.

Cloudera Data Science and Machine Learning. Robin Harrison, Account Executive David Kemp, Systems Engineer. Cloudera, Inc. All rights reserved. Cloudera Data Science and Machine Learning Robin Harrison, Account Executive David Kemp, Systems Engineer 1 This is the age of machine learning. Data volume NO Machine Learning Machine Learning 1950s 1960s

More information

20775A: Performing Data Engineering on Microsoft HD Insight

20775A: Performing Data Engineering on Microsoft HD Insight 20775A: Performing Data Engineering on Microsoft HD Insight Course Details Course Code: Duration: Notes: 20775A 5 days This course syllabus should be used to determine whether the course is appropriate

More information

Career Center. Resources for Exploring Careers. in Data Science. Explore the Variety of Career Paths with These Example Fields & Roles

Career Center. Resources for Exploring Careers. in Data Science. Explore the Variety of Career Paths with These Example Fields & Roles Career Center Resources for Exploring Careers in Data Science Data science is a growing field across sectors, and data science jobs are often ranked very highly. Graduate students technical skills and

More information

Mass-Scale, Automated Machine Learning and Model Deployment Using SAS Factory Miner and SAS Decision Manager

Mass-Scale, Automated Machine Learning and Model Deployment Using SAS Factory Miner and SAS Decision Manager Mass-Scale, Automated Machine Learning and Model Deployment Using SAS Factory Miner and SAS Decision Manager Jonathan Wexler Principal Product Manager Data Mining and Machine Learning SAS Steve Sparano

More information

ETL challenges on IOT projects. Pedro Martins Head of Implementation

ETL challenges on IOT projects. Pedro Martins Head of Implementation ETL challenges on IOT projects Pedro Martins Head of Implementation Outline What is Pentaho Pentaho Data Integration (PDI) Smartcity Copenhagen Example of Data structure without an OLAP schema Telematics

More information

Venkata Reddy Konasani

Venkata Reddy Konasani 1 Mar-2017 Co-founder of statinfer.com Data Scientist / Corporate Trainer/ Author 21.venkat@gmail.com / venkat@statinfer.com https://www.linkedin.com/in/venkata-reddy-konasani-b2659514 Specializations

More information

Guide to Modernize Your Enterprise Data Warehouse How to Migrate to a Hadoop-based Big Data Lake

Guide to Modernize Your Enterprise Data Warehouse How to Migrate to a Hadoop-based Big Data Lake White Paper Guide to Modernize Your Enterprise Data Warehouse How to Migrate to a Hadoop-based Big Data Lake Motivation for Modernization It is now a well-documented realization among Fortune 500 companies

More information

8 Steps CIOs Must Take To Transform With Artificial Intelligence

8 Steps CIOs Must Take To Transform With Artificial Intelligence Forrester Consulting Thought Leadership Checklist Commissioned By Dell EMC May 2018 8 Steps CIOs Must Take To Transform With Artificial Intelligence Companies across the globe are embracing artificial

More information

EMBED ANALYTICS EVERYWHERE Tomáš Jurczyk

EMBED ANALYTICS EVERYWHERE Tomáš Jurczyk EMBED ANALYTICS EVERYWHERE Tomáš Jurczyk Email: tomas.jurczyk@quest.com AGENDA Short introduction of Statistica Enabling Collective Intelligence INTEGRATION WITH ANALYTICS MARKETPLACES Empowering Citizen

More information

Confidential

Confidential June 2017 1. Is your EDW becoming too expensive to maintain because of hardware upgrades and increasing data volumes? 2. Is your EDW becoming a monolith, which is too slow to adapt to business s analytical

More information

Data Analytics for Semiconductor Manufacturing The MathWorks, Inc. 1



