BIG DATA and DATA SCIENCE


Integrated Program in BIG DATA and DATA SCIENCE
CONTINUING STUDIES

Table of Contents
About the Course
Key Features of Integrated Program in Big Data and Data Science
Learning Path
Key Learning Objectives
Step 1: Data Science with R, Data Science with Python
Step 2: Big Data Hadoop and Spark Developer
Step 3: Tableau Desktop
Step 4: Machine Learning
Electives

http://learnmore.duke.edu

About the Course
The Big Data and Data Science program is a five-course, integrated, all-inclusive certificate program for Big Data and Data Science professionals. The curriculum is comprehensive and spans the major technologies in big data, data science, and reporting/visualization. The recommended learning path for this certificate program has been designed by renowned industry experts and big data influencers to maximize your learning potential. Each course in the program builds upon the previous one, so concepts introduced early in the learning path contribute to your proficiency with concepts in the later courses. Resources such as live virtual teaching sessions, access to an instructor, and non-graded electives of your choosing reinforce this program's learning experience.

Key Features
Industry-recommended learning path
Access to 300+ hours of content created by industry experts
Hands-on project execution on CloudLabs
Aligned with the Cloudera CCA175 certification and the Tableau Desktop 10 Qualified Associate certification
Duke University Certificate upon successful completion of the course
30+ real-life industry projects in retail, insurance, healthcare, banking, telecommunication, airline, and social media

Learning Path: Big Data and Data Science
Step 1: Data Science with R, Data Science with Python
Step 2: Big Data Hadoop and Spark Developer
Step 3: Tableau Desktop
Step 4: Machine Learning

Key Learning Objectives
This learning path is designed for professionals interested in the field of analytics who wish to develop skills in both big data and data science.
Data Science with R: Learn the R programming language and the important statistical and predictive analytics concepts.
Data Science with Python: Introduces the various packages in Python, such as NumPy, SciPy, Pandas, and Scikit-learn, for performing data analysis.
Big Data Hadoop and Spark Developer: Learn the various components of the Hadoop and Spark ecosystem. The course is aligned with the Cloudera CCA175 certification.
Tableau Desktop and Visualization Training: Learn the various aspects of Tableau. Aligned with the Tableau Desktop Qualified Associate certification.
Machine Learning: Gain an understanding of Machine Learning applications and algorithms. The course also covers deep learning and Spark machine learning.

Step 1: Data Science with R
This course has been designed to impart in-depth knowledge of the various data analytics techniques that can be performed using R. It includes real-life projects, case studies, and R CloudLabs for practice.

Key Learning Objectives
Gain a foundational understanding of business analytics.
Learn R programming and how various statements are executed.
Gain an in-depth understanding of the data structures used in R and learn to import and export data in R.
Define and use the various apply functions and dplyr functions.
Recognize and use the various graphics in R for data visualization.
Gain a basic understanding of the various statistical concepts.
Understand the hypothesis testing method to drive business decisions.
Become familiar with regression models and classification techniques.
Learn and use the various association rules and the Apriori algorithm.
Gain an understanding of clustering methods including K-means, DBSCAN, and hierarchical clustering.

Step 1: Data Science with Python
Learn data analytics, machine learning, and web scraping using Python programming. Gain an in-depth understanding of the various packages in Python, such as NumPy, SciPy, Pandas, and Scikit-learn, for performing data analysis, implementing machine learning models, and NLP.

Key Learning Objectives
Gain an in-depth understanding of data wrangling, data exploration, data visualization, hypothesis building, and testing.
Understand the essential concepts of Python programming such as data types, tuples, lists, dicts, basic operators, and functions.
Perform high-level mathematical computing using the NumPy package and its large library of mathematical functions.
Conduct scientific and technical computing using the SciPy package and its sub-packages such as Integrate, Optimize, Statistics, IO, and Weave.
Perform data analysis and manipulation using the data structures and tools provided in the Pandas package.
Gain knowledge in machine learning using the Scikit-learn package.
Use Python's matplotlib library for data visualization.
Extract useful data from websites by performing web scraping.
Integrate Python with Hadoop, Spark, and MapReduce.

Step 2: Big Data Hadoop & Spark Developer
This course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course contains real-life projects and case studies to be executed in CloudLabs and aligns with the Cloudera CCA175 certification.

Key Learning Objectives
Understand the architecture of HDFS and YARN, and learn how to work with them for storage and resource management.
Recognize MapReduce and its characteristics.
Receive an overview of Sqoop and Flume and how to use them to ingest data.
Create databases and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning.
Learn Flume architecture, sources, sinks, and configurations.
Understand HBase, its architecture, and its data storage.
Gain a working knowledge of Pig and its components.
Perform functional programming in Spark, understand RDDs, and build Spark applications.
Learn Spark SQL, and learn about creating, transforming, and querying data frames.
Course aligned with the Cloudera Big Data CCA175 certification.

Step 3: Tableau Desktop 10
The focus of the course is to help you learn Tableau Desktop 10 skills such as visualization building, analytics, and dashboards. This course is also aligned with the Tableau Desktop 10 Qualified Associate exam.

Key Learning Objectives
Grasp the concepts of Tableau Desktop 10 and learn Tableau statistics and how to build interactive dashboards.
Learn data connections as well as organizing and simplifying data.
Understand formatting, annotations, and spatial analysis.
Become familiar with special field types and Tableau-generated fields.
Review the concepts of using charts including Pareto, waterfall, Gantt, box plots, and sparklines, and perform market basket analysis.
Learn fundamental calculations along with automatic and custom split, ad-hoc analytics, and LOD calculations.
Understand the process of creating and using parameters and gain command over mapping concepts such as custom geocoding and radial selections.

Step 4: Machine Learning
This course provides advanced-level training on Machine Learning applications and algorithms. It will give you hands-on experience in multiple, highly sought-after machine learning skills in both supervised and unsupervised learning. This machine learning training helps you learn to apply machine learning algorithms such as regression, clustering, classification, and recommendation. The case study approach ensures you are working hands-on with data while you learn. You'll also receive training in deep learning and Spark machine learning, skills which are in high demand today.

Key Learning Objectives
Classify the types of learning including supervised and unsupervised.
Identify the various applications of machine learning algorithms.
Perform supervised learning techniques: linear and logistic regression.
Understand classification data and models.
Use unsupervised learning algorithms including deep learning, clustering, and recommendation systems.
Gain experience using machine learning with Spark.

Elective Courses

Data Science with SAS
The Data Science with SAS training is designed to impart in-depth knowledge of the SAS programming language, SAS tools, and various advanced analytics techniques.

Apache Spark and Scala
With this Apache Spark and Scala course, you will learn essential skills such as Spark Streaming, Spark SQL, Machine Learning Programming, GraphX Programming, and Shell Scripting Spark.

MongoDB Developer and Administrator
MongoDB training helps you learn data modeling, ingestion, querying, sharding, and data replication with MongoDB, along with installing, updating, and maintaining a MongoDB environment.

Cassandra
The Apache Cassandra training provides you with in-depth knowledge of Cassandra architecture, features, configuration, and the Hadoop ecosystem around this NoSQL database.

Business Analytics with Excel
Business Analytics with Excel training has been designed to introduce you to the world of analytics using Microsoft Excel, the most commonly used analytics tool. The training will equip you with the concepts and hard skills required to work in this industry.

Apache Storm
Apache Storm training provides you with experience in Apache Storm, a Big Data stream processing technology.

Impala: An Open Source SQL Engine for Hadoop Training Course
This course is ideal for individuals who want to understand the basic concepts of the Massively Parallel Processing (MPP) SQL query engine that runs on Apache Hadoop. Upon completing this course, learners will be able to interpret the role of Impala in the Big Data ecosystem.

Apache Kafka
The Apache Kafka course guides participants through the Kafka architecture, installation, interfaces, and configuration. Participants are also trained in the fundamental concepts of Big Data in this course.

Tableau Server 10 Qualified Associate
The Tableau Server 10 Qualified Associate course is designed to impart the in-depth understanding and skills needed to implement, administer, and manage Tableau Server 10. This course is designed for Tableau Server users and administrators.

Big Data Hadoop Administrator
The Big Data and Hadoop Administrator course is aligned with Cloudera's CCAH CCA-500 certification and covers the core Apache Hadoop distribution and the vendor-specific distribution CDH (Cloudera Distribution of Hadoop).

Instructors

Ronald Van Loon
Top 10 Big Data & Data Science Influencer, Director - Adversitement
Named by Onalytica as one of the three most influential people in Big Data, Ronald writes for a number of leading Big Data and Data Science websites, including Datafloq, Data Science Central, and The Guardian. He is a regular speaker at renowned events.

Sina Jamshidi
Big Data Lead at Bell Labs
Sina has over 10 years of experience in the technology field as a Big Data Architect at Bell Labs and as a Platinum-level trainer. Sina is very passionate about building a Big Data education ecosystem and has been a contributor to a number of public and journal publications.

Simon Tavasoli
Analytics Lead at Cancer Care, Ontario
Simon is a Data Scientist with 12 years of experience in Healthcare Analytics. He has a Master's in Biostatistics from the University of Western Ontario. Simon is passionate about teaching data science and has published several journal articles in preventive medicine analytics.

Alvaro Fuentes
Founder and Data Scientist at Quant Company
Alvaro is a Data Scientist who founded Quant Company and has also worked as a lead economic analyst at the Central Bank of Guatemala. He holds an M.S. in Quantitative Economics and Applied Mathematics and is actively involved in consulting and training in the data science space.

Paul Sharkov
Data Scientist at BMO Financial Group, Member of SAS Canada Community
Paul is the lead SAS Data Scientist at Bank of Montreal. As a SAS Certified Predictive Modeler, SAS Statistical Business Analyst, and SAS Certified Advanced Programmer, Paul is passionate about sharing his knowledge of how data science can support data-driven business decisions.

Live virtual classrooms are facilitated by qualified industry subject matter experts in alignment with the curriculum designed by the instructors listed above.

CONTINUING STUDIES
Duke Continuing Studies
Box 90700
Duke University East Campus
Durham, NC 27708-0700
(919) 684-6259
learnmore@duke.edu
http://learnmore.duke.edu