Data Mining using SPSS Modeler 2nd Session

Transcription:

IBM Taiwan Claire Lin Data Mining using SPSS Modeler 2nd Session 2014 IBM Corporation

Agenda
- Data Mining Process
- Business Understanding
- Data Understanding
- Live Demo and Exercise
- Data Preparation and Manipulation
- Live Demo and Exercise

What is Data Mining?
Data mining is the analysis step of the Knowledge Discovery in Databases (KDD) process. It encompasses a number of techniques for extracting useful information from (large) data files, without necessarily having preconceived notions about what will be discovered. The goal of data mining is to extract information from a data set and transform it into an understandable structure for further use.

Data mining process
Cross Industry Standard Process for Data Mining (CRISP-DM)

Data mining process
Cross Industry Standard Process for Data Mining (CRISP-DM)
What can SPSS Modeler do?
- Input raw data
- Data understanding: check missing data; check anomalous and outlier data
- Data preparation: Filter, Derive, and Reclassify nodes
- Modeling
- Output

Business Understanding
- Determining business objectives
  - Finding what people will buy together with zongzi (sticky rice dumplings) during the Dragon Boat Festival
  - Predicting who is likely not to renew a contract for mobile phone service
- Assessing the situation
- Determining data mining goals
- Producing a project plan

Data Understanding
Need to understand:
- What your data resources are
- What the characteristics of those resources are
Includes:
- Collecting initial data
- Describing data
- Exploring data
- Verifying data quality: missing data, anomalous data

Data Understanding - Missing Data
- Blank: contains no information. This is white space if the field is a string, and a null (non-numeric) value if the field is numeric.
- Empty string: a string field may be empty, which means that it contains nothing (this is common in databases).
- Value blanks: values that represent missing or invalid information.
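The kinds of missing data listed on this slide can be told apart with a few simple checks. The following is a minimal sketch in plain Python, not SPSS Modeler code; the function name `missing_kind`, the sample fields, and the use of 99 as a user-defined blank are all assumptions made for illustration.

```python
def missing_kind(value, user_blanks=frozenset()):
    """Return the kind of missing value, or None if the value is present."""
    if value is None:
        return "null"                # no information at all
    if isinstance(value, str):
        if value == "":
            return "empty string"    # string field that contains nothing
        if value.strip() == "":
            return "white space"     # string field holding only spaces
    if value in user_blanks:
        return "value blank"         # a code that stands for missing/invalid
    return None

ages = [34, None, 99, 27]                  # 99 is a hypothetical "unknown" code
cities = ["Taipei", "", "   ", "Tainan"]   # hypothetical string field

age_report = [missing_kind(v, user_blanks={99}) for v in ages]
city_report = [missing_kind(v) for v in cities]
print(age_report)
print(city_report)
```

Treating the user-defined code separately from nulls matters because a value such as 99 looks like valid data until it is declared as a blank.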

Data Understanding - Missing Data

Data Understanding - Anomalous Data
What is anomalous data?
- Far from the center of the distribution: the center is measured by the mean or median, using the standard deviation as a measure of spread
- Far from other values: whether close to the center of the distribution or not
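The two notions of anomalous data on this slide can be sketched with the Python standard library. This is an illustration only: the function names, the cutoff of 2 standard deviations, and the sample values are assumptions, not settings taken from the presentation.

```python
from statistics import mean, pstdev

def far_from_center(values, threshold=2.0):
    """Values more than `threshold` standard deviations from the mean."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return []
    return [v for v in values if abs(v - m) / s > threshold]

def nearest_gap(values):
    """Distance from each value to its closest other value."""
    return [
        min(abs(v - w) for j, w in enumerate(values) if j != i)
        for i, v in enumerate(values)
    ]

# Case 1: one value lies far from the center of the distribution.
skewed = [10, 12, 11, 13, 12, 11, 95]
print(far_from_center(skewed))

# Case 2: 50 sits exactly at the mean, yet it is far from every other value.
bimodal = [1, 2, 3, 50, 97, 98, 99]
print(far_from_center(bimodal))
print(nearest_gap(bimodal))
```

The second case shows why a distance-from-center rule alone is not enough: the isolated value is missed by the standard deviation check and only shows up in the gap to its neighbours.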

Data Understanding - Anomaly detection

SPSS Modeler User Interface

Data Sources
- Database: ODBC source
- Var. File: free-field text file
- Fixed File: fixed-field text file
- Statistics File / SAS File / Excel File
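The difference between a free-field (delimited) text file and a fixed-field text file can be illustrated outside Modeler. The sketch below uses Python's csv module for the delimited case and character positions for the fixed case; the sample data, field names, and column layout are hypothetical.

```python
import csv
import io

# Free-field text: fields are separated by a delimiter (here a comma).
free_field = "id,name,age\n1,Lin,34\n2,Chen,27\n"
var_rows = list(csv.DictReader(io.StringIO(free_field)))

# Fixed-field text: each field occupies fixed character positions.
fixed_field = "001Lin   34\n002Chen  27\n"
layout = [("id", 0, 3), ("name", 3, 9), ("age", 9, 11)]  # assumed positions
fixed_rows = [
    {name: line[start:end].strip() for name, start, end in layout}
    for line in fixed_field.splitlines()
]

print(var_rows[0])
print(fixed_rows[1])
```

In both cases every value arrives as a string, so the storage and measurement level of each field still have to be set afterwards.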

Data Understanding - The Data Audit node
Provides a report on:
- Missing values
- Outlier data and extreme data
- Information on a field's distribution
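A rough idea of what such an audit report contains can be sketched in plain Python. This is not the Data Audit node itself: the function `audit_field`, the sample field, and the cutoffs of 3 and 5 standard deviations for outliers and extremes are assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def audit_field(values, outlier_sd=3.0, extreme_sd=5.0):
    """Summarise one numeric field: missing values, range, outliers, extremes."""
    present = [v for v in values if v is not None]
    m, s = mean(present), pstdev(present)
    # Distance of each value from the mean, in standard deviations.
    z = [abs(v - m) / s for v in present] if s > 0 else [0.0] * len(present)
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": round(m, 2),
        "outliers": sum(outlier_sd < x <= extreme_sd for x in z),
        "extremes": sum(x > extreme_sd for x in z),
    }

income = [50] * 20 + [None, None] + [500]   # hypothetical field
report = audit_field(income)
print(report)
```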

Data Understanding
- Anomaly detection models identify outliers or unusual cases by using cluster analysis.
- Each record is assigned an anomaly index: the ratio of its group deviation index to the average group deviation index over the cluster that the case belongs to.
- Cases with an index value greater than 2 could be good anomaly candidates.
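The ratio described on this slide can be worked through on a small example. The sketch below assumes that each record already has a group deviation index and a cluster label; how those are computed is not covered here, and the numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def anomaly_indices(gdi, clusters):
    """Each record's group deviation index divided by its cluster's average."""
    by_cluster = defaultdict(list)
    for g, c in zip(gdi, clusters):
        by_cluster[c].append(g)
    avg = {c: mean(gs) for c, gs in by_cluster.items()}
    return [g / avg[c] for g, c in zip(gdi, clusters)]

# Hypothetical group deviation indices and cluster assignments.
gdi = [1.0, 1.0, 1.0, 1.0, 6.0, 2.0, 2.0]
clusters = ["A", "A", "A", "A", "A", "B", "B"]

indices = anomaly_indices(gdi, clusters)
candidates = [i for i, x in enumerate(indices) if x > 2]  # cutoff from the slide
print(indices)
print(candidates)
```

Because the index is relative to the record's own cluster, a deviation of 2.0 is unremarkable in cluster B, while 6.0 stands out in cluster A, where the average is 2.0.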

Data Understanding - Outlier Data: Live Demo
- SPSS Modeler UI
- Read data into SPSS Modeler
- Check missing data
- Check anomalous and outlier data: Data Audit node, Anomaly node

Live Demo & Exercise I

Data Preparation and Manipulation
Objective: construct the final dataset for modeling
- Record operations: select partial data from the dataset; sort the data
- Field operations
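The two record operations named here, selecting part of the data and sorting it, look like this in plain Python. The records and the selection condition are hypothetical, and this is only an analogy to what the corresponding Modeler nodes do.

```python
records = [
    {"id": 1, "age": 34, "spend": 120.0},
    {"id": 2, "age": 19, "spend": 40.0},
    {"id": 3, "age": 52, "spend": 310.0},
    {"id": 4, "age": 41, "spend": 75.0},
]

# Select: keep only the records that satisfy a condition.
selected = [r for r in records if r["age"] >= 30]

# Sort: order the remaining records by a field, highest spend first.
ordered = sorted(selected, key=lambda r: r["spend"], reverse=True)

print([r["id"] for r in ordered])
```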

Type: Specifies field metadata and properties

Type: Description
Continuous: Used to describe numeric values, such as a range of 0-100 or 0.75-1.25. A continuous value can be an integer, real number, or date/time.
Categorical: String values.
Nominal: Used to describe data with multiple distinct values, each treated as a member of a set.
Ordinal: Used to describe data with multiple distinct values that have an inherent order.
Flag: Used for data with two distinct values that indicate the presence or absence of a trait, such as true and false, Yes and No, or 0 and 1.
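To make the distinctions between these measurement levels concrete, the sketch below guesses a level from a list of values. The rules are a deliberately simple assumption for illustration and are not how SPSS Modeler assigns types; in particular, an ordering has to be supplied by hand before a field is treated as ordinal.

```python
def guess_measurement(values, ordered_levels=None):
    """Guess a measurement level from the values of one field."""
    distinct = set(values)
    if len(distinct) == 2:
        return "Flag"            # two values: presence or absence of a trait
    if ordered_levels is not None and distinct <= set(ordered_levels):
        return "Ordinal"         # distinct values with a known inherent order
    if all(isinstance(v, (int, float)) and not isinstance(v, bool)
           for v in values):
        return "Continuous"      # numeric range
    return "Nominal"             # members of an unordered set

print(guess_measurement([0.75, 1.0, 1.25]))
print(guess_measurement(["Yes", "No", "Yes"]))
print(guess_measurement(["low", "high", "medium"],
                        ordered_levels=["low", "medium", "high"]))
print(guess_measurement(["red", "green", "blue"]))
```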

Filter: Filters and renames fields
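Dropping and renaming fields amounts to the following in plain Python. The helper `filter_fields` and the field names are hypothetical and only mirror the idea of the node.

```python
def filter_fields(record, drop=(), rename=None):
    """Return a copy of the record without dropped fields, applying renames."""
    rename = rename or {}
    return {rename.get(k, k): v for k, v in record.items() if k not in drop}

row = {"cust_id": 7, "nm": "Lin", "tmp_flag": 1}
clean = filter_fields(row, drop={"tmp_flag"}, rename={"nm": "name"})
print(clean)
```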

Derive: Modifies data values or creates new fields
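Creating a new field from existing ones can be sketched as below. The helper `derive`, the field names, and the body mass index formula are assumptions chosen as an example; they do not come from the presentation.

```python
def derive(record, name, formula):
    """Return a copy of the record with a new field computed by `formula`."""
    out = dict(record)
    out[name] = formula(record)
    return out

row = {"height_cm": 170, "weight_kg": 65}
with_bmi = derive(
    row,
    "bmi",
    lambda r: round(r["weight_kg"] / (r["height_cm"] / 100) ** 2, 1),
)
print(with_bmi)
```

Returning a copy keeps the original record unchanged, which makes it easy to compare the data before and after the derivation.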

Reclassify
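Reclassifying means mapping the categories of a field onto a new, usually smaller, set of categories. A minimal sketch in plain Python, with a hypothetical city field and grouping:

```python
cities = ["Taipei", "New Taipei", "Kaohsiung", "Tainan", "Hualien"]

# Hypothetical mapping from original categories to new ones.
mapping = {
    "Taipei": "North",
    "New Taipei": "North",
    "Kaohsiung": "South",
    "Tainan": "South",
}

# Categories without an explicit mapping fall into a default group.
regions = [mapping.get(city, "Other") for city in cities]
print(regions)
```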

Live Demo & Exercise II

Thank You
Trugarez (Breton), Merci (French), Gracias (Spanish), Grazie (Italian), Obrigado (Brazilian Portuguese), go raibh maith agat (Gaelic), Dankon (Esperanto), Tack så mycket (Swedish), Tak (Danish), Danke (German), Thank You (English), Dank u (Dutch), Dekujeme Vam (Czech)