Using the Blaze Engine to Run Profiles and Scorecards

© 1993, 2016 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC. All other company and product names may be trade names or trademarks of their respective owners and/or copyrighted materials of such owners.

Abstract

The Informatica Blaze engine is an Informatica proprietary engine for distributed processing on Hadoop. You can run profiles and scorecards on the Blaze engine. This article discusses how you can use the Blaze engine to run profiles and scorecards in Informatica Developer and Informatica Analyst.

Supported Versions

Data Quality 10.1
Big Data Management 10.1

Table of Contents

Blaze Engine Overview
Profiles and Scorecards on the Blaze Engine
Creating a Column Profile in Informatica Developer
Creating and Running a Column Profile in Informatica Analyst
Creating and Running a Scorecard in Informatica Analyst

Blaze Engine Overview

You can use the Blaze engine to run profiles or scorecards on data sources with large volumes of data, on a variety of data sources, and on big data. The Blaze engine increases the performance and scalability of processing on the Hadoop cluster.

The Blaze engine is a part of Informatica Big Data Management. After you download and install Informatica Big Data Management, you can use the Blaze engine to run profiles and scorecards in Informatica Developer and Informatica Analyst.

The Blaze engine is built on a memory-based data exchange framework that runs natively on YARN without depending on MapReduce or Hive. The Blaze distributed processing engine can scale and perform high-speed data processing of large, complex batch workloads using a natively embedded Informatica data transformation engine on Hadoop. As a result, profiles and scorecards are processed faster and the results are retrieved quickly from the cluster.

To run a profile or scorecard on the Blaze engine, the Data Integration Service submits jobs to the Blaze engine executor. The Blaze engine executor is a software component that enables communication between the Data Integration Service and the Blaze engine components on the Hadoop cluster. You can create Hadoop connections using the Developer tool, the Administrator tool, or infacmd.
You can use the Blaze engine to run column profiles and enterprise discovery profiles in the Developer tool. You can use the Blaze engine to run column profiles, enterprise discovery profiles, and scorecards in the Analyst tool.

Profiles and Scorecards on the Blaze Engine

You can run profiles or scorecards in the Hadoop environment on the Blaze engine after you install Informatica Big Data Management.

When you run the profiles or scorecards in the Hadoop run-time environment on the Blaze engine, the following process occurs:

1. The Analyst tool or Developer tool submits the job to the Profiling Service Module.
2. The Profiling Service Module breaks down the job into a set of mappings.
3. The Data Integration Service pushes the mapping execution to the Hadoop cluster through a Hadoop connection.
4. The profile results or scorecard results are saved in the profiling warehouse.
5. You can view the profile results in the Analyst tool or Developer tool and view the scorecard results in the Analyst tool.

Creating a Column Profile in Informatica Developer

You can choose to run column profiles and enterprise discovery profiles in Informatica Developer. In the Developer tool, column profiles include single data object profiles and multiple data object profiles.

To create a single data object profile, perform the following steps:

1. In the Object Explorer view, select the data object you want to profile.
2. Click File > New > Profile to open the profile wizard.
3. Select Profile and click Next.
4. Enter a name for the profile and verify the project location. Optionally, enter a text description of the profile. Click Next.
5. Review and edit the column selection, filter and sampling options, inference options, drill-down options, and data domain selection as per your requirements.
6. In the Run Settings section:
   Choose Hadoop as the validation environment.
   Choose Hadoop to use the Blaze engine to run the profile and select a Hadoop connection.
   [Image: run-time environment options that you can choose to run a column profile in the Developer tool]
7. Click Finish.

The Blaze engine runs the profile and the profile results appear in the Developer tool.
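The five-step run-time flow described under Profiles and Scorecards on the Blaze Engine can be pictured with a small toy model. The sketch below is illustrative only: it does not call any Informatica API, and every name in it (Mapping, ProfilingWarehouse, run_profile, the connection name, the sample rows) is hypothetical. It also assumes one mapping per column and uses a simple null count as a stand-in for real profile statistics, which the article does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Mapping:
    """One unit of work derived from a profile job (hypothetical)."""
    job_name: str
    column: str


@dataclass
class ProfilingWarehouse:
    """Stand-in for the profiling warehouse that stores results."""
    results: dict = field(default_factory=dict)

    def save(self, job_name, column, result):
        self.results.setdefault(job_name, {})[column] = result


def break_into_mappings(job_name, columns):
    # Step 2: the Profiling Service Module breaks the job into mappings.
    # One mapping per column is an assumption made for this sketch.
    return [Mapping(job_name, column) for column in columns]


def run_on_hadoop(mapping, hadoop_connection, rows):
    # Step 3: the Data Integration Service pushes mapping execution to the
    # cluster through a Hadoop connection. A local computation stands in
    # for the cluster here; the null count is a placeholder statistic.
    values = [row.get(mapping.column) for row in rows]
    return {
        "connection": hadoop_connection,
        "row_count": len(values),
        "null_count": sum(value is None for value in values),
    }


def run_profile(job_name, columns, rows, hadoop_connection, warehouse):
    # Step 1: the Analyst tool or Developer tool submits the job.
    for mapping in break_into_mappings(job_name, columns):
        result = run_on_hadoop(mapping, hadoop_connection, rows)
        # Step 4: results are saved in the profiling warehouse.
        warehouse.save(job_name, mapping.column, result)
    # Step 5: the tool reads the results back for display.
    return warehouse.results.get(job_name, {})


# Example run with made-up data and a made-up connection name.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
]
warehouse = ProfilingWarehouse()
results = run_profile(
    "customer_profile", ["id", "email"], rows, "my_hadoop_connection", warehouse
)
print(results)
```

The point of the model is the separation of roles: the client tool only submits the job and reads results, while splitting, execution, and storage happen in separate components, as in the process listed in the article.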

Creating and Running a Column Profile in Informatica Analyst

You can create and run column profiles and enterprise discovery profiles on the Blaze engine in the Analyst tool.

To create a column profile in the Analyst tool, perform the following steps:

1. In the Discovery workspace, select New > Profile from the header area.
2. The Single source option is selected by default. Click Next.
3. Enter information about the profile in the Specify General Properties screen and the Select Source screen. Click Next.
4. In the Specify Settings screen, choose the options, as required.
5. Choose Hadoop as the run-time environment. Click Browse to select a Hadoop connection in the Select a Hadoop Connection dialog box.
   [Image: run-time environment options that you can choose to run a column profile in the Analyst tool]
6. In the Specify Rules and Filters screen, create, edit, or delete a rule or filter, as required.
7. Click Save and Run to create and run the profile.

The Blaze engine runs the profile and the profile results appear in the summary view.

Creating and Running a Scorecard in Informatica Analyst

You can create and run scorecards in Informatica Analyst. You can also create scorecards in Informatica Developer and run the scorecards in the Analyst tool. You can use the Blaze engine to run scorecards.

To create and run a scorecard in the Analyst tool, perform the following steps:

1. In the Library workspace, select the project or folder that contains the profile.
2. Click the profile to open the profile. The profile results appear in the summary view in the Discovery workspace.
3. The Add to Scorecard wizard appears.
4. In the Add to Scorecard screen, choose to create a new scorecard. Click Next.
5. In the Step 2 of 8 screen, enter a name and a description for the scorecard. Select the project and folder where you want to save the scorecard. Click Next.
6. Enter information in the Step 3 of 8 screen through the Step 7 of 8 screen, as required. Click Next.
7. In the Step 8 of 8 screen, choose Hadoop as the run-time environment. Click Choose to select a Hadoop connection in the Select a Hadoop Connection dialog box.
   [Image: run-time environment options that you can choose to run the scorecard]
8. Click Save & Run to save and run the scorecard.

The Blaze engine runs the scorecard in the Hadoop cluster and the scorecard results appear in the Scorecard workspace.

Author

Lavanya S
Senior Technical Writer