Amsterdam. (technical) Updates & demonstration. Robert Voermans Governance architect

1 (technical) Updates & demonstration Robert Voermans Governance architect Amsterdam

2 Please note
IBM's statements regarding its plans, directions, and intent are subject to change or withdrawal without notice and at IBM's sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.

3 IBM InfoSphere Information Server: Information Empowerment for Your Data Ecosystem
Integrating and transforming data and content to deliver accurate, consistent, timely and complete information through a unified platform with a common metadata foundation.
Information Governance Catalog (Understand & Collaborate):
- Catalog technical metadata & align with business language
- Manage (big) data lineage
- BCBS compliance reporting
Data Quality (Cleanse & Monitor):
- Analyze, validate, classify
- Cleanse & standardize
- Define, manage & monitor data rules + exceptions
Data Integration (Transform & Deliver):
- Massive scalability
- Power for any complexity
- Deliver in batch and/or real time with change capture
Common Connectivity / Shared Metadata / Security / Common Execution Engine, with Flexible Deployments (Hadoop, Grid, Cloud)

4 IBM Information Server Release Themes
- July 2014: Reducing the Platform Footprint
- September 2015 (11.5): Utilizing the Power of Hadoop
- December (11.7): Empowering the user through tailored design and automation

5 IBM Information Server V moving towards a user centric micro-service based architecture
- Automation & M/L: Increased automation for the Governance & Data Quality process
- Strengthen the Data Lake: Increasing speed and resilience on Hadoop
- Empower the User: New Self-service / User centered experiences for Integration and Governance
- Hybrid (Cloud) Deployment: More deployment options for Information Server components
- Simplified Licensing: Single License for Unified Governance & Integration
- Enable GDPR: Combining structured & unstructured data/content governance in ONE catalog
- Expanding the Reach: More out of the box connectivity for Cloud, Hadoop & Enterprise

6 Where Are We Heading?
[Architecture diagram; recoverable labels follow]
Services Tier: Flow Designer; Shape & Curate; Pattern & ML driven flow builder; Built-in Governance; Comprehensive Flow Design; Operations, Administration & Management; Open API; Projects Services
Engine Tier: PX; Spark; Batch; Real-time; Event-driven

7 Modernizing the Platform
- Simplified stack: reduction to 3 tiers
- Micro-service based architecture
- Enabling 1-click managed container-based deployment
- Embracing open source and removal of legacy software components
- Open API concept for easier integration into a larger application stack
- Loosely coupled runtimes for greater elasticity & flexibility

8 Functional Enhancements
- Enrich integration design with machine learning for faster, easier and more accurate design
- Source/Design Recommendation
- Transformation automation
- Utilizing machine learning for workload optimization
- Governance driven integration design
- Seamless Continuous Engineering support

9 User-Centric Integration Design (Empower the User)
- Automator: Business Analyst, Data Analyst, Data Scientist. Data-driven / fully guided self-service.
- Integrator: Shadow Integrator, Data Engineer, Integration Specialist. Graphical flow design with full control.
- API Developer: Full Stack Developer, Developer, Front End Developer. Service & API orchestration. (NEW)

10 DataStage Flow Designer: The New Integration Experience (Empower the User)
- Intuitive, browser-based (no-install) experience
- Reducing total cost of ownership
- Full backwards compatibility
- Accelerated productivity through automatic schema propagation, highlighted design errors, powerful type-ahead search, and server-side compilation

11 Enhanced Architecture to Support DFD (Empower the User)
[Architecture diagram; recoverable labels follow]
UI: DataStage Ops Console; Windows-based DS / QS Designer; DataStage Flow Designer (new); Extensions
Services: Designer + Engine API (new); Common Services (Metadata Services, IMAM, Security, Logging, Reporting); Monitor Cluster, Pipeline and Jobs; Unified Service Deployment
Metadata: XMETA (design and operational metadata); JSON Extensions (new)
Engine: PX Engine (parallel processing); Operators (Prepare, Join, Copy, Merge, Lookup, Sort, Funnel, Filter, Map); Connectors (DB2, ORA...)

12 Let's take a look

13 11.7 Data Quality Advances
Unified Governance
- Automation Rules
- Auto Discovery
- Auto Classification and Term Assignment
- Governed Quality
- Spark engine for Information Analyzer (Technology Preview)

14 Automation Rules
- Automatic actions/rules and DQ thresholds based on term assignments
- Enable/disable all or individual built-in data quality dimensions
- Auto-bind one or more Data Rule Definitions
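
The behaviour on this slide can be pictured with a small sketch. It is not the product's API: the class names, field names and dimension names below are all hypothetical, chosen only to show a rule that is triggered by a term assignment and then sets a threshold, switches off dimensions and binds rule definitions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AutomationRule:
    """Hypothetical rule that fires for columns carrying a given business term."""
    term: str
    quality_threshold: float                      # minimum acceptable DQ score, 0..1
    disabled_dimensions: set = field(default_factory=set)
    rule_definitions: list = field(default_factory=list)


@dataclass
class Column:
    """Hypothetical catalog column with its assigned terms and DQ configuration."""
    name: str
    terms: set
    threshold: Optional[float] = None
    dimensions: set = field(default_factory=set)
    bound_rules: list = field(default_factory=list)


# Illustrative dimension names only; not the product's built-in list.
ALL_DIMENSIONS = {"completeness", "format", "duplicates"}


def apply_rules(column, rules):
    """Apply every rule whose term is assigned to the column."""
    column.dimensions = set(ALL_DIMENSIONS)       # start with everything enabled
    for rule in rules:
        if rule.term in column.terms:
            column.threshold = rule.quality_threshold
            column.dimensions -= rule.disabled_dimensions
            column.bound_rules.extend(rule.rule_definitions)
    return column


if __name__ == "__main__":
    rule = AutomationRule("Email Address", 0.95, {"duplicates"}, ["EmailFormatCheck"])
    print(apply_rules(Column("EMAIL", {"Email Address"}), [rule]))
```

A column without a matching term keeps all dimensions enabled and gets no threshold or bound rules, which is the assumed default in this sketch.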

15 Auto Discovery
From a data connection, a simple dialogue starts a bulk operation to (optionally) perform the following:
- Import the table/file metadata
- Run Column Analysis
- Run Quality Analysis
- Auto Term Assignment
All analysis is controlled by the settings of the designated target Workspace (e.g. sampling settings).
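
As a rough illustration of the bulk operation described on this slide, the sketch below plans the steps for each table of a connection. The function, option and step names are hypothetical, and it assumes that the metadata import always runs while the three analysis steps are the optional ones; the slide does not say which steps are optional.

```python
from dataclasses import dataclass


@dataclass
class DiscoveryOptions:
    """Hypothetical switches for the optional steps of a discovery run."""
    column_analysis: bool = True
    quality_analysis: bool = True
    term_assignment: bool = True


def plan_discovery(tables, options, workspace_settings):
    """Return the ordered (step, table, settings) work items for a connection.

    Every step receives the same workspace settings, mirroring the slide's
    statement that the target Workspace controls all analysis.
    """
    steps = ["import_metadata"]                   # assumed to be mandatory
    if options.column_analysis:
        steps.append("column_analysis")
    if options.quality_analysis:
        steps.append("quality_analysis")
    if options.term_assignment:
        steps.append("auto_term_assignment")
    return [(step, table, workspace_settings) for table in tables for step in steps]


if __name__ == "__main__":
    settings = {"sample_size": 1000}              # illustrative sampling setting
    for item in plan_discovery(["CUSTOMER", "ORDERS"], DiscoveryOptions(), settings):
        print(item)
```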

16 Auto Classification and Term Assignment

17 Governed Quality

18 Spark engine for Information Analyzer (Technology Preview)
- IA, sitting outside Hadoop, dispatches work to run inside Hadoop
- Hadoop data stays put
- No Information Server components required on the Hadoop nodes (only Spark and Livy required)
- IADB results data stored on Hadoop under Hive control
- Only summary results (think Publish to IGC) must leave Hadoop
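
The slide does not say how Information Analyzer talks to Livy. As a general illustration of dispatching a Spark job from outside the cluster, the sketch below builds (but does not send) a request for Apache Livy's batch endpoint, `POST /batches`, using the documented `file`, `className` and `args` fields. The host name, jar path and class name are made up for the example and are not IA artifacts.

```python
import json
import urllib.request


def build_livy_batch_request(livy_url, app_file, class_name, args):
    """Build (without sending) a POST /batches request for a Livy server."""
    payload = {
        "file": app_file,          # application jar, readable from the cluster
        "className": class_name,   # main class of the Spark application
        "args": list(args),        # command-line arguments passed to the job
    }
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    request = build_livy_batch_request(
        "http://livy.example.com:8998",
        "hdfs:///apps/example/analysis-job.jar",
        "com.example.ColumnAnalysisJob",
        ["--table", "sales.customers"],
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Only the small job description crosses the network in this pattern; the data being analyzed is read by Spark inside the cluster, which matches the "Hadoop data stays put" point above.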

19 Let's take a look

20 Changing Enterprise Landscape: New Dimensions to the Data Challenge
- Diverse Locations for Data: Public Cloud, Private Cloud
- Structured & Unstructured Data: Need for regulation and insight goes across
- Speed of Data creation: IoT
- Many more Regulations: GDPR, BCBS, ARDR
- Data Democratization: Self Service anyone?
- Value Generation from Data: Analytics & AI

21 Data Challenges & Needs
Challenges:
- Data Duplication
- Dark Data
- Self Service driven Chaos
- A more complicated data life cycle
Needs:
- Finding the right data
- Trusted Data
- High Quality Data
- Auditability
- Speed and Agility
- Need for Governed Analytics

25 Enterprise Search
Search and uncover insights about information from across the entire Catalog, across all Asset Types and their relationships.
- Knowledge graph driven contextual search
- Uncover and visualize key relationships and usage of data
- Benefit from auto-discovery and classification services
- Interpret data through a common language (Glossary) and documented set of Policies
- Explore and graphically navigate through data usage and consumption or relationship
- Benefit from Ranking, Quality Assessment and Comments
- Expand the graph explorer, revealing containment, usage, dependency or other relationships
- Easily comprehend and learn the ancestry of data, or its consumption
- Benefit from discovered and governed data
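
To make "the ancestry of data" concrete, here is a minimal sketch of walking upstream through a lineage graph. It is a generic breadth-first traversal over a plain dictionary, not the catalog's graph explorer or query API; the asset names and the "derived from" edge layout are assumptions for illustration.

```python
from collections import deque


def ancestry(derived_from, asset):
    """Return all upstream assets of `asset`, nearest first.

    `derived_from` maps an asset name to the assets it was built from.
    """
    seen = {asset}
    order = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for parent in derived_from.get(current, []):
            if parent not in seen:                # guard against cycles and repeats
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


if __name__ == "__main__":
    # Hypothetical lineage: a report built on a mart fed by staging and reference data.
    lineage = {
        "report": ["mart"],
        "mart": ["staging", "ref"],
        "staging": ["source"],
    }
    print(ancestry(lineage, "report"))
```

Reversing the edge direction in the dictionary gives the consumption view (which assets use a given source) with the same function.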

26 Let's take a look

28 Movies

29 Governance Catalog

30 Information Governance Catalog Risk Data Aggregation

31 Unified Governance and Integration Use Case Video: A Personally Identifiable Information (PII) Story

32 Find, Use, and Govern Data with Information Governance Catalog

33 Data Quality

34 Using Information Server to Perform Data Quality Assessments

35 An Automated Data Discovery and Classification Story

36 A Data Quality and Shop for Data Story

37 Data Integration

38 Using DataStage to transform and synchronize cloud data repositories with on-premise transactional data