Common Customer Use Cases in FSI
Marketing Optimization
2014 MapR Technologies
Fortune 100 Financial Services Company
104M card members
Financial Services: Recommendation Engine & Real-time Targeting
Making personalized real-time offers to credit card customers
GLOBAL FINANCIAL SERVICES CORPORATION

OBJECTIVES
- Increase revenue and customer loyalty with real-time personalized offers

CHALLENGES
- Many different CRM tools and siloed targeting engines
- Developers and analysts are unable to access all customer data
- Want to increase speed and relevance of recommendations

SOLUTION
- MapR M7 centralizes analytics and operational apps on one platform
- Integrates all customer online and offline data into HBase in real time: card member spend graph, merchant data, location, and feedback
- Centralized customer data repository provides more accurate insights
- Uses Mahout machine learning to provide real-time personalized offers

Business Impact
- Increases revenue and improves customer experience through real-time targeting
- A more flexible, scalable platform that's a fraction of the cost of traditional technologies
- Ensures reliability with MapR's high availability and disaster recovery features
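To make the idea of offers derived from card member spend data concrete, here is a minimal sketch of an item co-occurrence recommender. It is not the company's implementation and does not use Mahout's API; it swaps in plain pair counting as a stand-in for item-based collaborative filtering. The member IDs, merchant names, and the functions `build_cooccurrence` and `recommend` are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(histories):
    """Count how often each pair of merchants appears in the same member's history."""
    cooc = defaultdict(Counter)
    for merchants in histories.values():
        for a, b in combinations(sorted(set(merchants)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recommend(member, histories, cooc, top_n=2):
    """Score merchants the member has not used by co-occurrence with ones they have."""
    seen = set(histories[member])
    scores = Counter()
    for merchant in seen:
        for other, count in cooc[merchant].items():
            if other not in seen:
                scores[other] += count
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [merchant for merchant, _ in ranked[:top_n]]


# Hypothetical spend histories: member -> merchants where the card was used.
histories = {
    "member_a": ["coffee_shop", "bookstore", "airline"],
    "member_b": ["coffee_shop", "bookstore", "hotel"],
    "member_c": ["coffee_shop", "airline", "hotel"],
    "member_d": ["coffee_shop", "bookstore"],
}

cooc = build_cooccurrence(histories)
offers = recommend("member_d", histories, cooc)
print(offers)
```

In a setup like the one on the slide, the co-occurrence counts would typically be computed in batch, and the resulting per-member offers written to a table that operational apps read at request time.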
One Platform for Operations, Analytics Teams, and Apps

[Diagram: MapR Distribution for Hadoop]
- Data sources: clickstream activity (logs), transactions, merchant information
- Operational applications: web (personalized offers), call center (contextual customer support)
- Model development: model training, testing, deployment
- Processing: streaming (Storm, Spark Streaming); batch (MR, Spark, Hive, Pig, ...); interactive (Drill, Impala, Presto, ...)
- Storage: HBase, other data stores; MapR Data Platform, NFS; MapR M7 Tables; snapshots

Developer / Data Scientist Benefits:
- Time: Run models at will
- Accuracy: Model at scale
- Skills: Use existing tools & libraries (NFS)
- Consistent data (snapshots)

Operations Benefits:
- Ingest data correctly and coherently
- Snapshots: Isolate modeling teams from production data
- Cost: Fewer systems and less data movement
Fraud Detection
Fraud Detection

Retrospective Analysis
- Batch-oriented manual process
- Custom/3rd-party fraud detection engines supplemented by manual checks and machine learning
- Traditional method deployed at large banks and insurance companies

Real-time Fraud Analysis
- Threat and fraud detection in real time
- New technologies for real-time advantage, but supplemented by large-scale machine learning and retrospective analysis
- The Apples and Amexes of the world
Zions Bank: Fraud Detection
Cost-effective security analytics and fraud detection on one platform

OBJECTIVES
- The Fraud Operations and Security Analytics team at Zions maintains data stores, builds statistical models to detect fraud, and then uses these models to mine data and evaluate suspicious activity

CHALLENGES
- Existing technology infrastructure could not scale
- Timeliness of reports degraded over the last several years

SOLUTION
- Chose MapR and cut storage costs by 50%
- Query time reduced from 24 hours to 30 minutes on 1.2 PB of data
- Leverages MapR scale for increased model accuracy and deeper insights

Business Impact
"We initially got into centralizing all of our data from an information security perspective. We then saw that we could use this same environment to help with fraud detection."
Michael Fowkes, SVP Fraud Operations and Security Analytics
Detecting Fraud: Using Anomaly Detection

Anomaly Detection
- Machine learning technique that tells you what you were not looking for
- Can be adjusted to varying levels of precision: fewer false positives vs. fewer false negatives
- Typically followed by retrospective analysis for final validation and action
- Typically done in batch analytics mode
- There is constant adjustment of models
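The trade-off between false positives and false negatives can be illustrated with a small batch-mode sketch. This is only one simple anomaly detection approach (a robust score based on the median and the median absolute deviation), not the method used by any bank named here. The transaction amounts and the function `flag_anomalies` are hypothetical.

```python
from statistics import median


def flag_anomalies(amounts, threshold):
    """Return indices whose distance from the median exceeds threshold * MAD."""
    center = median(amounts)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(a - center) for a in amounts)
    if mad == 0:
        return []
    return [i for i, a in enumerate(amounts) if abs(a - center) / mad > threshold]


# Hypothetical card transaction amounts for one member.
amounts = [20.0, 35.0, 18.0, 42.0, 25.0, 30.0, 22.0, 950.0, 28.0, 120.0]

# A high threshold flags little: fewer false positives, more false negatives.
strict = flag_anomalies(amounts, threshold=20)
# A lower threshold flags more: fewer false negatives, more false positives.
moderate = flag_anomalies(amounts, threshold=5)
loose = flag_anomalies(amounts, threshold=1.5)

print(strict, moderate, loose)
```

Whatever the threshold, the flagged transactions would still go to retrospective analysis for final validation, and the threshold itself is one of the things that gets adjusted over time.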
Zions Bank on CDH

[Diagram: separate production and development CDH clusters]
- Inputs: web server data, transactional data, landing area
- PRD CDH: ETL and batch analytics
- DEV CDH: ETL and batch analytics, develop new models
- Other components: retrospective analysis, 3rd-party real-time fraud detection
- Drawbacks (marked X): duplicated processing, doubled maintenance
Zions Bank with MapR: Faster Ops at Low Costs

[Diagram: PRD and Dev on MapR, a unified platform for data]
- Inputs: web server data, transactional data, landing area, NFS
- Workloads: ETL and batch analytics, retrospective analysis, 3rd-party real-time fraud detection, develop new models
- Labels also shown: PRD CDH, DEV CDH, X duplicated processing, X doubled maintenance

Technical Benefits
- High availability
- Multi-tenancy
- Snapshots
- Performance

Business Benefits
- Lower operating costs
- Operational guarantees
- Faster model development
Global Financial Services: Fraud Detection and Beyond
Analytics + operational applications on one platform

[Diagram: MapR Distribution for Hadoop]
- Real-time operational applications: online transactions, fraud detection, personalized offers
- On the platform: fraud model, recommendations table
- Analytics: fraud investigator (fraud investigation tool), interactive marketer (clickstream analysis)
Insurance Companies: Claims Fraud Detection
High-performance machine learning to detect fraud patterns

OBJECTIVES
- Identify and combat fraudulent claims on a daily basis by leveraging more data sources (cost of insurance industry fraud: $80B)

CHALLENGES
- Rising insurance premiums due to increased operational costs
- Increased fraud sophistication
- Existing technologies and sampling methods proving ineffective
- Increasing variety of data that cannot be processed by traditional means

SOLUTION
- MapR provides a cost-effective Hadoop platform to process large volumes and variety of data
- MapR provides world-record performance, delivering the fastest ROI for Hadoop
- Existing software and analytical libraries work on MapR with no code changes

Business Impact
- MapR enterprise-grade features coupled with real-time data ingestion provide the most reliable real-time platform to build deep analytical applications to combat fraud
Security Log Analysis & Enterprise Data Vault
F100 bank accelerates log analytics to meet investigation and compliance mandates
LARGE FINANCIAL SERVICES INSTITUTION

OBJECTIVES
- Meet compliance requirements to minimize lawsuits and fines
- Complete IT audits more quickly

CHALLENGES
- Prior system (flat files on Unix) was difficult for the operations team to maintain
- HA and data protection issues in HDFS put critical data at risk
- File volume (300K files/day) was straining the system

SOLUTION
- Seamless Hadoop file movement & management: MapR NFS
- MapReduce enables archival of data for historical search and analysis
- Data is indexed into Elasticsearch from MapR for real-time search
- Customizable user interface and dashboard: Kibana (ELK stack)

Business Impact
- Bullet-proof data vault that meets SEC and FINRA requirements
- 46x cost savings over Splunk: $60K/yr vs. $2.8 million/yr
- Efficiency of a MapR cluster that can store the Elasticsearch index for real-time search
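As a quick check of the cost comparison, the ratio and the yearly difference can be worked out from the two figures on the slide. The variable names are illustrative; only the two dollar amounts come from the source.

```python
# Annual costs quoted on the slide, in US dollars.
splunk_cost_per_year = 2_800_000
mapr_cost_per_year = 60_000

# Ratio of the two annual costs, and the absolute difference per year.
cost_ratio = splunk_cost_per_year / mapr_cost_per_year
annual_difference = splunk_cost_per_year - mapr_cost_per_year

print(f"ratio: {cost_ratio:.1f}x, difference: ${annual_difference:,}/yr")
```

The ratio comes to roughly 46.7, which is consistent with the "46x" on the slide if the figure was rounded down.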
Enterprise Data Hub (DW Optimization)
TDWI: Evolving Data Warehouse Architectures
Hadoop Uses in Data Warehouse Environment
1. Data Staging
2. Data Archiving
3. ETL
4. Big Data Analytics
TDWI, April 2014
Data Warehouse Optimization
Improve data services to customers while reducing enterprise architecture costs
FORTUNE 100 TELCO

OBJECTIVES
- Provide cloud, security, managed services, data center, & comms
- Report on customer usage, profiles, billing, and sales metrics
- Improve service: Measure service quality and repair metrics

CHALLENGES
- Reduce customer churn: identify and address IP network hotspots
- Cost of ETL & DW storage for growing IP and clickstream data; >3 months
- Reliability & cost of Hadoop alternatives limited ETL & storage offload

SOLUTION
- MapR Data Platform for data staging, ETL, and storage at 1/10th the cost
- MapR provided the smallest datacenter footprint with the best DR solution
- Enterprise-grade: NFS file management, consistent snapshots & mirroring

Business Impact
- Increased scale to handle network IP and clickstream data
- Reduced workload on the DW to maintain reporting SLAs to the business
- Unlocked new insights into network usage and customer preferences
MapR Optimized Data Architecture

[Diagram: Optimized Data Architecture on the MapR Distribution for Hadoop]
- Sources: relational, SaaS, mainframe; documents, emails; blogs, tweets, link data; log files, clickstreams; sensors
- Data movement: data transformation, enrichment, and integration
- MapR Data Platform: MapR-DB, MapR-FS
- Data access: streaming (Spark Streaming, Storm); NoSQL ODBMS (HBase, Accumulo, ...); interactive (Impala, Drill); batch/search (MR, Spark, Hive, Pig)
- Machine learning and operational apps: recommendations, fraud detection, logistics
- Analytics: search, schema-less data exploration
- Data warehouse: BI, reporting, ad-hoc integrated analytics
Informatica: Data Integration Optimization Architecture

[Diagram: Offload / Enrich / Reload]
- Callouts: load more data sources; enrich data in Hadoop; analyze
- Sources: relational, SaaS, mainframe; documents, emails; blogs, tweets, link data; log files, clickstreams
- Informatica components: PowerCenter (parse, profile, ETL); Data Quality (cleanse, match); PowerExchange (load); Data Replication (replicate, CDC); Vibe Data Stream (high-speed streaming)
- MapR Distribution for Hadoop: MapR Control System (MCS); Hadoop User Experience (HUE); batch processing (MR, YARN, Hive, Pig, etc.); interactive querying (Drill, Impala, Presto, etc.); M7, HBase, other data stores; MapR File System (HDFS), NFS; MapR M7 Tables
- Targets: data warehouse, data marts; BI reports and applications
Hadoop Use Cases
Common Use Cases: Taking Advantage of Hadoop

ENTERPRISE DATA HUB
- Multi-structured data staging & archive
- ETL / DW optimization
- Mainframe optimization
- Data exploration

MARKETING OPTIMIZATION
- Recommendation engines & targeting
- Customer 360
- Click-stream analysis
- Social media analysis
- Ad optimization

RISK & SECURITY OPTIMIZATION
- Network security monitoring
- Security information & event management
- Fraudulent behavioral analysis

OPERATIONS INTELLIGENCE
- Supply chain & logistics
- System log analysis
- Manufacturing quality assurance
- Preventative maintenance
- Smart meter analysis