Developing Industry Solutions using IBM Counter Fraud Management


Rishi S Balaji ribalaji@in.ibm.com
Bhavik H Shah bhavik.shah@in.ibm.com
Suraj Kumar surajkum@in.ibm.com
Sunil Lakshmana sunilaks@in.ibm.com

Document Version 1.0

Abstract: IBM Counter Fraud Management (ICFM) is a product from IBM that provides an integrated platform for businesses that need to develop counter fraud solutions. The platform provides the necessary components that are commonly required by counter fraud solutions across all industries, thus reducing the time and effort required to develop them. It also provides end-to-end implementations of certain industry use cases such as Know Your Customer, List Screening and Anti-Money Laundering, to name a few. Developing industry solutions leveraging the ICFM platform requires an understanding of what comes out of the box and how to leverage the platform features to accelerate the development process. This article describes these steps, starting from requirements gathering through design and implementation. The target audience includes technical architects and developers who are involved in solution development using the ICFM platform. The expected technical level of expertise is intermediate or above.

Table of Contents

1 Introduction
1.1 What does the ICFM product contain?
1.2 The Counter Fraud Management Process Flow
1.3 The Counter Fraud Solution Development Process
2 Integrating the ICFM Architecture with the Existing IT Architecture
3 Data Modelling & Ingestion
3.1 Data Modeling
3.2 Data Ingestion
4 Fraud Detection: Watch List Screening & Deterministic Rules
4.1 Integration with Identity Insight
4.2 Integration with SPSS
4.3 Integration with Rules
5 Customizing the ICFM Case Management Solution
5.1 Customizing the ICFM case solution
6 Conclusion
7 Resources
8 About the authors

Table of Figures

Figure 1: A typical Counter Fraud Management Flow
Figure 2: Counter Fraud Solution Development Process
Figure 3: IBM Counter Fraud Management High Level Architecture
Figure 4: Fact Store Data Model
Figure 5: Asynchronous Integration with Identity Insights
Figure 6: Integration with SPSS
Figure 7: End to End Analysis Flow

1 Introduction

IBM Counter Fraud Management (hereafter referred to as ICFM) is a product from IBM that delivers a single integrated solution addressing all the phases of enterprise counter fraud. Out of the box, it provides most of the common functionality that is required by counter fraud solutions in the detect, respond, discover and investigate phases of counter fraud management. It includes the tools and technology required to develop new counter fraud solutions and integrate them with the core systems in the enterprise and, based on the context, with other external systems. For more information on these phases and IBM's approach to counter fraud management in general, please refer to the link provided in the Resources section of this white paper.

1.1 What does the ICFM product contain?

As a first step towards developing counter fraud solutions using ICFM, it is important to understand what ICFM contains. A complete installation of ICFM comes with:

1. Technology components to design, implement and deploy fraud detection, scoring, investigation and reporting capabilities. Some of the key capabilities include:
   - Entity Analytics, enabled through Identity Resolution and Watch List Screening
   - Predictive Analytics, enabled through statistical modelling and deterministic rules
   - Content Analytics for unstructured text analysis
   - Case management and network analysis for fraud investigation
   - Dashboarding and reporting capabilities
2. A cross-industry data model (logical and physical) with the key entities required for counter fraud management. Additionally, there are industry-based extensions of the data model for the banking and insurance industries.
3. Complete end-to-end implementations of certain common industry use cases in the banking and insurance domains. These implementations include data model extensions, fraud detection rules, statistical models and case types for investigation.
4. A case management solution with specific industry case types, roles and workflows that can be used out of the box for investigating fraudulent cases.
5. Dashboards with a number of common reports that may be of interest to executives, case supervisors and administrators.
6. Frameworks and patterns for a number of tasks, such as the development, deployment and governance of analysis flows, bringing external events into the ICFM model (the subscription framework), case creation/update and so on.

1.2 The Counter Fraud Management Process Flow

The following figure shows the end-to-end (run time) process involved in counter fraud management.

Figure 1: A typical Counter Fraud Management Flow

As seen in figure 1:

1. As the first step, data needs to be ingested into the counter fraud system. The input data could be transaction, party or any other event data that is relevant to the context. Note that not all the data attributes that the core systems store are necessary for fraud analytics. Only the data fields that are necessary are brought into the ICFM data store.
2. Based on the scenario that is being implemented, summary information is computed for use during fraud detection. For example, in a banking scenario, monthly credit/debit averages and the total yearly value of transactions could be summaries computed from the transaction history of an entity (a brief illustrative sketch of such a computation is shown at the end of section 1.3). Note that this is an optional step; such summaries are not always relevant, depending on the use case in context.
3. Once the data is in the system, the next step is to resolve the identities of the involved parties and screen them against watch lists. These parties could, for example, be new customers or transacting parties in a banking scenario, or the claimant, hospital and third parties involved in a claim in an insurance scenario. Watch list screening is also performed at this stage to identify parties who may appear in internal or external black lists. ICFM handles identity resolution and watch list screening out of the box. Parties ingested into the fact store are asynchronously resolved and screened against watch lists. Necessary alerts are created and cases are queued for investigation, if required. New industry use case implementations can leverage the watch list screening assessments through a list screening service that is provided by the ICFM List Screening content pack.
4. With the identities resolved, the counter fraud system knows the entities that it is dealing with, and so it proceeds to detect fraud based on the entity's history, the current input data and the computed summary data. The detection process involves the application of statistical models and deterministic rules to detect fraudulent behavior and rate the level of risk involved in the input data. The rating would typically indicate a high, medium or low risk.

5. Based on the risk rating of the input record, the counter fraud system decides whether to create a case for further investigation. The case creation process is handled by the ICFM platform, based on the fraud assessment that is created. ICFM also provides a case solution with specific case types for certain banking and insurance scenarios, as well as a default case type with basic workflows. The investigation process goes through a workflow and may also involve network analysis and unstructured text analysis before the case is closed out.
6. The results of the fraud assessments made in steps 3 and 4 and the case statistics from step 5 may be used to create executive dashboards that give a quick overview of the overall counter fraud management status of an enterprise. A number of dashboards are available out of the box with ICFM. Any additional dashboards can be built using the reporting tools provided by ICFM.
7. The results of investigation may be continuously fed back into the rules and models to make them smarter for future detection.

1.3 The Counter Fraud Solution Development Process

This section provides a short answer to what it takes to develop a counter fraud solution for a specific industry use case using the ICFM platform. Figure 2 (below) summarizes the sequence of steps involved in developing a typical counter fraud solution using ICFM. Requirements gathering, architecture definition, data design and interface specifications need to be performed in sequence as the first steps. The individual components can then be designed and developed in parallel. The analysis flow that integrates all the components in the fraud detection process has a dependency on the design of the individual components such as rules and models. The interface definitions for each of the components allow the analysis flow to be designed. For example, the rule service definition will provide part of the information necessary to design the input data for the analysis flow. This information may change as the rule design matures. As and when the individual components mature, the analysis flow is incrementally updated with more details. This enables the end-to-end testing of the integrated flow to be performed incrementally over the course of the development phase. The user interfaces in ICFM are all integrated into a single web-based interface in IBM Content Navigator, which provides ICFM's user interface platform. Case management interfaces are the primary ones that involve design and development. A good portion of this comes out of the box from ICFM. If custom dashboards are required, they involve additional user interface development as well.
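To make the summary computation mentioned in step 2 of the process flow a little more concrete, the following is a minimal, illustrative Python sketch that computes monthly credit/debit averages and a yearly total from an entity's transaction history. This is not ICFM code: in an actual solution such summaries would be computed within the ICFM data and analytics components, and the field names used here are assumptions.

    # Illustrative only: field names (date, amount, direction) are assumptions,
    # not the ICFM fact store schema.
    from collections import defaultdict
    from datetime import date

    transactions = [
        {"date": date(2015, 1, 12), "amount": 1200.0, "direction": "credit"},
        {"date": date(2015, 1, 28), "amount": 300.0,  "direction": "debit"},
        {"date": date(2015, 2, 3),  "amount": 4500.0, "direction": "credit"},
    ]

    def monthly_averages(txns):
        """Average credit and debit amounts per (year, month) for one entity."""
        sums = defaultdict(lambda: {"credit": [0.0, 0], "debit": [0.0, 0]})
        for t in txns:
            bucket = sums[(t["date"].year, t["date"].month)][t["direction"]]
            bucket[0] += t["amount"]
            bucket[1] += 1
        return {month: {d: (total / count if count else 0.0)
                        for d, (total, count) in v.items()}
                for month, v in sums.items()}

    yearly_total = sum(t["amount"] for t in transactions)  # total yearly value
    print(monthly_averages(transactions), yearly_total)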

Figure 2: Counter Fraud Solution Development Process

Requirements Gathering

The first step in developing counter fraud solutions, as in any other solution development, is to gather the functional and non-functional requirements. The functional requirements would primarily be around fraud detection, investigation and reporting. The non-functional requirements would list out constraints around data ingestion, real time vs batch mode of execution of analytics, response time constraints on analytics and so on. Functional requirements for counter fraud applications are usually defined in terms of:
- Detection rules based on the business domain or as mandated by regulatory authorities
- Case investigation requirements, including workflow definitions and user interfaces
- Requirements around specific kinds of analysis, such as relationship analysis and unstructured data analysis
- Reporting requirements
- Governance requirements around various aspects of fraud detection and investigation (for example, rule change governance)

Integrating the ICFM Architecture with the Existing IT Architecture

ICFM provides a high level solution architecture for counter fraud solutions. All the industry content that is provided with ICFM is based on this architecture. It is very likely that the enterprise implementing the counter fraud solution already has an IT architecture in place. The ICFM architecture needs to be integrated with the existing IT architecture at the data, application and user interface levels. This step is detailed further in the Integrating the ICFM Architecture with the Existing IT Architecture section in this document.

Data Modelling & Ingestion

ICFM provides a data model covering most of the important aspects of fraud detection and investigation. This data model needs to be validated and optionally extended in the context of the current solution. The detection rules and reporting requirements collected in the requirements gathering step serve as a key input for data modeling. This step is detailed further in the Data Modelling & Ingestion section in this document.

Fraud Detection: Identity Resolution, Watch List Screening, Rules & Models

The detection rules gathered during the requirements gathering phase translate into one or more of the following categories of fraud detection:
- Rules for identity resolution and watch list screening of entities
- Deterministic rules that work on specific data elements
- Statistical models that perform anomaly detection and predictive analytics

ICFM provides industry content for specific industries such as banking and insurance. These ICFM Content Packs contain rule and model implementations for specific use cases. Based on the solution in context, these content packs may be leveraged as a starting point to develop further rules and models. This step is detailed further in the Fraud Detection: Watch List Screening & Deterministic Rules section in this document.

Fraud Investigation

The following three key features of ICFM enable fraud investigation:

1. The Case Management Solution: ICFM provides a case management solution that is deployed on the IBM Case Manager product. This solution can be extended with additional properties, workflows and user interface pages. This step is detailed further in the Customizing the ICFM case solution section in this document.
2. Network Analysis: ICFM provides sophisticated network analysis features through the Intelligent Analysis Platform (IAP). ICFM provides adapters that connect the fact store with the IAP, thus enabling detailed analysis of the relationships between the entities that play a role in counter fraud data.
3. Unstructured Data Analysis: ICFM supports unstructured data analysis through the IBM Watson Content Analytics (WCA) product. WCA can be accessed through the same Content Navigator interface that hosts the case solution. Case investigators can use WCA to analyze unstructured text during the investigation process.

Reporting

ICFM provides a number of basic reports in the out-of-the-box dashboard that comes with the installation. Additional reports can be built leveraging the data from the fact store and the case repositories.

Note: To keep the contents simple and provide an overview of the end-to-end process of developing a counter fraud solution, this white paper does not deal with the details of Network Analysis, Unstructured Data Analysis and Reporting in the context of an industry use case.

Sample Use Case

The rest of this document will use the following sample use case to demonstrate the best practices that are being described.

Use Case Description

Consider a banking transaction fraud scenario wherein two parties are transacting. One of the parties is blacklisted on a terrorist list. Also, the currency is flowing into a country that is considered high risk for fraud. The transaction is fed into the counter fraud system in real time, and a case needs to be created for offline investigation if the risk rating of the transaction is detected to be high.

2 Integrating the ICFM Architecture with the Existing IT Architecture

The following figure shows the technical architecture that ICFM is built on. Counter fraud solutions leveraging ICFM will use the same base architecture. However, not all solutions require the breadth of features provided by ICFM. Also, there might be instances where specific architectural changes may have to be made based on the target constraints and requirements. This section provides guidelines to adopt the ICFM architecture in such instances.

Figure 3: IBM Counter Fraud Management High Level Architecture

The key components of this architecture are:

1) The Party, Account, Transaction and other relevant data are ingested into the ICFM database, which is called the Fact Store. The fact store is the single source of truth for all counter fraud related data for the enterprise. While ingesting a copy of this data is the recommended way of operating, there are options to federate the source databases into ICFM to avoid replicating the data.
2) Watch list data from internal and external sources is published into the identity repository that is hosted by InfoSphere Identity Insight (ISII), which is the entity resolution engine. Identity Insight performs watch list screening as well as identity resolution.
3) The analysis director is a key component in ICFM that mediates a number of tasks, such as selecting registered analysis flows for execution, persisting fraud assessments returned by the analysis flows and initiating case creation.
4) The analysis flows primarily perform integration of fraud detection components such as watch list screening, statistical models and rules. They may optionally fetch the data that is required to perform fraud detection, gather the results of the scoring from the individual scoring components and pass on the assessment results to the analysis director, which may, based on the context, initiate case creation through the case wrapper. Analysis flows in ICFM are built on IBM Integration Bus and registered with the ICFM platform.
5) Case investigation workflows are enabled through IBM Case Manager, which is supported by IBM i2 Analyst's Notebook for network analysis and IBM Content Analytics for unstructured text analysis.
6) Data from the fact store and case repository are used to generate dashboards using Cognos Business Intelligence.

The ICFM architecture provides the foundational underpinnings required to build counter fraud solutions. The necessary detection rules and models that are specific to an industry use case will have to be built and integrated into the ICFM platform. Figure 3 shows such industry content in yellow boxes. The following are some guidelines for leveraging the ICFM architecture for a specific use case:

1. Identify the sources of data that need to be used for fraud detection. This will in turn involve decisions around how the data will be brought into the fact store. This will also include watch list data from external sources. Detailed data modelling will involve decisions around extending the fact store to accommodate additional data items. In the context of the sample use case, the data required to detect fraud is transaction, customer and blacklist data. The source of this data would typically be the core banking system.
2. Filter out components in the architecture based on the functional requirements. For example, in the context of the sample use case there is no need for unstructured data analysis, so the Content Analytics product is not required. There is also no need to perform any anomaly detection using SPSS; simple rules can be developed on ODM to detect this kind of fraud. There is, however, a need to perform watch list screening to identify blacklisted customers.
3. Decide on batch vs interactive mode of risk scoring. There may be a need for both, based on the use case. Related decisions around how, when and what data will move in and out of the counter fraud solution will have to be made based on batch vs interactive operation.

4. Decide on how the analysis flows will integrate the fraud models and rules into a single flow. Typically this may be web service based integration; however, there are other possibilities, such as message queue based integration. In the sample use case scenario, web service based integration is used: the rules are exposed as a web service and invoked from the analysis flow (see the Integration with Rules section).
5. Integration with core systems and external systems: the risk scoring results need to be fed back to the core systems for further processing. Based on the existing enterprise architectural standards, specific integration mechanisms will have to be identified. In the context of the sample use case, the interface with the core banking system needs to be resolved. A web service interface could be provided to ingest and score the transaction in real time. Alternatively, a queue based messaging interface may also be feasible.

Note that the steps mentioned above are only indicative of the kind of decisions that will have to be made while adopting the ICFM architecture for a specific use case. They are not meant to be complete or exhaustive. The following sections detail further how to go about the design and development on ICFM, with reference to the sample use case.

3 Data Modelling & Ingestion

Data modeling and ingestion is a crucial step once the solution architecture is laid out. The rest of the components in the solution are closely tied to the data model to retrieve data for fraud detection and to persist the fraud assessment results.

3.1 Data Modeling

As a first step in data modelling, it is important to understand what data is required to implement the requirements specified by the use case. The next step would be to see what part of it is provided by the ICFM base data model (the fact store model) and then decide on extensions if there is a gap. As per the sample use case, there are two parties transacting and money is flowing to a high-risk country. In order to realize the use case, we have to identify the entities and attributes required for fraud detection. Looking at the use case, we would need the following key entities:
- Party, with attributes such as the party's name, address, contact numbers, email addresses and so on
- Account, with account number, account address (specifically the country) and account branch
- Transaction, with details such as transaction id, amount, from account, to account, currency and so on

Figure 4 below shows a portion of the fact store data model. As seen in the figure, the fact store already contains party, transaction and account tables, so these base tables can be used as is for the sample use case.
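To relate the entities listed above to concrete structures, the following is a minimal, illustrative sketch of the three entities as simple Python data classes. This is not the ICFM fact store schema (which is a relational data model, a portion of which appears in Figure 4); the class and field names are assumptions chosen to mirror the attributes identified for the sample use case.

    # Illustrative entity sketch for the sample use case; names are assumptions
    # and do not reflect the actual ICFM fact store tables or columns.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Party:
        party_id: str
        name: str
        address: str
        contact_numbers: List[str] = field(default_factory=list)
        email_addresses: List[str] = field(default_factory=list)

    @dataclass
    class Account:
        account_number: str
        country: str                      # used by the high-risk-country rule
        branch: str
        owner: Optional[Party] = None

    @dataclass
    class Transaction:
        transaction_id: str
        amount: float
        currency: str
        from_account: Optional[Account] = None
        to_account: Optional[Account] = None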

Figure 4: Fact Store Data Model

3.2 Data Ingestion

There are several options for ingesting data into the fact store. The most common method is to use ETL jobs to transform and load the data. The ICFM platform provides an ingestion utility as well, which may be used to map and load data from source databases or CSV files. In the sample use case, one of the important requirements is to identify whether any of the transacting parties is blacklisted. This list may come from a source that is external to the enterprise or may be maintained internally. In either case, this data needs to be ingested into the ICFM system in order to leverage it for watch list screening. IBM InfoSphere Identity Insight (hereafter referred to as ISII) provides the watch list screening capability, so the watch list data needs to be ingested into ISII. When party data is ingested into ICFM, the internal framework of ICFM publishes this data from the fact store to ISII. In order to ingest watch lists into ISII, a data model needs to be created in ISII. Most of the basic data elements (such as names, addresses and identifications like passport, driving license and frequent flyer numbers) are included in the out-of-the-box configuration of ISII. In instances where there are identification elements in the customer data that are not present in ISII, these elements have to be created manually in ISII. For the sample use case, let us assume we use the default configuration of ISII.

4 Fraud Detection: Watch List Screening & Deterministic Rules

When data ingestion has been completed, the rules and models for fraud detection need to be designed and coded. Rules are typically developed in IBM ODM and statistical models are developed in IBM SPSS Modeler. This section shows how to integrate watch list screening results (from ISII) and risk scoring results (from ODM), and finally how to prepare the necessary data for case creation.
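Conceptually, a transaction monitoring analysis flow stitches these pieces together in sequence. The sketch below is illustrative pseudologic only: in ICFM the analysis flow is built as an IBM Integration Bus message flow, and the function names, service parameters and score fields shown here are assumptions rather than actual ICFM interfaces.

    # Illustrative orchestration of the detection sequence described in this
    # section; not an actual ICFM analysis flow. All names are assumptions.
    def assess_transaction(txn, screening_service, rule_service):
        """Return a fraud assessment for one incoming transaction."""
        # 1. Watch list screening assessments for the transacting parties,
        #    produced asynchronously by ISII and exposed via a list screening service.
        screenings = [screening_service(pid)
                      for pid in (txn["from_party_id"], txn["to_party_id"])]
        watchlist_score = max(s.get("match_confidence", 0) for s in screenings)

        # 2. Deterministic rules (for example, the high-risk destination country
        #    rule) deployed on ODM; the watch list score is passed in so the
        #    rules can aggregate it with the rule results.
        rule_result = rule_service(txn, watchlist_score)

        # 3. Hand the banded assessment back to the analysis director, which may
        #    initiate case creation based on it.
        return {
            "transaction_id": txn["transaction_id"],
            "watchlist_score": watchlist_score,
            "rule_score": rule_result.get("risk_score", 0),
            "risk_category": rule_result.get("risk_category", "LOW"),
        }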

4.1 Integration with Identity Insight

ISII is configured to generate alerts when it detects a match between a party, say a bank's customer, and a party in a black list. These alerts are consumed by the ISII subscriber, which is part of the ICFM core platform. The ISII plug-in converts the alerts into events in the CF database. These events are consumed by the List Screening analysis flow, which is part of the ICFM Industry Content Pack. The List Screening analysis flow generates assessments for each event based on certain rules. These assessments are available to other use cases, such as transaction monitoring, through a List Screening service. Cases for watch list screening matches are created by the List Screening use cases. In the context of the sample use case, the assessments for the parties involved in the transaction are accessed by the Transaction Monitoring analysis flow through the List Screening service. The results are then aggregated with the other transaction assessment scores (from rules and statistical models) to generate the final transaction assessment. Figure 5 below shows the sequence of data flow through the various components, as described above.

Figure 5: Asynchronous Integration with Identity Insights

Inferences from the results returned by the ISII service, such as the watch list that the party matched with and the matched watch list party's name, can be stored in IIB's environment data structure to be used later in the analysis flow.
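As a small illustration of the kind of inferences that might be carried forward from the screening step, the sketch below extracts the matched watch list, the matched party's name and the match confidence from a screening response and stashes them for later use. It is illustrative only: the response fields and the stash structure are assumptions, and in an actual ICFM flow this would be done inside the IIB message flow (for example in a compute node writing to the environment tree), not in standalone Python.

    # Illustrative only: field names and structure are assumptions, not the
    # actual ISII / List Screening service response format.
    def extract_screening_inferences(screening_response, environment):
        """Copy the interesting parts of a watch list match into a stash that
        the rest of the analysis flow can read (analogous to IIB's environment)."""
        match = screening_response.get("best_match") or {}
        environment["screening"] = {
            "matched_watchlist": match.get("watchlist_name"),
            "matched_party_name": match.get("party_name"),
            "match_confidence": match.get("confidence", 0),
        }
        return environment

    # Example usage with made-up data:
    env = extract_screening_inferences(
        {"best_match": {"watchlist_name": "terrorist watch list",
                        "party_name": "John Doe", "confidence": 99}},
        {},
    )
    print(env["screening"]["match_confidence"])  # 99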

4.2 Integration with SPSS

Integration with statistical models in SPSS is typically done through a web service call. To enable this integration, the predictive/anomaly model in SPSS must be deployed and exposed as a service. The web service request created by the IIB flow should include all the input parameters, as defined while configuring the service in SPSS, along with their corresponding values. Figure 6 shows this integration flow. Note that in the context of the sample use case there is no need for a statistical model for risk scoring, so this step can be ignored.

Figure 6: Integration with SPSS

4.3 Integration with Rules

Deterministic rules for fraud detection are developed and deployed on ODM and are exposed as a web service. In the context of the sample use case, one of the rules would indicate that the country of currency transfer in the transaction is a high risk country. The results from watch list screening are aggregated along with the high risk country rule results. This gives a cumulative risk score and risk category for the transaction. For example, assume the High Risk Country rule adds a risk score of 50 to the transaction, and the watch list screening results show a match with the blacklist with a confidence score of 99. These two results are combined using an aggregation matrix that takes into consideration different ranges of the rule score and the watch list screening score, and comes up with a final risk category of high, medium or low. All the data elements of the rule request are created in a compute node, a mapping node or any other appropriate node in the IIB flow and delivered to the rules using the IIB SOAPRequest node. The rules response is then parsed to obtain the final risk categorization of the analysis. All the individual component integrations are built into a comprehensive analysis flow, as shown in figure 7. The score and risk status returned by ODM are passed to the analysis director through the XML response that the analysis flow generates. If the risk category (called the banded assessment) for the concerned context has an associated case action, then certain case properties should also be passed on to the analysis director through the XML response. At this stage, an end-of-analysis message is sent to the response aggregator flow.
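The aggregation described above can be pictured as a small matrix over score bands. The sketch below is purely illustrative: the band boundaries and the matrix entries are assumptions used to reproduce the sample numbers in this section (a rule score of 50 and a screening confidence of 99), and in ICFM this logic would live inside the ODM rules rather than in standalone code.

    # Illustrative aggregation matrix combining a deterministic rule score with
    # a watch list screening confidence. Bands and entries are assumptions.
    def band(score, low, high):
        """Return LOW / MEDIUM / HIGH for a score, given two band boundaries."""
        if score < low:
            return "LOW"
        if score < high:
            return "MEDIUM"
        return "HIGH"

    # Rows = rule score band, columns = watch list screening band.
    AGGREGATION_MATRIX = {
        ("LOW", "LOW"): "LOW",       ("LOW", "MEDIUM"): "MEDIUM",    ("LOW", "HIGH"): "HIGH",
        ("MEDIUM", "LOW"): "MEDIUM", ("MEDIUM", "MEDIUM"): "MEDIUM", ("MEDIUM", "HIGH"): "HIGH",
        ("HIGH", "LOW"): "HIGH",     ("HIGH", "MEDIUM"): "HIGH",     ("HIGH", "HIGH"): "HIGH",
    }

    def aggregate(rule_score, screening_confidence):
        rule_band = band(rule_score, low=30, high=70)                 # 50 -> MEDIUM
        screening_band = band(screening_confidence, low=50, high=90)  # 99 -> HIGH
        return AGGREGATION_MATRIX[(rule_band, screening_band)]

    print(aggregate(50, 99))  # HIGH -> a case would be created for investigation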

Figure 7: End to End Analysis Flow

5 Customizing the ICFM Case Management Solution

ICFM's case management solution is built and run on the IBM Case Manager product. This section details how to leverage the ICFM content in the case management area to develop a new counter fraud industry solution.

5.1 Customizing the ICFM case solution

ICFM comes with a case solution with pre-built functionality for certain industry use cases. In most instances, this case solution is the right place to start. It contains a number of pre-built industry case types with properties, tasks and workflows. One should be able to use an existing case type if it is relevant, or create a new one in the same solution. As a design best practice, perform a case solution design irrespective of how much of the ICFM content is used. This design should include the following key elements of a case solution:

1. Case properties
2. Investigation roles
3. Investigation tasks and associated workflows
4. User interface specification for case investigation

For each design element mentioned above, the ICFM case solution provides some content out of the box. The following guidelines indicate the steps to leverage the ICFM case solution and extend it for a specific use case:

1. ICFM's case solution comes with a case type named Case Investigation. This case type contains certain basic properties and a workflow for triaging, investigating and finalizing a case. In simple scenarios this should be sufficient. The ICFM case solution also provides case types for specific use cases in the banking and insurance domains. These may be used out of the box or customized for specific requirements. A new case type may have to be created for a completely new industry use case implementation that does not exist in the out-of-the-box ICFM case solution.
2. The ICFM case solution provides certain case properties such as account id, customer name and so on. It is very likely that additional properties may be required for a specific use case. Add the additional properties that have been identified to the case type. In the sample use case, these would be properties such as transaction id, transaction dates, transacting accounts, bank details and so on.

3. The Case Investigation case type provides a basic workflow that takes a case from triaging to the point of finalization and generating a report. It is likely that enterprise guidelines and regulatory compliance mandate certain workflow constraints. For example, in the sample transaction monitoring scenario, it is required to prepare and file a Suspicious Activity Report with the government authorities if a transaction fraud is detected. Such workflow enhancements have to be built into the case solution.
4. The ICFM case solution provides a number of pages to display case information, such as work details pages, case details pages and solution pages. It is very likely that one of these pages can be reused with the specific layout changes required for a use case. For advanced users, any specific functionality that cannot be achieved with the ICFM or Case Manager widgets will have to be built as custom widgets and deployed in Case Manager.
5. ICFM also provides a number of widgets, such as the Party Information widget, the Related Cases widget and so on, that are relevant to counter fraud solutions. New case types built on ICFM can leverage these widgets to enhance the functionality of the investigation process.

6 Conclusion

This white paper presented an overview of what it takes to develop a counter fraud solution using the IBM Counter Fraud Management product. It highlighted what ICFM provides out of the box and how it can be leveraged in the context of a sample use case. For more details on the specifics of ICFM, please refer to the links in the Resources section of this document.

7 Resources

Visit the IBM Counter Fraud Management page for an overview of ICFM. Visit the ICFM System Requirements page for a complete listing of the product versions included in ICFM.

8 About the authors

Rishi S Balaji is an Application Architect at IBM Global Business Services. His experience ranges from product development to reusable asset development and service delivery engagements. His core expertise is in technologies related to counter fraud analytics, SOA and JEE. He is currently involved in the design and development of counter fraud industry use cases for the IBM Counter Fraud Management platform.

Suraj Kumar is a Solution Architect at IBM's lab in Bangalore, India. He has more than sixteen years of experience in software development and integration technologies. He is currently working on Financial Sector Industry Solutions. He holds a degree in Electronics & Communications Engineering from the National Institute of Technology, Surathkal, India.

Sunil Lakshmana played the role of Data Architect for the AML solution development on the ICFM platform, which covered ICFM data model extension, configuration and solution of Identity Insight, data ingestion, i2 schema definition and link analysis, and SPSS models for customer statistical profiles, peer group profiles, and transaction profiling and monitoring services.

Bhavik Shah is a Solution Architect with the Industry Solutions team in India, working on the Banking and Insurance domains. He has a decade of experience, most of it in the Financial Sector. He also leads MDM initiatives for Banking Frameworks in the Industry Solutions team. He has worked with various clients in roles such as consultant and designer, involving various IBM products. He holds a Masters in Computer Science and a Bachelor of Engineering in Information Science.

Thanks to Bob Patten, Deployment Architect, IBM Counter Fraud Management, for reviewing the contents of this document.