Audit Sampling With MindBridge
Written by: Corey Yanofsky and Behzad Nikzad
Introduction

One of the responsibilities of an auditor to their client is to provide assurance that the rate of non-compliance with established operating procedures is low. This determination impacts business processes, valuation, and planning. The auditor must therefore determine, using robust methodologies, whether they can provide that assurance for a given file.

The most thorough way to determine the rate of non-compliance would be to examine every entry in the file, but this is not feasible. Instead, the auditor provides reasonable assurance by examining a subset of the data. The process of using a subset of the data to draw conclusions about all of the data is called sampling. Sampling is used in a variety of scientific and financial fields and is a valid method for performing an audit.

Naturally, there are drawbacks to performing an audit using a sample rather than the entire data set. One of them is sampling risk: the risk that the auditor reaches a different conclusion by examining a subset of the data than they would have reached by examining the entire data set. Sampling risk is accentuated when the entries of interest are rare, such as those containing non-compliance with established procedures.

Two methods are currently used to mitigate the effects of sampling risk:

1) Sampling risk can be decreased by increasing the sample size. Auditors are provided with guidance on the number of items they are required to examine in order to provide reasonable assurance.
2) Sampling risk can be decreased by selecting subsets which are more likely to be non-compliant. These subsets can be selected in one of two ways:
   - A senior auditor can plan the audit to focus on high risk portions of the file.
   - Statistical methods can be used to identify entries which are more likely to be non-compliant.

Let us take a journey to understand how Audrey, the senior auditor, uses sampling to give reasonable assurance. This journey will include an introduction to the robust methodologies used in an audit: sampling, hypothesis testing, and sampling from enriched subsets.
Sampling

Audrey receives a file (File 1) which contains 10,000 transactions. Unknown to Audrey, the file has 100 transactions which are non-compliant with established procedures. This is a 1% rate of non-compliance. A visual representation of this file is shown in the figure below:

[Figure: visual representation of File 1]

Audrey tells four junior auditors to examine 100 transactions each. These transactions are chosen at random. The four junior auditors examine the 100 transactions assigned to them and report back to Audrey. The junior auditors summarize their reports as follows:

Junior Auditor 1: I found 1 non-compliant transaction.
Junior Auditor 2: I found 0 non-compliant transactions.
Junior Auditor 3: I found 1 non-compliant transaction.
Junior Auditor 4: I found 2 non-compliant transactions.
This is a demonstration of sampling risk. A visualization of the samples examined by each of the junior auditors is shown below:

[Figure: samples examined by each of the four junior auditors]

Even though File 1 has a 1% rate of non-compliance, the observed rate of non-compliance in a randomly selected sample will not necessarily be equal to 1%. The chance of finding different numbers of non-compliant transactions in a sample of 100 from File 1 is summarized below:

# of Non-Compliant Transactions    Probability in Sample of 100
0                                  36.6%
1                                  37.0%
2                                  18.5%
3 or more                          7.9%
Statisticians view the table above as a probability distribution. The probability distribution of finding any number of non-compliant transactions in a sample of 100 from a file with a 1% non-compliance rate is shown in the graph below:

[Figure: probability distribution of the number of non-compliant transactions in a sample of 100, 1% non-compliance rate]

The graph shows that if we take a sample of 100 transactions from a file with a 1% non-compliance rate, we will most likely find 0, 1, 2, or 3 non-compliant transactions. It is unlikely that we will find 4 or more non-compliant transactions. These distributions are used in designing robust methodologies for performing audits.
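The probabilities quoted for File 1 can be reproduced with a short calculation. The sketch below assumes a binomial model, in which each sampled transaction is independently non-compliant with probability 1%. This is an approximation to drawing without replacement from a file of 10,000 transactions, although the published figures are consistent with it. The function names are illustrative and do not come from any MindBridge product.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k non-compliant items in a sample of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def tail_probability(k_min: int, n: int, p: float) -> float:
    """Probability of k_min or more non-compliant items in a sample of n."""
    return 1.0 - sum(binomial_pmf(k, n, p) for k in range(k_min))


if __name__ == "__main__":
    sample_size, rate = 100, 0.01  # File 1: 1% non-compliance rate
    for k in range(3):
        print(f"{k}: {binomial_pmf(k, sample_size, rate):.1%}")
    print(f"3 or more: {tail_probability(3, sample_size, rate):.1%}")
```

Changing `rate` to 0.05 gives the corresponding distribution for a file with a 5% non-compliance rate.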
The Audit Experiment

The sampling results above were based on a file with a 1% non-compliance rate. Suppose we have a second file (File 2) which has a non-compliance rate of 5%. The difference between the two files is shown in the visual representation below:

[Figure: visual comparison of File 1 (1% non-compliance) and File 2 (5% non-compliance)]

If we take a sample of 100 transactions from each file, it is very likely that we will find more non-compliant transactions in the file with a 5% non-compliance rate than in the file with a 1% non-compliance rate. The likelihood of finding any number of non-compliant transactions in a sample of 100 from File 2 is summarized below:

# of Non-Compliant Transactions    Probability in Sample of 100
Less than 3                        11.8%
3                                  14.0%
4                                  17.8%
5                                  18.0%
6                                  15.0%
7                                  10.6%
More than 7                        12.7%
The probability distribution of finding any number of non-compliant transactions in each of the files with a sample of 100 transactions is shown below:

[Figure: probability distributions for samples of 100 from files with 1% and 5% non-compliance rates]

Audrey, the senior auditor, receives 3 files. Each of the files has a non-compliance rate of either 1% or 5%. She does not know the non-compliance rate of any given file. Audrey assigns each of the files to a junior auditor. She asks them to examine 100 transactions from the file that is assigned to them. These transactions are chosen at random. The three junior auditors examine the 100 transactions assigned to them and report back to Audrey. The junior auditors summarize their reports as follows:

Junior Auditor 1: I found 0 non-compliant transactions. It is unlikely that a file with a 5% non-compliance rate would give this result. Therefore this sample most likely comes from a file with a 1% non-compliance rate.
Junior Auditor 2: I found 5 non-compliant transactions. It is unlikely that a file with a 1% non-compliance rate would give this result. Therefore this sample most likely comes from a file with a 5% non-compliance rate.
Junior Auditor 3: I found 2 non-compliant transactions. I can't rule out either non-compliance rate, as there is a substantial chance I could have gotten this result from either of the files.

Junior Auditor 1 and Junior Auditor 2 were able to reach conclusions that, although not absolutely certain, are backed by strong evidence that provides reasonable assurance of their correctness. The amount of sampling risk associated with these conclusions is determined by the overlap between the two distributions.

The level of overlap between the two distributions also determines how likely it is that a scenario like the one encountered by Junior Auditor 3 occurs. He was not able to reach a conclusion because his result of 2 non-compliant transactions lies in the region of overlap between the two distributions.
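The reasoning of the three junior auditors can be made concrete by computing how probable each observed count is under each candidate non-compliance rate. The sketch below again assumes a binomial model for a sample of 100. The function `compare_hypotheses` is a hypothetical helper written for this illustration, and no decision threshold is implied by the paper.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k non-compliant items in a sample of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def compare_hypotheses(found: int, n: int = 100,
                       low_rate: float = 0.01, high_rate: float = 0.05):
    """Return the probability of the observed count under each rate."""
    return binomial_pmf(found, n, low_rate), binomial_pmf(found, n, high_rate)


if __name__ == "__main__":
    # Counts reported by Junior Auditors 1, 2 and 3.
    for found in (0, 5, 2):
        p_low, p_high = compare_hypotheses(found)
        print(f"found {found}: P(1% file) = {p_low:.3f}, "
              f"P(5% file) = {p_high:.3f}")
```

For counts of 0 and 5, one of the two probabilities is below 1%, which is why the first two auditors can reach a conclusion. For a count of 2, both probabilities are substantial, which is the overlap that leaves Junior Auditor 3 undecided.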
In order to decrease the amount of overlap in the distributions, the auditor can increase the sample size. The probability distribution of finding any number of non-compliant transactions in each of the files with a sample of 500 transactions is shown below:

[Figure: probability distributions for samples of 500 from files with 1% and 5% non-compliance rates]

With the smaller level of overlap that occurs when a sample of 500 transactions is used, Audrey's team will be able to determine whether a file has a 1% or 5% non-compliance rate with a high degree of confidence. They will also be able to make this determination for almost every file that they examine. For most audits, however, it is not feasible to examine 500 transactions, so auditors are limited in the level of assurance that they can offer their clients.

The audit, like a scientific experiment, is based on exactly this type of hypothesis testing. The auditor sets a Tolerable Deviation Rate (TDR). This is the highest rate of non-compliance that a client is willing to accept. They also set a Confidence Interval (CI) that will be used to provide reasonable assurance, e.g., 90%. Using the TDR, the CI, and historic values for the non-compliance rate (NCR), they are able to determine the number of samples required to make a determination on a given file.
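One common way to turn a TDR and a confidence level into a sample size is sketched below. It is not necessarily the procedure the authors have in mind. It finds the smallest sample for which, if the true rate were equal to the TDR, the chance of seeing no more than an allowed number of deviations would be at most one minus the confidence level. The binomial model and the `allowed_deviations` parameter are assumptions of this sketch, with the latter standing in for what the historic NCR would lead the auditor to expect.

```python
from math import comb


def prob_at_most(k: int, n: int, p: float) -> float:
    """Probability of k or fewer deviations in a sample of n at rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def required_sample_size(tdr: float, confidence: float,
                         allowed_deviations: int = 0,
                         max_n: int = 10_000) -> int:
    """Smallest n such that a file at the TDR would exceed the allowed
    number of deviations with probability of at least `confidence`."""
    risk = 1.0 - confidence
    for n in range(allowed_deviations + 1, max_n + 1):
        if prob_at_most(allowed_deviations, n, tdr) <= risk:
            return n
    raise ValueError("no sample size up to max_n meets the requirement")


if __name__ == "__main__":
    print(required_sample_size(tdr=0.05, confidence=0.90))
    print(required_sample_size(tdr=0.05, confidence=0.90,
                               allowed_deviations=1))
```

With a TDR of 5% and 90% confidence, this gives 45 items when no deviations are tolerated in the sample and 77 items when one is tolerated, which illustrates how a higher expected non-compliance rate drives the sample size up.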
Enriched Samples

Sampling risk is accentuated when the entries of interest are rare, such as transactions containing non-compliance in financial data. We saw that sampling risk can be decreased by increasing the sample size. Another method for decreasing sampling risk is to focus on transactions which are more likely to contain non-compliance. A sample from a subset of transactions that are more likely to contain non-compliance is called an enriched sample. By focusing on a subset of the data that contains a higher rate of non-compliance, we decrease sampling risk. There are two methods commonly used by auditors to achieve this goal.

Judgment Sampling:
Patricia, a partner in the audit firm, receives a file from a client. She has expert domain knowledge regarding the client and the industry in which the client conducts business. She is aware that most non-compliance occurs in certain accounts and wants to focus on those. Using her expertise, she is able to focus on the half of the file where non-compliance is likely to be found. The sample from this high risk half of the file is now an enriched sample.

The use of expert knowledge to determine high risk portions of the file introduces its own risk. The auditor may reach a different conclusion by excluding some transactions from the audit than they would have if they had examined all of the data. Since it is not feasible to examine all of the data, it is reasonable to use the senior auditor's expertise and domain knowledge to plan the audit.

Monetary Unit Sampling:
Another common method is Monetary Unit Sampling (MUS), also known as Probability Proportional to Size sampling. Unlike simple random sampling (in which every transaction has an equal chance of being picked for the sample), every dollar (or other appropriate monetary unit) is regarded as distinct and is given an equal chance of being in a transaction that is picked for the sample. As a result, transactions with large amounts have a higher probability of being picked than transactions with small amounts; in fact, transactions are selected from the population in proportion to their size.

The logic behind MUS is clear and general: it is more likely that non-compliance will be found in transactions with high monetary values than in transactions with low monetary values. By incorporating information about the size of each transaction, we get a sample set which is enriched with transactions of interest.
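The idea of giving every dollar an equal chance of selection can be sketched as follows. This illustration uses fixed-interval (systematic) selection with a random start, which is one common MUS variant and is not necessarily the one the authors have in mind. The function name, the assumption that all amounts are positive, and the example amounts are all hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def monetary_unit_sample(amounts, sample_size, seed=None):
    """Return indices of transactions hit by evenly spaced monetary units.

    Assumes every amount is positive. A transaction larger than the
    sampling interval is always selected, and is reported only once.
    """
    rng = random.Random(seed)
    cumulative = list(accumulate(amounts))   # running total per transaction
    interval = cumulative[-1] / sample_size  # monetary units between picks
    start = rng.random() * interval          # random start in first interval

    selected = []
    for i in range(sample_size):
        unit = start + i * interval
        # First transaction whose running total exceeds the chosen unit.
        index = min(bisect_right(cumulative, unit), len(amounts) - 1)
        if index not in selected:
            selected.append(index)
    return selected


if __name__ == "__main__":
    amounts = [120.0, 35.5, 9800.0, 410.0, 60.0, 2500.0, 15.0, 780.0]
    print(monetary_unit_sample(amounts, sample_size=3, seed=1))
```

Because selection is driven by cumulative monetary value, a transaction's chance of inclusion grows with its amount, which is how MUS enriches the sample with high-value items.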
The MindBridge Audit

The fundamental problem addressed by sampling is that it is not feasible to apply auditor judgment and expertise to the entire set of data. Sampling provides an objective basis for the claim that audit findings have reasonable assurance, but this assurance comes at the cost of restricting the application of the auditor's judgment and expertise to just the inputs of the sampling procedure.

In the ideal situation, the auditor would examine every transaction; she would come to understand which sorts of transactions represent ordinary business and which sorts of transactions are unusual and deserve closer attention. Although this is not feasible for an auditor, artificial intelligence can be applied to assess every transaction and determine which transactions are most anomalous and in need of auditor attention.

Using MindBridge, every transaction is analyzed along multiple dimensions, both globally (i.e., relative to all transactions) and locally (i.e., relative to just the transactions most similar to it). Domain expert knowledge is also incorporated into the analysis by identifying monetary flows between account classes that have been categorized as risky by domain experts. These flows either involve high-importance accounts or are not a part of common business processes. MindBridge also applies ordinary rules-based analytics available in standard CAAT packages. All of the information from these procedures is summarized in an overall risk score that helps the auditor focus attention on the most anomalous transactions.

The results of this comprehensive analysis can be used by the auditor at multiple stages of the audit process. The auditor can use the analysis provided by MindBridge to form an enriched sample set and to identify non-compliant transactions in the sample set.
The tool can be used in the following ways:

1) Plan the audit using summary visualizations
   - A view of risk by account
   - A view of risk over time
2) Explore the data using faceted search to focus on subsets of the data
   - All transactions involving an account ranked by risk
   - All transactions involving a user ranked by risk
   - All transactions in a specific period ranked by risk
3) Understand transactions using the Control Points which are triggered
   - Identify the flags triggered by each entry in the transaction
   - Identify other transactions which trigger the same flags
   - Identify other transactions with similar features

The auditor can use MindBridge to design a robust methodology to provide reasonable assurance to their client with a high level of confidence. The MindBridge tool combines the power of domain expert knowledge, rules-based risk analysis, and machine learning algorithms. This helps the auditor to identify and explore sample sets which are enriched with transactions of interest, which in turn increases audit thoroughness and resource efficiency.
About MindBridge Analytics Inc.

MindBridge Ai is a venture-backed FinTech company based in Ottawa, Canada. Through the application of machine learning and artificial intelligence technologies, the MindBridge platform detects anomalous patterns of activities, unintentional errors, and intentional misstatements. Using the MindBridge Ai Auditor, organizations across multiple industries can minimize financial loss, reduce corporate liability, and enhance their professional judgment. For more information, visit www.mindbridge.ai.

If you have any questions about this paper, please reach out to Corey Yanofsky, Data Scientist, at cyanofsky@mindbridge.ai.

Copyright 2018, MindBridge Analytics Inc. All Rights Reserved. The MindBridge word and logo, and the Ai Auditor word are trademarks of MindBridge Analytics Inc.