Model Evaluation & Validation. Wei-Tsong Wang, IIM, NCKU

Overview
  Nature of model validation
  What is validation
  Two aspects of model validation
  Historical review of the discussions on System Dynamics model validation
  Role of statistical tests in model validation
  Some statistical tests for model validation

Nature of Model Validation
  A model cannot be expected to have absolute validity. (Forrester, 1968)
    A model is developed for a purpose.
    A model should be valid for its purpose but may be irrelevant or wrong for other purposes.
  No positive proof is conceptually possible for any theory or model. (Forrester, 1968)
    There is no absolute proof, only a degree of hope and confidence in the validity of a model (validity is a relative concept).
    There are no universal standards for comparing the validity of models constructed for different purposes.

Nature of Model Validation (Cont.)
  Model validation is inherently a social, judgmental, qualitative process: models cannot be proved valid but can be judged so. (Barlas & Carpenter, 1990)
  A model is an imperfect representation of reality and is valid if it serves as a useful tool for decision making. (Barlas & Carpenter, 1990)

Nature of Model Validation (Cont.)
  Some statistical tests can be applied to SD model validation, but others are inappropriate. (Barlas, 1989; Forrester & Senge, 1980)

What is Validation
  Determine whether the conceptual simulation model is... an accurate representation of the system under study. (Kleijnen, 1995)
  Validation is a process of establishing confidence in the soundness and usefulness of a model. (Forrester, 1973; Forrester & Senge, 1980)
  Evaluate the usefulness of the model in a specific social context, or for a particular purpose. (Barlas & Carpenter, 1990; Forrester, 1968; Lane, 1998)

What is Validation (Cont.)
  An operating definition of validation: an on-going mix of activities embedded throughout the iterative model-building process. (Lane, 1998)
  Features of model validation (Barlas, 1989; Barlas, 1996; Forrester & Senge, 1980; Lane, 1998)
    Focus on structure
    Focus on behavior

The system dynamics modeling process
  [Diagram. Labels: Perceptions of System Structure; Reconciliation; Representation of Model Structure; System Conceptualization; Model Formulation; Empirical and Inferred Time Series; Reconciliation; Deduction of Model Behavior]
  Adapted from G. P. Richardson's presentation on Model Validation
  Source: Saeed,

Processes focusing on system structure
  [Diagram. Labels: Mental Models, Experience, Literature; Perceptions of System Structure; Diagramming and Description Tools; Reconciliation; Representation of Model Structure; Empirical Evidence; System Conceptualization; Model Formulation]

Processes focusing on system behavior
  [Diagram. Labels: Empirical Evidence; System Conceptualization; Model Formulation; Empirical and Inferred Time Series; Reconciliation; Deduction of Model Behavior; Literature, Experience; Computing Aids]

Two Kinds of Validating Processes
  [Diagram. Labels: Mental Models, Experience, Literature; Perceptions of System Structure; Diagramming and Description Tools; Reconciliation; Structure Validating Processes; Representation of Model Structure; Empirical Evidence; System Conceptualization; Model Formulation; Behavior Validating Processes; Empirical and Inferred Time Series; Reconciliation; Deduction of Model Behavior; Literature, Experience; Computing Aids]

Forrester and Senge's Propositions on Confidence
  Forrester and Senge's (1980) tests for model validation
    Tests of model structure
    Tests of model behavior
    Tests of policy implication

Tests of Model Structure
  Structure-verification test
    Compare the structure of the model with the structure of the real system.
  Parameter-verification test
    Do the parameters in the model correspond conceptually and numerically to the real system?
  Extreme-conditions test
    A model should allow extreme conditions of the levels in the system to be represented.
  Boundary-adequacy (structure) test
    Does the structure satisfy the model's purpose? Is the level of aggregation proper? Does it include all relevant structure?
  Dimensional-consistency test
    No unit errors.
  Other tests

Tests of Model Behavior
  Behavior-reproduction test
    How well does model-generated behavior match the observed behavior of the real system?
    A number of tests are available for this purpose, such as the symptom-generation test and the multi-mode test.
  Behavior-prediction test
    Pattern prediction test: does the model generate qualitatively correct patterns of the future?
    Event prediction test: does the model generate qualitatively correct patterns in response to a particular change in circumstances?
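As an illustration of the extreme-conditions test listed above, here is a minimal sketch in Python. The one-stock inventory model, its function names, and its parameter values are hypothetical and are not taken from Forrester and Senge (1980); the point is only that a formulation should stay plausible when a level is pushed to an extreme such as an empty stock.

```python
def simulate_inventory(initial_inventory, production, desired_shipments,
                       steps=50, dt=0.25):
    """Euler-integrate a hypothetical one-stock inventory model."""
    inventory = initial_inventory
    path = [inventory]
    for _ in range(steps):
        # Shipments cannot exceed what the stock can supply within one step.
        max_shipments = inventory / dt
        shipments = min(desired_shipments, max_shipments)
        inventory += dt * (production - shipments)
        path.append(inventory)
    return path


def passes_extreme_conditions():
    """Check the model under two extreme conditions of the inventory level."""
    # Extreme 1: empty stock and no production, so nothing can be shipped.
    empty = simulate_inventory(0.0, production=0.0, desired_shipments=100.0)
    # Extreme 2: no production and huge demand; the stock may drain
    # but must never become negative.
    drained = simulate_inventory(10.0, production=0.0, desired_shipments=1e6)
    return all(x == 0.0 for x in empty) and all(x >= 0.0 for x in drained)


print(passes_extreme_conditions())
```

A formulation that shipped `desired_shipments` regardless of the stock would fail the second check, which is the kind of structural flaw this test is meant to expose.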

Tests of Model Behavior (Cont.)
  Behavior-anomaly test
    Do anomalous features of model behavior exist that conflict with the behavior of the real system?
  Family-member test
    A model usually represents a family of social systems.
    With proper changes to parameters and tables, the model should be able to represent a particular member of the family.
  Surprise-behavior test
    Is the unexpected behavior generated by the model the same as that in the real system?
  Extreme-policy test
    Does the model behave like the real system when the same extreme policy is applied?
  Boundary-adequacy (behavior) test
    Does the model include all the structures that are necessary to address the issues for which the model is designed?
  Behavior-sensitivity test
    Do plausible shifts in model parameters cause the model to fail to generate observed patterns of behavior, or to behave implausibly?
  Other tests

Tests of Policy Implication
  System-improvement test
    Does the model help us find policies that are beneficial?
  Changed-behavior-prediction test
    Does the model accurately predict the behavioral change of the system corresponding to a change in the original policy?
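The behavior-sensitivity test above can be sketched as a small parameter sweep. Everything in this example is an assumption made for illustration: the goal-seeking model, the plausible range of the adjustment time, and the pattern check (a monotonic rise that ends near the goal) stand in for whatever model and observed pattern a real study would use.

```python
def simulate_goal_seeking(adjustment_time, goal=100.0, initial=20.0,
                          steps=200, dt=0.25):
    """Euler-integrate d(stock)/dt = (goal - stock) / adjustment_time."""
    stock = initial
    path = [stock]
    for _ in range(steps):
        stock += dt * (goal - stock) / adjustment_time
        path.append(stock)
    return path


def shows_goal_seeking(path, goal=100.0, tolerance=1.0):
    """Pattern check: the path never falls and ends near the goal."""
    rising = all(b >= a for a, b in zip(path, path[1:]))
    return rising and abs(path[-1] - goal) <= tolerance


def behavior_sensitivity(plausible_values):
    """Map each plausible parameter value to whether the pattern survives."""
    return {value: shows_goal_seeking(simulate_goal_seeking(value))
            for value in plausible_values}


# Hypothetical plausible range for the adjustment time.
print(behavior_sensitivity([2.0, 4.0, 8.0]))
```

If any plausible value maps to `False`, the model's ability to reproduce the observed pattern depends on a parameter choice that the evidence cannot pin down, which is what the test is meant to flag.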

Tests of Policy Implication (Cont.)
  Boundary-adequacy (policy) test
    Does the addition of a new structure dramatically change the policy recommendations?
  Policy-sensitivity test
    To what degree are policy recommendations influenced by uncertainty in parameter values?

Barlas' Framework on Model Validation
  Barlas (1996) proposes that confidence in a model builds on its structure validity and behavior validity.
  However, he also proposes that behavior validity is meaningful only if we have sufficient confidence in the structure.

Barlas' Framework on Model Validation (Cont.)
  Structure validity tests
    Direct structure tests
      Directly compare the model structure with the structure of the real system.
    Structure-oriented behavior tests
  Behavior validity tests
    Assess how accurately a model can reproduce the behavioral patterns that exist in the real system.

Lane's Practical Criteria for Confidence on SD Models
  Lane (1998) discusses building confidence in the generic structures of simulation models.
  Three generic structures
    Canonical Situation Model (fully formulated simulation model)
    Abstracted Micro-structure (stock and flow structure)
    Counter-intuitive System Archetype (causal-loop structure)
  Source: Lane,

Criteria for Confidence on SD Models (Cont.)
  Features of the criteria for confidence
    Each criterion includes concerns about both structure and behavior.
    Each criterion considers representativeness and usefulness.
    Each criterion can be applied to each of the three generic structures.
  Source: Lane,

Criteria for Confidence on SD Models (Cont.)
  Lane's (1998) three criteria for confidence
    Perceived representativeness of models (PRoM)
      Do a model's structure, data, and behavior represent the real system?
    Analytical quality of policy insights (AQ)
      Does a model produce policy insights and recommendations?
    Process effectiveness of the intervention (PEI)
      Does a model satisfy the target clients?
  Source: Lane,

Summary of Model Validation Tests
  [Table. Headings: Testing SUITABILITY for PURPOSE; Testing CONSISTENCY with REALITY; Contributing to UTILITY & EFFECTIVENESS; Focusing on STRUCTURE; Focusing on BEHAVIOR]
  Summarized by G. P. Richardson
  Sources: Forrester, 1973; Forrester & Senge, 1980; Richardson & Pugh,

Role of Statistical Tests in Model Validation
  General problems with using statistical tests for SD model validation (Barlas, 1996):
  Technical problem #1: Most statistical tests assume that the data are:
    Serially independent (not autocorrelated)
    Not cross-correlated
    Normally distributed
    However, data in an SD model are autocorrelated and cross-correlated by nature.
  Technical problem #2: In SD models there is no single output variable on which to focus in validity testing (the multiple-hypothesis problem may occur when using statistical hypothesis testing).
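Technical problem #1 can be made concrete with a small numerical sketch. The example below is an assumption-laden illustration, not a procedure from Barlas (1996): it feeds serially independent noise into a hypothetical goal-seeking stock and compares the lag-1 autocorrelation of the noise with that of the stock. Because the stock accumulates its inflows, its output is serially dependent even though its input is not.

```python
import random


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum((series[i] - mean) * (series[i + 1] - mean)
                    for i in range(n - 1))
    return numerator / denominator


def simulate_stock(noise, adjustment_time=10.0, goal=100.0, dt=1.0):
    """Hypothetical stock that seeks a goal while absorbing a random inflow."""
    stock = goal
    path = []
    for shock in noise:
        stock += dt * ((goal - stock) / adjustment_time + shock)
        path.append(stock)
    return path


rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]

print(lag1_autocorrelation(noise))                  # expected to be near zero
print(lag1_autocorrelation(simulate_stock(noise)))  # expected to be strongly positive
```

A test that assumes independent observations would treat the 2000 stock values as 2000 independent pieces of evidence, which overstates the information they actually carry.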

Role of Statistical Tests in Model Validation (Cont.)
  Philosophical problem: The cost of rejecting a true null hypothesis or accepting a false one is difficult to determine. What would be our tolerable level of significance?
  Alternative: Report the obtained validity statistics, but do no binary significance testing (no "reject" or "fail to reject" decision).

Role of Statistical Tests in Model Validation (Cont.)
  From the perspective of model structure and behavior (Forrester & Senge, 1980)
  For model structure:
    Traditional statistical tests are not sufficient for testing the causal hypotheses in a system dynamics model, but might be useful for discovering flaws.

Role of Statistical Tests in Model Validation (Cont.)
  For model behavior:
    The Kalman filter might be useful (Peterson, 1979, quoted in Forrester & Senge, 1980).
    It aims to compare model behavior to data, instead of comparing model structure to data as the traditional statistical tests do.
    It might be more appropriate for SD since it extracts measurement errors when testing causal hypotheses in SD models.

Some Statistical Tests for Model Validation
  Barlas (1989) proposes a series of tests (in sequential order) for evaluating model behavior:
    Step 1: Trend comparison and removal
    Step 2: Comparing the periods
    Step 3: Comparing the means
    Step 4: Comparing the variations
    Step 5: Testing for phase lag
    Step 6: Discrepancy coefficient U as a summary measure
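Steps 3 to 6 above can be sketched numerically as follows. The formulas are assumptions based on a common reading of these measures (relative error in means, relative error in standard deviations, the lag that maximizes the cross-covariance, and U as the spread of mean-centered errors divided by the sum of the two standard deviations); consult Barlas (1989) for the exact definitions. Steps 1 and 2 are omitted, and the two series are hypothetical. In line with the alternative noted earlier, the statistics are reported without any accept or reject decision.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std(values):
    """Population standard deviation."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def percent_error_in_means(simulated, actual):
    """Step 3: relative gap between the two means."""
    return abs(mean(simulated) - mean(actual)) / abs(mean(actual))


def percent_error_in_variations(simulated, actual):
    """Step 4: relative gap between the two standard deviations."""
    return abs(std(simulated) - std(actual)) / std(actual)


def phase_lag(simulated, actual, max_lag=8):
    """Step 5: lag (in samples) with the largest cross-covariance.

    A positive result means the simulated series trails the actual one.
    """
    ms, ma = mean(simulated), mean(actual)
    n = len(actual)

    def cross_covariance(lag):
        products = [(actual[i] - ma) * (simulated[i + lag] - ms)
                    for i in range(n) if 0 <= i + lag < n]
        return sum(products) / len(products)

    return max(range(-max_lag, max_lag + 1), key=cross_covariance)


def discrepancy_coefficient(simulated, actual):
    """Step 6: U = s_E / (s_S + s_A), using mean-centered errors."""
    ms, ma = mean(simulated), mean(actual)
    errors = [(s - ms) - (a - ma) for s, a in zip(simulated, actual)]
    s_e = math.sqrt(sum(e ** 2 for e in errors) / len(errors))
    return s_e / (std(simulated) + std(actual))


# Hypothetical data: 20 full cycles; the model output is shifted by 5 samples.
times = range(400)
actual = [100 + 10 * math.sin(2 * math.pi * t / 20) for t in times]
simulated = [102 + 12 * math.sin(2 * math.pi * (t - 5) / 20) for t in times]

print(percent_error_in_means(simulated, actual))
print(percent_error_in_variations(simulated, actual))
print(phase_lag(simulated, actual))
print(discrepancy_coefficient(simulated, actual))
```

Under this formulation U is 0 when the two centered series coincide and grows toward 1 as they diverge, so it serves as a single summary figure after the individual comparisons have been examined.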

Some Statistical Tests for Model Validation (Cont.)
  Kleijnen (1995) proposes that some simple tests are available for comparing the output of an SD model with historical output of the real system, such as:
    Eyeball time paths
    Schruben-Turing test
    Traditional statistical tests

Discussion
  What do you think about model validation now?
  What can we do about model validation in general?

Validation, as an integrated social process, is present at every step
  Conceptualizing: Do we have the right people? The right dynamic problem definition? The right level of aggregation?
  Mapping: Developing promising dynamic hypotheses
  Formulating: Clarity, logic, and extremes
  Simulating: Right behavior for right reasons
  Deciding: Implementable conclusions
  Implementing: Requires conviction!
  Source: G. P. Richardson's presentation on Model Validation

Reference
  Barlas, Y. (1989). Multiple tests for validation of system dynamics type of simulation models. European Journal of Operational Research, 42(1).
  Barlas, Y. (1996). Formal aspects of model validity and validation in system dynamics. System Dynamics Review, 12(3).
  Barlas, Y., & Carpenter, S. (1990). Philosophical roots of model validation: two paradigms. System Dynamics Review, 6(2).
  Forrester, J. W. (1968). Industrial Dynamics: A Response to Ansoff and Slevin. Management Science, 14(9).

Reference (Cont.)
  Forrester, J. W. (1973). Confidence in Models of Social Behavior With Emphasis on System Dynamics Models. M.I.T. System Dynamics Group.
  Forrester, J. W., & Senge, P. M. (1980). Tests for building confidence in system dynamics models. In A. A. Legasto, Jr., J. W. Forrester, & J. M. Lyneis (Eds.), System Dynamics: TIMS Studies in the Management Science, 14. New York: North-Holland.
  Kleijnen, J. P. C. (1995). Verification and validation of simulation models. European Journal of Operational Research, 82(1).
  Lane, D. C. (1998). Can we have confidence in generic structures? The Journal of the Operational Research Society, 49(9).
  Peterson, D. W. (1979). Statistical tools for system dynamics. In J. Randers (Ed.), Elements of the System Dynamics Method. Cambridge, MA: MIT Press.
  Richardson, G. P., & Pugh, A. L. (1981). Introduction to System Dynamics Modeling with DYNAMO. Cambridge, MA: Productivity Press. Reprinted by Pegasus Communications.
  Saeed, K. (1992). Slicing a complex problem for systems dynamics modeling. System Dynamics Review, 8(3).