Chapter 7: Process Analysis


2 Contents
1 Introduction
2 Process analysis and simulation
3 Process performance management and warehousing
4 Process mining
5 Process compliance
6 Summary & outlook
References

3 1 Introduction
Let us recall the different analysis perspectives.
[Figure: analysis perspectives; 2015 Springer-Verlag Berlin Heidelberg]

4 1 Introduction
Process analysis helps to understand the operational perspective based on process models or event data.
[Figure: "Taking a fresh approach to data science": Process Science]

5 1 Introduction
Goals: The aim of this chapter is to convey process analysis methods and techniques, in particular:
- Understanding the difference between static and dynamic process analysis
- Introducing the concept of process warehousing
- Introducing the basic idea of process mining
- Presenting selected process mining algorithms
- Pointing to further process mining scenarios
- Summarizing the key challenges of business process compliance

7 1 Introduction
On top of the static/dynamic distinction, analysis questions can be divided into qualitative and quantitative ones. Examples:
Qualitative:
- Static: Which activities are not assigned a role?
- Dynamic: Given a set of process instance executions, what does the underlying process model look like?
Quantitative:
- Static: What is the average duration of process task X based on simulation data?
- Dynamic: How many running instances currently exceed the imposed deadline?

9 2 Process analysis and simulation
Several analysis tasks can be performed directly on the process model, without any simulation, for example, estimating process costs.

10 2 Process analysis and simulation
Dynamic analysis at design time is based on simulation data, i.e., artificial runs of process instances based on a given process model. The goal is to predict their behavior.
Several inputs can typically be specified, e.g.:
- Number of simulated instances
- Probabilities at alternative branchings
- Capacities of resources
- Costs and processing times for tasks (fixed or based on a probability distribution)
- Strategies for resolving bottlenecks
The expected output comprises:
- Overall process duration
- Processing time/costs for certain tasks
- Number of possible process executions in a given time frame
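
The idea of simulating artificial process runs can be illustrated with a minimal Monte Carlo sketch. Everything in it is an assumption made for illustration: the three-task process (register, optional manual check, ship), the branching probability of 0.3, and the processing-time distributions are hypothetical and do not come from the slides. Resource capacities and bottleneck strategies are deliberately left out.

```python
import random
import statistics

def simulate_instance(rng):
    """Simulate one artificial run: register -> (optional manual check) -> ship."""
    duration = rng.expovariate(1 / 5.0)      # "register": exponential, mean 5 time units
    if rng.random() < 0.3:                   # alternative branch taken with probability 0.3
        duration += rng.uniform(10.0, 20.0)  # "manual check": uniform between 10 and 20
    duration += 8.0                          # "ship": fixed processing time
    return duration

def simulate(n_instances, seed=42):
    """Run n_instances independent simulated instances with a fixed seed."""
    rng = random.Random(seed)
    return [simulate_instance(rng) for _ in range(n_instances)]

durations = simulate(10_000)
mean_duration = statistics.mean(durations)
share_over_30 = sum(d > 30.0 for d in durations) / len(durations)

print(f"mean overall duration: {mean_duration:.2f}")
print(f"share of instances longer than 30: {share_over_30:.3f}")
```

With these assumed parameters the expected overall duration is 5 + 0.3 * 15 + 8 = 17.5 time units, so the simulated mean should land close to that value.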

12 2 Process analysis and simulation
Tools: Signavio, BOC, ARIS, Bonapart, CPN Tools

13 2 Process analysis and simulation
Process optimization
- Can be based on insights from static and dynamic analysis. Example: eliminate identified bottlenecks
- Optimization dimensions [1]: cost, time, quality, flexibility (partly conflicting)
- Structural optimizations include, e.g., eliminating, combining, or reordering activities
- Organizational optimizations include, e.g., empowering users (specialist/generalist)
- There is no automatism, only best practices
- It is important to analyze the process again after optimizations
[1] Selma Limam Mansar, Hajo A. Reijers: Best practices in business process redesign: use and impact. Business Proc. Manag. Journal 13(2) (2007)

15 3 Process performance management and warehousing
Now real process execution data is analyzed (not simulated).
First level: process monitoring
- Observe process behavior based on events emitted by the process instances during runtime
- Predictive monitoring aims at forecasting potential problems during process execution before they occur [2]
- Definition and monitoring of Key Performance Indicators such as throughput time
Tools: IBM Business Modeler Advanced, ARIS Process Performance Manager
[2] Andreas Metzger, Philipp Leitner, Dragan Ivanovic, Eric Schmieders, Rod Franklin, Manuel Carro, Schahram Dustdar, Klaus Pohl: Comparing and Combining Predictive Business Process Monitoring Techniques. IEEE Trans. Systems, Man, and Cybernetics: Systems 45(2) (2015)

16 3 Process performance management and warehousing
Now real process execution data is analyzed (not simulated).
Second level: process warehousing
- Offline analysis of process execution data
- Data is structured in a multi-dimensional way (cf. Chapter 3)

17 3 Process performance management and warehousing
[Figure: multi-dimensional schema of the process warehouse; 2015 Springer-Verlag Berlin Heidelberg]
- Dimension instance: InstanceID, InstanceState
- Dimension processtype: Process, ProcessID, ProcessName, ProcessVersion
- Dimension time: Day, Week, Month, Quarter, Year
- Dimension activitytype: Activity, ActDefID, ActivityName, ActivityType
- Dimension organization: Actor, Role, RoleType, OrgUnit, OrgUnitType
- Measures: Duration, Cost, Quality

18 3 Process performance management and warehousing
- Application of OLAP operations on the process cube
- Application of cross-sectional analysis methods (cf. Chapter 5)
- Example (decision tree): the activities with a duration above 40 are performed by Actor XY
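
As a rough illustration of OLAP operations on a process cube, the following sketch implements a roll-up (aggregation along one dimension) and a slice (fixing one dimension member) over a handful of fact rows. The rows, the chosen dimensions, and the helper names `roll_up` and `slice_` are hypothetical; a real process warehouse would use a database or an OLAP engine instead of Python lists.

```python
from collections import defaultdict

# Hypothetical fact rows: one row per executed activity.
facts = [
    {"actor": "XY", "role": "clerk",   "activity": "check",   "month": "2015-01", "duration": 45},
    {"actor": "XY", "role": "clerk",   "activity": "check",   "month": "2015-02", "duration": 50},
    {"actor": "AB", "role": "clerk",   "activity": "check",   "month": "2015-01", "duration": 20},
    {"actor": "AB", "role": "manager", "activity": "approve", "month": "2015-02", "duration": 10},
]

def roll_up(rows, dimension, measure):
    """Aggregate a measure to the level of one dimension (average per member)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {member: sum(values) / len(values) for member, values in groups.items()}

def slice_(rows, dimension, member):
    """Keep only the rows in which the dimension takes the given member."""
    return [row for row in rows if row[dimension] == member]

avg_by_actor = roll_up(facts, "actor", "duration")
avg_check_by_month = roll_up(slice_(facts, "activity", "check"), "month", "duration")

print(avg_by_actor)        # {'XY': 47.5, 'AB': 15.0}
print(avg_check_by_month)  # {'2015-01': 32.5, '2015-02': 50.0}
```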

20 4 Process mining
Process mining has three key tasks [3]:
- Process discovery
- Process conformance
- Process enhancement
[3] Wil M. P. van der Aalst: Process Mining - Data Science in Action, Second Edition. Springer 2016

21 4 Process mining
In particular, exploring ("finding") process models is often a cumbersome and error-prone task. Are there alternatives?
- Observation: Processes are often executed implicitly (possibly distributed over different systems)
- Prerequisite: Log data of the processes is available
- Process/workflow mining offers techniques to automatically derive process/workflow models from such log data
[Figure: process discovery from log data; the example model contains the activities Register order, Prepare shipment, (Re)send bill, Ship goods, Archive order, Receive payment, and Contact customer]

22 4 Process mining
Different process mining algorithms focus on discovering the control flow, i.e., the tasks and their relations:
- Alpha Miner [4]
- Heuristic Miner [5]
- Genetic Miner [6]
[4] Ana Karla A. de Medeiros, Wil M. P. van der Aalst, A. J. M. M. Weijters: Workflow Mining: Current Status and Future Directions. CoopIS/DOA/ODBASE 2003
[5] A. J. M. M. Weijters, J. T. S. Ribeiro: Flexible Heuristics Miner (FHM). CIDM 2011
[6] Wil M. P. van der Aalst, Ana Karla A. de Medeiros, A. J. M. M. Weijters: Genetic Process Mining. ICATPN 2005

23 4 Process mining
[Figure: Alpha Miner]

24 4 Process mining
EXERCISE: For the following log, apply the α-algorithm and derive the corresponding Petri net:
- Case1: <A, C, D, E, F, G>
- Case2: <A, B, D, E, F, G>
- Case3: <A, C, D, F, E, G>
- Case4: <A, B, D, F, E, G>
- Case5: <A, B, D, E, F, G>

25 4 Process mining
Result: [Figure: resulting Petri net]
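
The first step of the α-algorithm, deriving the ordering relations from the exercise log, can be sketched as follows. This is only the relation-building step under the usual definitions (directly-follows, causal, parallel, unrelated), not a full α-algorithm implementation; the variable names are chosen for illustration.

```python
from itertools import product

# The five cases of the exercise log.
log = [
    ["A", "C", "D", "E", "F", "G"],
    ["A", "B", "D", "E", "F", "G"],
    ["A", "C", "D", "F", "E", "G"],
    ["A", "B", "D", "F", "E", "G"],
    ["A", "B", "D", "E", "F", "G"],
]

tasks = sorted({t for trace in log for t in trace})

# x > y: x is directly followed by y in at least one trace.
follows = {(x, y) for trace in log for x, y in zip(trace, trace[1:])}

# x -> y (causal): x > y holds but y > x does not.
causal = {(x, y) for (x, y) in follows if (y, x) not in follows}

# x || y (parallel): both x > y and y > x hold.
parallel = {(x, y) for (x, y) in follows if (y, x) in follows}

# x # y (unrelated): neither x > y nor y > x holds.
unrelated = {(x, y) for x, y in product(tasks, repeat=2)
             if (x, y) not in follows and (y, x) not in follows}

print(sorted(causal))
# [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D'), ('D', 'E'), ('D', 'F'), ('E', 'G'), ('F', 'G')]
print(sorted(parallel))
# [('E', 'F'), ('F', 'E')]
```

Read together, these relations suggest a choice between B and C after A (B and C are unrelated), followed by D, after which E and F run in parallel before G.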

26 4 Process mining
The Alpha algorithm might lead to complex models.
[Figure: Alpha Miner applied to Higher Education data; result produced using ProM 5.2]

27 4 Process mining
Heuristics Miner [6]
1. Read a log
2. Get the set of tasks
3. Infer the ordering relations based on their frequencies
4. Build the net based on the inferred relations
5. Output the net
[6] A. J. M. M. Weijters, W. M. P. van der Aalst, A. K. A. de Medeiros: Process mining with the HeuristicsMiner algorithm. Technische Universiteit Eindhoven.

28 4 Process mining
Heuristics Miner [6]: Let $W$ be an event log over $T$, and $a, b \in T$:
- $|a >_W b|$ is the number of times $a >_W b$ occurs in $W$,
- $a \Rightarrow_W b = \dfrac{|a >_W b| - |b >_W a|}{|a >_W b| + |b >_W a| + 1}$
Insight: The more frequently a task A directly follows another task B, and the less frequently the opposite occurs, the higher the probability that A causally follows B!
[6] A. J. M. M. Weijters, W. M. P. van der Aalst, A. K. A. de Medeiros: Process mining with the HeuristicsMiner algorithm. Technische Universiteit Eindhoven.
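
The dependency measure can be computed directly from directly-follows counts. The sketch below applies it to the five-case log from the α-algorithm exercise; the function name `dependency` is chosen for illustration and is not taken from any tool.

```python
from collections import Counter

log = [
    ["A", "C", "D", "E", "F", "G"],
    ["A", "B", "D", "E", "F", "G"],
    ["A", "C", "D", "F", "E", "G"],
    ["A", "B", "D", "F", "E", "G"],
    ["A", "B", "D", "E", "F", "G"],
]

# counts[(a, b)] is |a >_W b|: how often a is directly followed by b.
counts = Counter((x, y) for trace in log for x, y in zip(trace, trace[1:]))

def dependency(a, b):
    """Dependency measure a =>_W b, a value strictly between -1 and 1."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)

print(dependency("A", "B"))            # 0.75  (3 times A>B, never B>A)
print(round(dependency("E", "F"), 3))  # 0.167 (3 times E>F, 2 times F>E)
print(round(dependency("F", "E"), 3))  # -0.167
```

A high value for (A, B) points to a causal dependency, while the values close to zero for (E, F) and (F, E) hint at concurrency rather than causality.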

29 4 Process mining
[Figure: Heuristic Miner applied to the example log; result produced using ProM]

30 4 Process mining
[Figure: Heuristic Miner applied to the HEP log; result produced using ProM 5.2]

31 4 Process mining
Genetic Miner [7] steps:
1. Read the event log
2. Encoding individuals: causal matrix
3. Creating the initial solution: take all activities in the log and create causal relationships randomly
4. Fitness and selection of individuals: based on conformance (preciseness and completeness)
5. Creating new offspring: mutation and crossover on causal matrices
6. Evaluating the offspring
[7] A. K. A. de Medeiros, A. J. M. M. Weijters, W. M. P. van der Aalst: Genetic process mining: an experimental evaluation. Data Mining and Knowledge Discovery 14 (2007)

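32 4 Process mining
Genetic Miner, Step 2: Abstraction from Petri nets. Why? However, no information should be lost!
Representation as causal matrix:
DEFINITION (CAUSAL MATRIX): A causal matrix is a tuple $CM = (A, C, I, O)$, where
- $A$ is a finite set of activities,
- $C \subseteq A \times A$ is the causality relation,
- $I: A \to \mathcal{P}(\mathcal{P}(A))$ is the input condition function,
- $O: A \to \mathcal{P}(\mathcal{P}(A))$ is the output condition function,
such that
- $C = \{(a_1, a_2) \in A \times A \mid a_1 \in \bigcup I(a_2)\}$,
- $C = \{(a_1, a_2) \in A \times A \mid a_2 \in \bigcup O(a_1)\}$,
- $C \cup \{(a_o, a_i) \in A \times A \mid a_o\bullet = \emptyset \wedge \bullet a_i = \emptyset\}$ is a strongly connected graph,
where $a\bullet$ and $\bullet a$ denote the sets of direct successors and direct predecessors of $a$ under $C$.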

34 4 Process mining
Genetic Miner, Step 3: Individuals are causal matrices
- Given a log, all individuals in any population of the genetic algorithm have the same set of activities (or tasks) $A$. This set contains the tasks that appear in the log.
- The causality relation $C$ can be set via a completely random approach or a heuristic one.
- The random approach uses a 50% probability for establishing (or not) a causality relation between two tasks in $A$.
- The heuristic approach uses the information in the log to determine the probability that two tasks are going to have a causality relation set: the more often a task $t_1$ is directly followed by a task $t_2$ (i.e., the subtrace "$t_1, t_2$" appears in traces in the log), the higher the probability that individuals are built with a causality relation from $t_1$ to $t_2$ (i.e., $(t_1, t_2) \in C$) [7].
[7] A. K. A. de Medeiros, A. J. M. M. Weijters, W. M. P. van der Aalst: Genetic process mining: an experimental evaluation. Data Mining and Knowledge Discovery 14 (2007)
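
The two ways of setting the causality relation for an initial individual can be sketched as follows. The random variant follows the 50% rule stated above. The heuristic variant is a deliberate simplification: it assumes that the probability of a relation equals the relative directly-follows frequency, which is an assumption for illustration and not necessarily the exact probability function used by the Genetic Miner.

```python
import random
from collections import Counter
from itertools import product

def random_causality(tasks, rng, p=0.5):
    """Random approach: each ordered pair of tasks is in C with probability p."""
    return {(a, b) for a, b in product(sorted(tasks), repeat=2) if rng.random() < p}

def heuristic_causality(log, rng):
    """Heuristic approach (simplified): frequent subtraces 't1, t2' are more likely in C."""
    counts = Counter((x, y) for trace in log for x, y in zip(trace, trace[1:]))
    top = max(counts.values())
    # Assumed mapping: probability = directly-follows count relative to the maximum count.
    return {pair for pair, n in sorted(counts.items()) if rng.random() < n / top}

log = [
    ["A", "C", "D", "E", "F", "G"],
    ["A", "B", "D", "E", "F", "G"],
    ["A", "C", "D", "F", "E", "G"],
    ["A", "B", "D", "F", "E", "G"],
    ["A", "B", "D", "E", "F", "G"],
]
tasks = {t for trace in log for t in trace}
rng = random.Random(1)

c_random = random_causality(tasks, rng)
c_heuristic = heuristic_causality(log, rng)
print(len(c_random), sorted(c_heuristic))
```

Under this simplification, pairs that never occur as a subtrace are never selected, and the most frequent pairs are always selected.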

35 4 Process mining
Genetic Miner, Step 4: Conformance as fitness function $f$ [8]
$f = \frac{1}{2}\left(1 - \frac{\sum_{i=1}^{k} n_i m_i}{\sum_{i=1}^{k} n_i c_i}\right) + \frac{1}{2}\left(1 - \frac{\sum_{i=1}^{k} n_i r_i}{\sum_{i=1}^{k} n_i p_i}\right)$
where $k$ is the number of distinct traces in the log and, for each distinct trace $i$,
- $n_i$ is the number of process instances that follow trace $i$
- $c_i$ is the number of consumed tokens
- $p_i$ is the number of produced tokens
- $m_i$ is the number of missing tokens
- $r_i$ is the number of remaining tokens
[8] Anne Rozinat, Wil M. P. van der Aalst: Conformance checking of processes based on monitoring real behavior. Inf. Syst. 33(1) (2008)
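
Once the token counts of a log replay are known, evaluating the fitness function is simple arithmetic. The sketch below takes the per-trace counts as given; the two trace variants and their token counts are hypothetical, and the token replay that would produce them is not shown.

```python
def token_fitness(variants):
    """Token-based fitness f from per-variant counts n, m, c, r, p."""
    missing = sum(v["n"] * v["m"] for v in variants)
    consumed = sum(v["n"] * v["c"] for v in variants)
    remaining = sum(v["n"] * v["r"] for v in variants)
    produced = sum(v["n"] * v["p"] for v in variants)
    return 0.5 * (1 - missing / consumed) + 0.5 * (1 - remaining / produced)

# Hypothetical replay result: 3 instances replay perfectly,
# 2 instances have one missing and one remaining token each.
variants = [
    {"n": 3, "m": 0, "c": 7, "r": 0, "p": 7},
    {"n": 2, "m": 1, "c": 7, "r": 1, "p": 7},
]

fitness = token_fitness(variants)
print(round(fitness, 4))  # 0.9429, i.e. 33/35
```

A log that replays without any missing or remaining tokens yields a fitness of exactly 1.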

36 4 Process mining
Genetic Miner, Step 5:
- Elitism: a certain percentage of the best individuals is copied into the next population
- Crossover: two parents produce two offspring
- Selection of parents: tournament
- Mutation: insertion of new material into the current population
  - Randomly choose a subset and add a task ($\in A$) to it
  - Randomly choose a subset and remove a task from it
  - Randomly redistribute the elements in the subsets of I/O into new subsets
Example for mutation: $I(D) = \{\{F, B, E\}, \{E, C\}, \{G\}\}$
- Mutation by adding tasks: $\{\{F, B, E\}, \{E, C\}, \{G, D\}\}$
- Mutation by removing tasks: $\{\{F, B, E\}, \{C\}, \{G\}\}$
- Mutation by redistribution: $\{\{F\}, \{E, C, B\}, \{G\}, \{E\}\}$
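
The three mutation operators can be sketched on the input condition I(D) from the example. The function names and the way subsets are drawn are assumptions for illustration; the sketch assumes a non-empty condition whose subsets are all non-empty, and it drops a subset that becomes empty after a removal.

```python
import random

def mutate_add(condition, tasks, rng):
    """Add a random task from `tasks` to one randomly chosen subset."""
    subsets = [set(s) for s in condition]   # work on a copy
    target = rng.choice(subsets)
    candidates = sorted(tasks - target)
    if candidates:
        target.add(rng.choice(candidates))
    return subsets

def mutate_remove(condition, rng):
    """Remove a random task from one randomly chosen subset."""
    subsets = [set(s) for s in condition]
    target = rng.choice(subsets)
    target.discard(rng.choice(sorted(target)))
    return [s for s in subsets if s]        # drop subsets that became empty

def mutate_redistribute(condition, rng):
    """Randomly redistribute all elements of the subsets into new subsets."""
    elements = [t for s in condition for t in sorted(s)]
    rng.shuffle(elements)
    subsets = []
    for t in elements:
        candidates = [s for s in subsets if t not in s]
        if candidates and rng.random() < 0.5:
            rng.choice(candidates).add(t)   # join an existing subset
        else:
            subsets.append({t})             # or open a new subset
    return subsets

rng = random.Random(7)
tasks = set("ABCDEFG")
i_d = [{"F", "B", "E"}, {"E", "C"}, {"G"}]

added = mutate_add(i_d, tasks, rng)
removed = mutate_remove(i_d, rng)
redistributed = mutate_redistribute(i_d, rng)
print(added, removed, redistributed, sep="\n")
```

Redistribution keeps every occurrence of every task (E appears twice in I(D) and therefore ends up in two different subsets), which matches the slide example.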

37 4 Process mining
Genetic Miner, Step 6: Stop criteria
- The maximum number n of allowed generations is reached
- The fittest individual has not changed for n/2 generations in a row
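
The two stop criteria can be expressed as a small predicate. In this sketch, "the fittest individual has not changed" is approximated by "the best fitness value has not changed", which is an assumption made to keep the example short; the function name `should_stop` is hypothetical.

```python
def should_stop(best_fitness_history, max_generations):
    """best_fitness_history[i] is the best fitness observed in generation i."""
    generations = len(best_fitness_history)
    if generations >= max_generations:      # criterion 1: n generations reached
        return True
    window = max_generations // 2           # criterion 2: no change for n/2 generations
    if window == 0 or generations <= window:
        return False
    recent = best_fitness_history[-(window + 1):]
    return all(f == recent[0] for f in recent)

print(should_stop([0.5, 0.5, 0.5], 10))                           # False
print(should_stop([0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], 10))       # True
print(should_stop([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7], 10))       # False
```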

40 4 Process mining
Conformance checking [8]
Problem: given a process model P and a set of traces T, determine how well P reflects the behavior set out by T.
Based on fitness function $f$:
$f = \frac{1}{2}\left(1 - \frac{\sum_{i=1}^{k} n_i m_i}{\sum_{i=1}^{k} n_i c_i}\right) + \frac{1}{2}\left(1 - \frac{\sum_{i=1}^{k} n_i r_i}{\sum_{i=1}^{k} n_i p_i}\right)$
where $k$ is the number of distinct traces in the log and, for each distinct trace $i$,
- $n_i$ is the number of process instances that follow trace $i$
- $c_i$ is the number of consumed tokens
- $p_i$ is the number of produced tokens
- $m_i$ is the number of missing tokens
- $r_i$ is the number of remaining tokens
[8] Anne Rozinat, Wil M. P. van der Aalst: Conformance checking of processes based on monitoring real behavior. Inf. Syst. 33(1) (2008)
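
A small worked example may help to read the formula. The numbers are hypothetical: assume $k = 2$ distinct traces, where trace 1 occurs $n_1 = 4$ times and replays perfectly ($m_1 = 0$, $c_1 = 6$, $r_1 = 0$, $p_1 = 6$), and trace 2 occurs $n_2 = 1$ time with one missing and two remaining tokens ($m_2 = 1$, $c_2 = 6$, $r_2 = 2$, $p_2 = 7$).

```latex
\begin{align*}
\sum_{i=1}^{2} n_i m_i &= 4 \cdot 0 + 1 \cdot 1 = 1, &
\sum_{i=1}^{2} n_i c_i &= 4 \cdot 6 + 1 \cdot 6 = 30, \\
\sum_{i=1}^{2} n_i r_i &= 4 \cdot 0 + 1 \cdot 2 = 2, &
\sum_{i=1}^{2} n_i p_i &= 4 \cdot 6 + 1 \cdot 7 = 31, \\
f &= \tfrac{1}{2}\left(1 - \tfrac{1}{30}\right) + \tfrac{1}{2}\left(1 - \tfrac{2}{31}\right) \approx 0.951 .
\end{align*}
```

The single deviating instance lowers the fitness only slightly because the four conforming instances dominate both sums.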

43 5 Process compliance
Problem: given a process model P or a set of traces T, and a set of compliance constraints C: does P or T, respectively, comply with the constraints in C?
- "Comply" means that no constraint in C is violated
- Basic distinction into design-time and runtime compliance checking (see next slide)
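
Checking a set of traces T against a set of constraints C can be sketched as follows. The constraint used here, "every shipment is eventually followed by a payment", is a hypothetical example of a response constraint, and the helper names are chosen for illustration; real compliance rules and their formalization depend on the application.

```python
def violates_response(trace, trigger, response):
    """True if some occurrence of `trigger` is not eventually followed by `response`."""
    pending = False
    for event in trace:
        if event == trigger:
            pending = True
        elif event == response:
            pending = False
    return pending

def check_compliance(traces, constraints):
    """Return {case_id: [names of violated constraints]} for non-compliant cases."""
    report = {}
    for case_id, trace in traces.items():
        violated = [name for name, is_violated in constraints.items() if is_violated(trace)]
        if violated:
            report[case_id] = violated
    return report

traces = {
    "case1": ["Register order", "Ship goods", "Receive payment"],
    "case2": ["Register order", "Ship goods"],
}
constraints = {
    "payment follows shipment":
        lambda t: violates_response(t, "Ship goods", "Receive payment"),
}

report = check_compliance(traces, constraints)
print(report)  # {'case2': ['payment follows shipment']}
```

A set of traces complies with C exactly when the report is empty.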

44 5 Process compliance
[Figure: design-time and runtime compliance checking]

46 6 Summary & outlook
So far, only the discovery of control flow aspects has been considered.
Development of approaches for the discovery of other important aspects, for example:
- Organizational mining [9], e.g., social network mining (e.g., who is working with whom?)
- Decision mining [10]: mining of transition conditions that determine the routing at alternative branchings (e.g., age > 20); combines process mining and data mining (decision trees); for combined approaches, see Chapter 8
- Change mining based on change logs
[9] W. M. P. van der Aalst, M. Song: Mining Social Networks: Uncovering interaction patterns in business processes. International Conference on Business Process Management (BPM 2004), LNCS 3080 (2004)
[10] A. Rozinat, W. M. P. van der Aalst: Decision Mining. International Conference on Business Process Management (BPM 2006), Vienna (2006)
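
One simple form of social network mining counts how often work is handed over from one actor to another within a case. The sketch below assumes a log in which every event records the executing actor; the log, the actor names, and the choice of the handover-of-work metric are illustrative assumptions.

```python
from collections import Counter

# Hypothetical log: each event is an (activity, actor) pair.
log = [
    [("Register order", "Ann"), ("Ship goods", "Bob"), ("Archive order", "Ann")],
    [("Register order", "Ann"), ("Ship goods", "Bob"), ("Archive order", "Cy")],
]

# handover[(x, y)]: how often an activity by x is directly followed by one by y.
handover = Counter()
for trace in log:
    for (_, source), (_, target) in zip(trace, trace[1:]):
        if source != target:
            handover[(source, target)] += 1

print(sorted(handover.items()))
# [(('Ann', 'Bob'), 2), (('Bob', 'Ann'), 1), (('Bob', 'Cy'), 1)]
```

The resulting counts can be read as a weighted directed graph over actors, which answers the question of who is working with whom.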
