Learning Hybrid Process Models from Events Process Discovery Without Faking Confidence

Size: px
Start display at page:

Download "Learning Hybrid Process Models from Events Process Discovery Without Faking Confidence"

Transcription

1 SIKS-Day, , Utrecht Learning Hybrid Process Models from Events Process Discovery Without Faking Confidence prof.dr.ir. Wil van der Aalst

2 Outline - Positioning process mining - Concurrency & semantics - Learning hybrid models - causal graphs - hybrid Petri nets - scalability - Closing remarks

3 Positioning PM

4 Uptake of process mining Start of process mining at TU/e (1999) Alpha miner & Heuristic miner ( ) First version of ProM (2004) Evolved from 29 plug-ins in 2004 to 274 plug-ins 2009 (ProM 5.2) to over 1500 plug-ins today. Conformance checking, other perspectives, prediction, etc. ( ) Process mining book, courses, uptake commercial tools, etc. ( ) publications on process mining (2016/17 incomplete) based on Scopus data

5 optimization operations management & research stochastics process mining formal methods & concurrency theory process automation & workflow management process science business process improvement business process management

6 process mining as the missing link

7 discovered using ILP miner 12,666 cases (orders) 80,609 events

8 Taxonomy: Not just CF discovery! Task Perspective Type Process discovery Control Flow (CF) only Offline Conformance checking CF + time Online Extending process models CF + data Decision making CF + resources

9 Taxonomy: Not just CF discovery! Task Perspective Type offline control-flow discovery Process discovery Control Flow (CF) only Offline Conformance checking CF + time Online Extending process models CF + data Decision making CF + resources

10 process model discovered using the inductive miner (showing only the most frequent paths)

11 zooming in

12 using conformance checking to see all deviations

13 bottleneck analysis: enriching the model with performance information

14 animating the event log showing real cases

15 seamless abstraction: one log many views

16 Data-driven & process-centric P data-driven process-centric event data M process models

17 Answering two types of questions performance questions Why are these cases late? Where are the real bottlenecks? Which resources are overloaded? P data-driven process-centric event data M process models How often is the four-eyes principle violated? Which activities are often skipped? Which resources cause deviations? conformance questions

18 Process mining results may hurt and trigger resistance, but this only supports the need for it.

19 Process Mining Software

20 1500+ plug-ins available covering the whole process mining spectrum >130k downloads

21

22 Interaction with industry challenges ideas data new techniques and approaches

23 Job done? Not really

24 Concurrency and Semantics

25 Boxes and arrows: What do they mean?

26 An example log and proper models 1000 cases and 5000 events Inductive miner, ILP miner, Alpha miner, etc.

27 Model with 3 concurrent activities Too difficult?

28 Concurrency & Semantics Much easier? Semantics?

29 Concurrency & Semantics Do frequencies make sense?

30 Concurrency & Semantics Do frequencies make sense? do not add up were performed in any order

31 Concurrency & Semantics Do times make sense?

32 Concurrency & Semantics Do times make sense?

33 Concurrency & Semantics Correct numbers

34 Concurrency & Semantics hours hours hours hours hours hours Correct times

35

36 faking confidence (even with fitness = 1.0)

37 Need to have a PhD in Petri nets?

38 How to combine the best of both worlds? informal when needed & fast and scalable precise (having formal semantics) whenever possible & useful commercial tools heuristic miner fuzzy miner etc. basic inductive miner ILP miner other region-based approaches etc.

39 Idea: Hybrid process models

40 People do not hate Petri nets (BPMN, etc.): they hate to be precise (when )! Vagueness can be a feature! Semi-structuredness can be deliberate!

41 Wil van der Aalst, Riccardo De Masellis, Chiara Di Francescomarino, Chiara Ghidini: Learning Hybrid Process Models From Events: Process Discovery Without Faking Confidence. BPM July 2016

42 Hybrid Petri nets have three types of arcs a sure and precise b strong causality logic can be captured* determined based on thresholds a sure but not precise b determined based on thresholds logic cannot be captured* a? unsure b weak causality * = easily / of a certain quality / within the representational bias

43 Phase 0: Get data

44 Phase 1: Learn a Causal Graph? strong causal relation R S weak causal relation R W start place order? send invoice send reminder cancel order? confirm payment pay prepare delivery end make delivery Different approaches possible (business as usual)

45 Phase 2: Learn a Hybrid Petri Net p1 start p2 place p3 order? send invoice send reminder cancel order? p5 confirm payment p7 pay p4 prepare delivery end p9 p6 make delivery p8

46 Phase 2: Learn a Hybrid Petri Net start place order? send invoice send reminder? cancel order confirm payment green = subsets of strong red = strong causalities causalities having a that do not have corresponding a place corresponding place pay prepare delivery end make delivery red green = strong = subsets causalities of strong that causalities do not have having a a corresponding place place p1 start p2 place p3 order? pay send invoice p4 send reminder? prepare delivery cancel order p5 confirm payment p7 end p9 p6 make delivery p8

47 Phase 2: Learn a Hybrid Petri Net start place order? send invoice send reminder cancel order? pay prepare delivery confirm payment make delivery end p1 start p2 place p3 order? send invoice send reminder cancel order? pay p4 prepare delivery p5 confirm payment p7 end p9 p6 make delivery p8

48 Generate candidate places and evaluate cancel order confirm payment end {1} {2} {3} {1,2} {1,3} {2,3} {1,2,3} {1} {2} {3} {1,2} {1,3} {2,3} {1,2,3} cancel order confirm payment end make delivery make delivery

49 ProM Implementation sliders start place order? send invoice send reminder cancel order? confirm payment pay prepare delivery end make delivery fuzzy causal graph

50 ProM Implementation sliders sure arc unsure arc fuzzy causal graph p1 start p2 place p3 order? pay send invoice p4 send reminder? prepare delivery cancel order p5 p6 confirm payment make delivery p7 p8 end p9 fuzzy Petri net

51 Dealing with Big event logs

52 Dealing With Big Data Two expensive steps Building a directly-follows graphs with frequencies. Testing candidate places. Both steps can be easily distributed. It is possible to prune the set of candidate places. See Spark implementation of a similar approach by Long Cheng.

53 Summary

54 Process Mining process discovery

55 p1 start p2 place p3 order? send invoice send reminder cancel order? Learning Hybrid Process Models from Events pay p4 prepare delivery p5 p6 confirm payment make delivery p7 p8 end p9 Process Discovery Without Faking Confidence

56 Closing remarks

57 Times are great for SIKS! Gartner Hype Cycle for Emerging Technologies, 2017 But,

58

59 IT Pizza Metaphor

60 The focus is mostly on the toppings (= applications)

61 , but the base actually matters!

62

63 2000 Ranking Dutch scientists listed (H>40) Top 10: US only hoogvlakte? US 975 / 1864 = 52% NL 51 / 1864 = 2.7% US 975 / 326m = 3% NL 51 / 17m = 3% prob. top 10 US only = =

64 How can SIKS help to strengthen the foundation of our digital society?

65 Thanks

66 Pointers W.M.P. van der Aalst. Process Mining: Data Science in Action. Springer-Verlag, Berlin,