Risk Management: an indispensable endeavour to build and to ensure safe operation of installations. Peter Kafka.


Transcription

1 Welcome (欢迎) to Risk Management: an indispensable endeavour to build and to ensure safe operation of installations. RelConsult

2 List of Items
What we should know; What we should ask; What we should do; How we should work; What I finally want to stress.

3 What we should know
The world is changing rapidly, and the change is man-made, towards more freedom, more wealth, and growing peace. Man-made technologies are not free of risk; absolute safety is not possible. We see examples of disasters daily: in railway systems, in motorcar traffic, in coal mines, in space and aviation transport, in the process industry and off-shore plants, and in fireworks production and storage (for a happy new year!). We should not be pessimistic: when we improve systems and plants by engineering, the result is normally less risky products and installations. This is valid for many technologies if we consider the trend over time.

4 What we should know
Traditional engineers are educated to realise the function of a system. Safety engineers and risk experts need the capability to dream about the malfunctions of a system. "Hey, that's easy!" No, no, gentlemen! Please consider: in terms of set theory, the function of a system is the single case among all the other possibilities of malfunction. As a consequence, we need safety engineers who never tire of learning from the past and who are ready to shape the future for a less risky world.

5 What we should know
RAMS: Reliability, Availability, Maintainability, Safety. Risk is a global term, interrelated with all elements of RAMS.
[Figure: Risk linked to R, A, M and S.]

6 What we should know
Typical Modelling Approach in Safety Engineering
[Figure: Historical Experience, System Model, Future Behaviour; labelled "Trial and Error (past)" and "Living Process (modern)".]

7 What we should know
Traditionally, the so-called deterministic approach is applied to resolve all safety cases and problems in installations. Nowadays, we acknowledge the extension of this approach towards the probabilistic approach. It opens a window to more insight into the characteristics of the installation and the consequences of malfunctions of components and systems. Using this probabilistic approach, risk assessment and risk management can be realised. Again, it is not a competition between these two approaches; the more modern one is a fruitful extension of the more traditional one. Both are required in safety engineering.

8 What we should ask
Risk Management must be a living mission for each installation and process. Therefore the three basic questions are always on the table: What can go wrong? What are the frequencies? What are the consequences? To be ready for these prognostic questions we have to ask retrospectively: please tell me what happened in the past in your installation. It is a shame that many organisations classify risk information as their own property; they hinder global progress in safety engineering. Nobody should get nervous when we use the term risk. It is the right term for the three questions, and these three questions are the best ones for a safety engineer.

9 What we should ask
Some industries routinely ask for raw data from the field and evaluate these data statistically to be ready for the question: what has happened with my product, and how frequently? This is valid for the off-shore industry (OREDA), aviation (EADS), some telecom organisations (e.g. CNET), the nuclear industry (e.g. T-book (S), ZEDB (D)), EDF (Euredata), the motorcar industry, etc. Based on such verbal and statistical information, Risk Management can rest on hard facts from the real world. But again, do not be pessimistic: Risk Management based on so-called generic data (lacking plant-specific data) is more appropriate than doing nothing!
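
The statistical evaluation of field data described on this slide can be sketched as follows. This is a minimal illustration and not the method of any of the databases named above: it assumes a constant failure rate, and it combines generic data with plant-specific evidence through a Gamma-Poisson (conjugate Bayes) update. The function names and all numbers are hypothetical.

```python
def point_estimate(failures: int, hours: float) -> float:
    """Maximum-likelihood failure rate (per hour) under a constant-rate model."""
    if hours <= 0:
        raise ValueError("operating time must be positive")
    return failures / hours


def bayes_update(prior_failures: float, prior_hours: float,
                 failures: int, hours: float) -> float:
    """Posterior mean failure rate for a Gamma prior and Poisson evidence.

    The generic data act as a Gamma prior with shape `prior_failures`
    and rate `prior_hours`; plant-specific evidence adds to both.
    """
    return (prior_failures + failures) / (prior_hours + hours)


# Hypothetical numbers: generic data (5 failures in 1e6 h) and
# plant-specific data (1 failure in 5e4 h).
generic = point_estimate(5, 1e6)
plant = point_estimate(1, 5e4)
combined = bayes_update(5, 1e6, 1, 5e4)
print(generic, plant, combined)
```

With little plant-specific operating time the combined estimate stays close to the generic value, which is one way to read the slide's point that generic data are better than nothing.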

10 What we should do
Risk is an inherent characteristic of all technical installations. Here, we concentrate on Technical Risk Management, not on project risk or financial risk. Ideal Risk Management requires three essential documents and the subsequent managerial and technical activities throughout the life cycle of the installation: the Global Risk Strategy, the Risk Management Program, and the Risk Management Plan. As the names indicate, the content proceeds from general statements to detailed working procedures. Establishing these documents is the first step; the doing is the second one.

11 What we should do
The main content of these documents focuses on four issues: establishment of the global risk goal; transformation of this goal into the design and operation of the installation; proof of compliance of the given risk with the global risk goal; execution of all managerial actions to control the risk level throughout the life cycle. The documents and the respective actions have to be inaugurated and supported by the top management of the installation. This is required because shut-downs, incidents or accidents within an installation dominate success or failure in the market place. The main content of the documents must be known to the various contractors. Details of the documents' content are given in the paper.

12 What we should do
The establishment of the global risk goal can be oriented on laws and regulations, on the market, or on specific public interests. For some technologies there exist worldwide or country-specific regulations for global risk goals (e.g. aviation, nuclear, process industry, transport systems). In some countries such regulations are at the level of laws (e.g. NL, CH, USA); in other countries they represent best practice. Orientation on the market requires data collection and data evaluation. A general harmonisation of risk goals across countries and industries seems to be neither wise nor possible. However, large differences in risk goals can force an unwanted strangulation of an entire industrial sector.

13 What we should do
The transformation of the global risk goal into local goals and into the design and operation of the installation is complex and hard work. It is typically a top-down process.
[Figure: hierarchy of Global Risk Goal, Environment Level, Installation Level, System Level, Component Level, Specific Targets; top-down for targets, bottom-up for compliance.]

14 What we should do
The proof of compliance of the given risk with the global risk goal is typically an aggregation process (bottom-up), and it requires the risk contributions from all the different components and subsystems of an installation. In practice, this process is never straightforward. Normally iterations are required, including redesign and recalculation. The process was illustrated in yellow in the last picture. To perform this process we need a bundle of adequate methods, tools and procedures; some of them are illustrated in the next pictures.

15 How we should work
Nowadays, a well-established and mature tool box for the essential tasks in Risk Management is available. Essential tasks are: Systems Familiarisation and Analysis; Reliability and Risk Assessment, including Internal and External Event Analysis; Human Factor Assessment; Hardware, Software and Paperware Assessment; Data Collection and Evaluation; Uncertainty Assessment; Display and Interpretation of Results; Managerial and Controlling Actions (with respect to risk goals, costs and time frame).

16 How we should work
Well-established tools are the following: Failure Mode and Effect Analysis (FMEA); Block Diagram Method (BDM); Fault Tree Analysis (FTA); Event Tree Analysis (ETA); Probabilistic Safety Assessment (PSA); Graph Methods like Petri Net and Markov; Task Analysis (e.g. THERP); Stochastic Structural Reliability Assessment; Data Evaluation (e.g. Bayes Approach); Uncertainty Simulation. In the following, some of these tools are shown.

17 How we should work
FMEA: a structured framework to identify failure modes, causes and effects (DIN 25448, IEC 60812, MIL-STD-1629A).
[Figure: Cause 1 to Cause n, Fault / Failure Possibility, Effect 1 to Effect n; 1 to i.]

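18 How we should work
FTA: simplified cut set example. Fault Tree Principle: it represents a qualitative / quantitative structure for what the causes of a top event are.
[Figure: the TOP Event is an OR gate over A and the intermediate event BD + BE + CD + CE; that intermediate event is an AND gate over (B + C) and (D + E); B + C is an OR gate over B and C; D + E is an OR gate over D and E.]
TE = A + BD + BE + CD + CE
If A, B, C, D, E = 0.1, then TE = 0.045.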
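
The cut set expression TE = A + BD + BE + CD + CE can be quantified with a short sketch. It assumes independent basic events and shows two common readings: the rare-event approximation (sum of cut set probabilities) and the exact value obtained by enumerating all states. The function names are hypothetical. Note that with all five probabilities set to 0.1 this sketch gives about 0.14 (rare-event) and about 0.1325 (exact), not the 0.045 quoted on the slide; the slide figure presumably rests on inputs that the transcription does not preserve. As a pure assumption, the rare-event sum would yield 0.045 if A were 0.005 and the other four events 0.1.

```python
from itertools import product
from math import prod

# Minimal cut sets of TE = A + BD + BE + CD + CE
CUT_SETS = [{"A"}, {"B", "D"}, {"B", "E"}, {"C", "D"}, {"C", "E"}]


def rare_event(p: dict) -> float:
    """Rare-event approximation: sum of the cut set probabilities."""
    return sum(prod(p[e] for e in cs) for cs in CUT_SETS)


def exact(p: dict) -> float:
    """Exact top event probability by enumerating all basic event states."""
    events = sorted(p)
    total = 0.0
    for state in product([False, True], repeat=len(events)):
        failed = {e for e, s in zip(events, state) if s}
        # The top event occurs if at least one cut set has fully failed.
        if any(cs <= failed for cs in CUT_SETS):
            total += prod(p[e] if e in failed else 1 - p[e] for e in events)
    return total


p = {e: 0.1 for e in "ABCDE"}
print(rare_event(p), exact(p))
```

The enumeration is only practical for small trees (2^n states); it is used here because it makes the gap between the approximation and the exact value easy to check.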

19 How we should work
ETA. Event Tree Principle: it represents a qualitative / quantitative structure for what can happen. Using probabilities, we can perform the event tree quantification.
[Figure: Initiating Event IE i (e.g. Steering Failure), Event Sequence Conditions ES k1 and ES k2 with branches p (yes) and p (no), Damage State TDS j.]
Large event trees consist of dozens of branches.
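
Event tree quantification can be sketched as follows: the frequency of each sequence is the initiating event frequency multiplied by the branch probabilities along its path. The sketch assumes a full binary tree with independent conditions, which real event trees often are not; the function name, the "yes"/"no" labels and all numbers are hypothetical.

```python
def quantify(ie_frequency: float, p_yes: list) -> dict:
    """Return the frequency of every sequence of a full binary event tree.

    `p_yes[k]` is the probability that event sequence condition k is met.
    Each key of the result is a tuple of "yes"/"no" answers, one per condition.
    """
    sequences = {(): ie_frequency}
    for p in p_yes:
        branched = {}
        for path, freq in sequences.items():
            branched[path + ("yes",)] = freq * p
            branched[path + ("no",)] = freq * (1 - p)
        sequences = branched
    return sequences


# Hypothetical example: initiating event 0.1 per year, two conditions.
tree = quantify(0.1, [0.9, 0.99])
for path, freq in tree.items():
    print(path, freq)
```

The sequence frequencies always sum to the initiating event frequency, which is a useful consistency check when a tree grows to dozens of branches.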

20 How we should work
PRA Logic is a tricky combination of FMEA, FTA and ETA.
[Figure: PRA Logic; labels: FMEA, Fault Tree, Event Tree, Decision Tree, Basic Events, Causes, Initiating Events, Consequences.]

21 How we should work
Allocation
[Figure: Basic Events, Causes, Fault Tree, Initiating Events, Event Tree, Consequences; Proof / Review.]
Example (serial system): n items with λ = 10^-4/h for each item create a problem each hour (Initiating Events: a problem each hour). Fault Tree: n = 100 items with UA = 10^-4/d for each item create one problem per 100 demands. Consequences: a problem every 100 h.
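
The arithmetic behind this allocation example can be sketched for a serial system, in which any single item failure causes a problem. The sketch takes the rates as 10^-4, the reading under which "100 items, one problem per 100 demands" works out. The item count of 10,000 for the first chain is an assumption inferred from "a problem each hour"; the transcription does not state it. The function names are hypothetical.

```python
def series_failure_rate(n: int, lam: float) -> float:
    """Failure rate of n items in series, each with constant rate lam."""
    return n * lam


def series_unavailability(n: int, ua: float) -> float:
    """Probability that at least one of n independent items fails on demand."""
    return 1 - (1 - ua) ** n


def consequence_frequency(ie_frequency: float, unavailability: float) -> float:
    """Frequency of demands that meet a failed system."""
    return ie_frequency * unavailability


ie = series_failure_rate(10_000, 1e-4)   # assumed n; about one demand per hour
ua = series_unavailability(100, 1e-4)    # about 1 failure per 100 demands
freq = consequence_frequency(ie, ua)     # about one problem per 100 h
print(ie, ua, 1 / freq)
```

For small per-item values the exact unavailability is close to the simple product n x UA, which is the approximation the slide uses.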

22 How we should work
Markov Modelling / Chains. Simple example, control system: two processors, 1 active, 1 hot backup. Fault coverage may be imperfect:
c = Pr{fault is detected and recovery is successful, given that a processor fault occurs}
1 - c = Pr{fault is not detected or recovery is unsuccessful, given that a processor fault occurs}
λ = failure rate, µ = repair rate
[Figure: state diagram with states 2, 1 and F; transitions 2 to 1 at rate 2cλ, 2 to F at rate 2(1-c)λ, 1 to F at rate λ, and 1 to 2 at rate µ.]
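
One quantity that can be read off this Markov model is the mean time to failure (MTTF) when starting with both processors working. The sketch below assumes the transitions as read from the diagram (2 to 1 at 2cλ, 2 to F at 2(1-c)λ, 1 to F at λ, repair 1 to 2 at µ) and treats F as absorbing; the direction of the repair arc and the absence of repair from F are assumptions. The function name and the numbers are hypothetical.

```python
def mttf(lam: float, mu: float, c: float) -> tuple:
    """Mean time to reach F from state 2 and from state 1.

    Solves the two linear equations of the absorbing chain:
        -2*lam*T2 + 2*c*lam*T1   = -1
         mu*T2    - (lam+mu)*T1  = -1
    by Cramer's rule.
    """
    if lam <= 0 or mu < 0 or not 0 <= c <= 1:
        raise ValueError("need lam > 0, mu >= 0 and 0 <= c <= 1")
    det = 2 * lam * (lam + mu * (1 - c))
    t2 = (lam + mu + 2 * c * lam) / det
    t1 = (2 * lam + mu) / det
    return t2, t1


# Hypothetical rates per hour: compare perfect and imperfect coverage.
for coverage in (1.0, 0.99, 0.9):
    print(coverage, mttf(1e-3, 0.1, coverage)[0])
```

Running the loop shows how strongly the MTTF depends on the coverage c once repair is fast compared with failure, which is the point of modelling imperfect coverage at all.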

23 How we should work
Simple Example: Petri Net
[Figure: Input Place, Transition, Output Place, Condition Place.]
Wording: if the condition fires the transition, the input places lose their tokens and the output places receive tokens.
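
The firing rule in this wording can be sketched in a few lines. The sketch treats the condition place as one more input place that must hold a token, which is an assumption about the figure; it also uses the basic rule of one token per arc. All names are hypothetical.

```python
def enabled(marking: dict, inputs: list) -> bool:
    """A transition is enabled when every input place holds a token."""
    return all(marking.get(place, 0) >= 1 for place in inputs)


def fire(marking: dict, inputs: list, outputs: list) -> dict:
    """Return the new marking after firing: inputs lose a token, outputs gain one."""
    if not enabled(marking, inputs):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place in inputs:
        new_marking[place] -= 1
    for place in outputs:
        new_marking[place] = new_marking.get(place, 0) + 1
    return new_marking


before = {"input": 1, "condition": 1, "output": 0}
after = fire(before, inputs=["input", "condition"], outputs=["output"])
print(after)
```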

24 How we should work
Structural Reliability (simplified one-dimensional case)
[Figure: probability density functions (pdf) of Stress and Strength over N/mm²; labels: Safety Factor (traditional approach), a measure for probability of failure.]
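
The one-dimensional stress-strength case can be quantified with a short sketch. It assumes that stress and strength are independent and normally distributed, which the slide does not state; under that assumption the failure probability is Φ(-β) with β = (mean strength - mean stress) / sqrt(σ_strength² + σ_stress²). The function names and the numbers (in N/mm²) are hypothetical.

```python
from math import erfc, sqrt


def reliability_index(mean_strength: float, sd_strength: float,
                      mean_stress: float, sd_stress: float) -> float:
    """Distance of the mean safety margin from zero, in standard deviations."""
    return (mean_strength - mean_stress) / sqrt(sd_strength ** 2 + sd_stress ** 2)


def failure_probability(mean_strength: float, sd_strength: float,
                        mean_stress: float, sd_stress: float) -> float:
    """P(stress > strength) for independent normal stress and strength."""
    beta = reliability_index(mean_strength, sd_strength, mean_stress, sd_stress)
    return 0.5 * erfc(beta / sqrt(2))  # standard normal CDF at -beta


# Hypothetical values in N/mm²: the central safety factor is 400 / 300.
print(400 / 300, failure_probability(400, 30, 300, 40))
```

Two designs with the same safety factor can have very different failure probabilities if their scatter differs, which is why the probabilistic view extends the traditional one.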

25 What I finally want to stress
To design a safe installation and to operate it safely is of vital interest. No installation can be 100% safe; risk is inherent in all installations. Risk Management is the challenge to minimise the inherent risk level and to control that risk level throughout the life cycle of the installation. This approach identifies and visualises the main risk contributors, and therefore it offers the road map for the most effective improvements and risk management actions. Probabilistic Risk Assessment is not in competition with the deterministic approach; it is a powerful extension. Methods and tools are available; a larger problem is the availability of experts. Data are available; one has to be active to identify the sources. Shut-downs, incidents and accidents can create immense costs and damage; powerful risk management is therefore cheaper. Cookbooks for risk management are inadequate; we need "Safety Instinct".

26 Thank you very much for your attention.