A New Approach to System Safety Engineering

Size: px
Start display at page:

Download "A New Approach to System Safety Engineering"

Transcription

1 A New Approach to System Safety Engineering Nancy G. Leveson, MIT System Safety Engineering: Back to the Future

2 Outline of Day 1 Why a new approach is needed STAMP A new accident model based on system theory (control) Uses Accident and incident investigation Hazard Analysis (STPA) and Design for Safety Non-Advocate Safety Assessment Cultural and organizational risk analysis

3 Why need a new approach Traditional approaches developed for relatively simple electro-mechanical systems Accidents in complex, software-intensive systems are changing their nature We need more effective techniques in these new systems

4 Chain-of-Events Model Explains accidents in terms of multiple events, sequenced as a forward chain over time. Simple, direct relationship between events in chain Events almost always involve component failure, human error, or energy-related event Forms the basis for most safety-engineering and reliability engineering analysis: e,g, FTA, PRA, FMECA, Event Trees, etc. and design: e.g., redundancy, overdesign, safety margins,.

5 Chain-of-events example

6 Accident with No Component Failures

7 Types of Accidents Component Failure Accidents Single or multiple component failures Usually assume random failure System Accidents Arise in interactions among components Related to interactive complexity and tight coupling Exacerbated by introduction of computers and software New technology introduces unknowns and unk-unks

8 Interactive Complexity Critical factor is intellectual manageability A simple system has a small number of unknowns in its interactions (within system and with environment) Interactively complex (intellectually unmanageable) when level of interactions reaches point where can no longer be thoroughly Planned Understood Anticipated Guarded against

9 Tight Coupling Tightly coupled system is one that is highly interdependent Each part linked to many other parts Failure or unplanned behavior in one can rapidly affect status of others Processes are time-dependent and cannot wait Little slack in system Sequences are invariant, only one way to reach a goal System accidents are caused by unplanned and dysfunctional interactions Coupling increases number of interfaces and potential interactions

10 Other Types of Complexity Non-linear complexity Cause and effect not related in an obvious way Dynamic Complexity Related to changes over time Decompositional Structural decomposition not consistent with functional decomposition

11 Limitations of Chain-of-Events Model Social and organizational factors in accidents System accidents Software Adaptation Systems are continually changing Systems and organizations migrate toward accidents (states of high risk) under cost and productivity pressures in an aggressive, competitive environment

12 Human error Limitations (2) Define as deviation from normative procedures, but operators always deviate from standard procedures Normative vs. effective procedures Sometimes violation of rules has prevented accidents Cannot effectively model human behavior by decomposing it into individual decisions and acts and studying it in isolation from Physical and social context Value system in which takes place Dynamic work process Less successful actions are natural part of search by operator for optimal performance

13 Mental Models

14

15 Exxon Valdez Shortly after midnight, March 24, 1989, tanker Exxon Valdez ran aground on Bligh Reef (Alaska) 11 million gallons of crude oil released Over 1500 miles of shoreline polluted Exxon and government put responsibility on tanker Captain Hazelwood, who was disciplined and fired Was he to blame? State-of-the-art iceberg monitoring equipment promised by oil industry, but never installed. Exxon Valdez traveling outside normal sea lane in order to avoid icebergs thought to be in area Radar station in city of Valdez, which was responsible for monitoring the location of tanker traffic in Prince William Sound, had replaced its radar with much less powerful equipment. Location of tankers near Bligh reef could not be monitored with this equipment.

16 Congressional approval of Alaska oil pipeline and tanker transport network included an agreement by oil corporations to build and use double-hulled tankers. Exxon Valdez did not have a double hull. Crew fatigue was typical on tankers In 1977, average oil tanker operating out of Valdez had a crew of 40 people. By 1989, crew size had been cut in half. Crews routinely worked hour shifts, plus extensive overtime Exxon Valdez had arrived in port at 11 pm the night before. The crew rushed to get the tanker loaded for departure the next evening Coast Guard at Valdez assigned to conduct safety inspections of tankers. It did not perform these inspections. It s staff had been cut by one-third.

17 Tanker crews relied on the Coast Guard to plot their position continually. Coast Guard operating manual required this. Practice of tracking ships all the way out to Bligh reef had been discontinued. Tanker crews were never informed of the change. Spill response teams and equipment were not readily available. Seriously impaired attempts to contain and recover the spilled oil. Summary: Safeguards designed to avoid and mitigate effects of an oil spill were not in place or were not operational By focusing exclusively on blame, the opportunity to learn from mistakes is lost. Postscript: Captain Hazelwood was tried for being drunk the night the Exxon Valdez went aground. He was found not guilty

18 Hierarchical models

19 Hierarchical analysis example

20 The Role of Software in Accidents

21 The Computer Revolution General Purpose Machine + Software = Special Purpose Machine Software is simply the design of a machine abstracted from its physical realization Machines that were physically impossible or impractical to build become feasible Design can be changed without retooling or manufacturing Can concentrate on steps to be achieved without worrying about how steps will be realized physically

22 Advantages = Disadvantages Computer so powerful and useful because has eliminated many of physical constraints of previous technology Both its blessing and its curse No longer have to worry about physical realization of our designs But no longer have physical laws that limit the complexity of our designs.

23 The Curse of Flexibility Software is the resting place of afterthoughts No physical constraints To enforce discipline in design, construction, and modification To control complexity So flexible that start working with it before fully understanding what need to do And they looked upon the software and saw that it was good, but they just had to add one other feature

24 Abstraction from Physical Design Software engineers are doing physical design Autopilot Expert Requirements Software Engineer Design of Autopilot Most operational software errors related to requirements (particularly incompleteness) Software failure modes are different Usually does exactly what you tell it to do Problems occur from operation, not lack of operation Usually doing exactly what software engineers wanted

25 Safety vs. Reliability Safety and reliability are NOT the same Sometimes increasing one can even decrease the other. Making all the components highly reliable will have no impact on system accidents. For relatively simple, electro-mechanical systems with primarily component failure accidents, reliability engineering can increase safety. But accidents in high-tech systems are changing their nature, and we must change our approaches to safety accordingly.

26 It s only a random failure, sir! It will never happen again.

27 Reliability Engineering Approach to Safety Reliability: The probability an item will perform its required function in the specified manner over a given time period and under specified or assumed conditions. (Note: Most accidents result from errors in specified requirements or functions and deviations from assumed conditions) Concerned primarily with failures and failure rate reduction: Redundancy Safety factors and margins Derating Screening Timed replacements

28 Reliability Engineering Approach to Safety Assumes accidents are caused by component failure Positive: Techniques exist to increase component reliability Failure rates in hardware are quantifiable Negative: Omits important factors in accidents May decrease safety Many accidents occur without any component failure Caused by equipment operation outside parameters and time limits upon which reliability analyses are based. Caused by interactions of components all operating according to specification. Highly reliable components are not necessarily safe

29 Software-Related Accidents Are usually caused by flawed requirements Incomplete or wrong assumptions about operation of controlled system or required operation of computer Unhandled controlled-system states and environmental conditions Merely trying to get the software correct or to make it reliable will not make it safer under these conditions.

30 Software-Related Accidents (2) Software may be highly reliable and correct and still be unsafe: Correctly implements requirements but specified behavior unsafe from a system perspective. Requirements do not specify some particular behavior required for system safety (incomplete) Software has unintended (and unsafe) behavior beyond what is specified in requirements.

31 MPL Requirements Tracing Flaw SYSTEM REQUIREMENTS 1. The touchdown sensors shall be sampled at 100-HZ rate. 2. The sampling process shall be initiated prior to lander entry to keep processor demand constant. 3. However, the use of the touchdown sensor data shall not begin until 12 m above the surface. SOFTWARE REQUIREMENTS 1. The lander flight software shall cyclically check the state of each of the three touchdown sensors (one per leg) at 100-HZ during EDL. 2. The lander flight software shall be able to cyclically check the touchdown event state with or without touchdown event generation enabled.????

32 Reliability Approach to Software Safety Using standard engineering techniques of Preventing failures through redundancy Increasing component reliability Reuse of designs and learning from experience will not work for software and system accidents

33 Preventing Failures Through Redundancy Redundancy simply makes complexity worse NASA experimental aircraft example Any solutions that involve adding complexity will not solve problems that stem from intellectual unmanageability and interactive complexity Majority of software-related accidents caused by requirements errors Does not work for software even if accident is caused by a software implementation error Software errors not caused by random wear-out failures

34 Increasing Software Reliability (Integrity) Appearing in many new international standards for software safety (e.g., 61508) Safety integrity level (SIL) Sometimes give reliability number (e.g., 10-9 ) Can software reliability be measured? What does it mean? What does it have to do with safety? Safety involves more than simply getting the software correct : Example: altitude switch 1. Signal safety-increasing Require any of three altimeter report below threshhold 1. Signal safety-decreasing Require all three altimeter to report below threshhold

35 Software Component Reuse One of most common factors in software-related accidents Software contains assumptions about its environment Accidents occur when these assumptions are incorrect Therac-25 Ariane 5 U.K. ATC software Mars Climate Orbiter Most likely to change the features embedded in or controlled by the software COTS makes safety analysis more difficult

36 Safety and (component or system) reliability are different qualities in complex systems! Increasing one will not necessarily increase the other. So what do we do?

37 A Possible Solution Enforce discipline and control complexity Limits have changed from structural integrity and physical constraints of materials to intellectual limits Improve communication among engineers Build safety in by enforcing constraints on behavior Controller contributes to accidents not by failing but by: 1. Not enforcing safety-related constraints on behavior 2. Commanding behavior that violates safety constraints

38 Example (Chemical Reactor) System Safety Constraint: Water must be flowing into reflux condenser whenever catalyst is added to reactor Software (Controller) Safety Constraint: Software must always open water valve before catalyst valve

39 Conclusion The primary safety problem in complex, softwareintensive systems is the lack of appropriate constraints on design The job of the system safety engineer is to Identify the constraints necessary to maintain safety Ensure the system (including software) design enforces them

40 Introduction to System Safety Engineering

41 A Non-System Safety Example: Nuclear Power (Defense in Depth) Multiple independent barriers to propagation of malfunction Emphasis on component reliability and use of lots of redundancy Handling single failures (no single failure of any components will disable any barrier) Protection ( safety ) systems: automatic system shutdown Emphasis on reliability and availability of shutdown system and physical system barriers (using redundancy)

42 Why is this effective? Relatively slow pace of basic design changes Use of well-understood and debugged designs Ability to learn from experience Conservatism in design Slow introduction of new technology Limited interactive complexity and coupling

43 System Safety Grew out of ballistic missile systems of 1960s Emphasizes building in safety rather than adding it on to a completed design Looks at systems as a whole, not just components A top-down systems approach to accident prevention Takes a larger view of accident causes than just component failures (includes interactions among components) Emphasizes hazard analysis and design to eliminate or control hazards Emphasizes qualitative rather than quantitative approaches

44 System Safety Overview A planned, disciplined, and systematic approach to preventing or reducing accidents throughout the life cycle of a system. Organized common sense (Mueller, 1968) Primary concern is the management of hazards Hazard Through identification analysis evaluation design elimination management control MIL-STD-882

45 Analysis: System Safety Overview (2) Hazard analysis and control is a continuous, iterative process throughout system development and use. Design: Hazard resolution precedence 1. Eliminate the hazard 2. Prevent or minimize the occurrence of the hazard 3. Control the hazard if it occurs 4. Minimize damage Management: Audit trails, communication channels, etc.

46 System Safety in Software-Intensive Systems While system safety approach was developed for and works for complex, technologically advanced systems, new methods are required Software particularly stretches traditional methods Need new hazard analysis and other approaches for complex, software-intensive systems Rest of day 1 shows new way to implement the basic system safety approach

47 STAMP A new accident causation model using Systems Theory (vs. Reliability Theory)

48 Introduction to Systems Theory Ways to cope with complexity 1. Analytic Reduction 2. Statistics

49 Analytic Reduction Divide system into distinct parts for analysis Physical aspects Separate physical components Behavior Events over time Examine parts separately Assumes such separation possible: 1. The division into parts will not distort the phenomenon Each component or subsystem operates independently Analysis results not distorted when consider components separately

50 Analytic Reduction (2) 2. Components act the same when examined singly as when playing their part in the whole Components or events not subject to feedback loops and non-linear interactions 3. Principles governing the assembling of components into the whole are themselves straightforward Interactions among subsystems simple enough that can be considered separate from behavior of subsystems themselves Precise nature of interactions is known Interactions can be examined pairwise Called Organized Simplicity

51 Statistics Treat system as a structureless mass with interchangeable parts Use Law of Large Numbers to describe behavior in terms of averages Assumes components are sufficiently regular and random in their behavior that they can be studied statistically Called Unorganized Complexity

52 Complex, Software-Intensive Systems Too complex for complete analysis Separation into (interacting) subsystems distorts the results The most important properties are emergent Too organized for statistics Too much underlying structure that distorts the statistics Called Organized Complexity

53

54 Systems Theory Developed for biology (von Bertalanffly) and engineering (Norbert Weiner) Basis of system engineering and system safety ICBM systems of the 1950s Developed to handle systems with organized complexity

55 Systems Theory (2) Focuses on systems taken as a whole, not on parts taken separate Some properties can only be treated adequately in their entirety, taking into account all social and technical aspects These properties derive from relationships among the parts of the system How they interact and fit together Two pairs of ideas 1. Hierarchy and emergence 2. Communication and control

56 Hierarchy and Emergence Complex systems can be modeled as a hierarchy of organizational levels Each level more complex than one below Levels characterized by emergent properties Irreducible Represent constraints on the degree of freedom of components at lower level Safety is an emergent system property It is NOT a component property It can only be analyzed in the context of the whole

57

58 Example Control Structure

59 Communication and Control Hierarchies characterized by control processes working at the interfaces between levels A control action imposes constraints upon the activity at a lower level of the hierarchy Open systems are viewed as interrelated components kept in a state of dynamic equilibrium by feedback loops of information and control Control in open systems implies need for communication

60 Control processes operate between levels Controller Control Actions Model of Process Feedback Process models must contain: - Required relationship among process variables - Current state (values of process variables - The ways the process can change state Controlled Process

61 Relationship Between Safety and Process Models Accidents occur when models do not match process and Incorrect control commands given Correct ones not given Correct commands given at wrong time (too early, too late) Control stops too soon (Note the relationship to system accidents and to software s role in accidents)

62 Relationship Between Safety and Process Models (2) How do they become inconsistent? Wrong from beginning Missing or incorrect feedback Not updated correctly Time lags not accounted for Resulting in Uncontrolled disturbances Unhandled process states Inadvertently commanding system into a hazardous state Unhandled or incorrectly handled system component failures

63 Relationship Between Safety and Human Mental Models Explains most human/computer interaction problems Pilots and others are not understanding the automation What did it just do? Why won t it let us do that? Why did it do that? What caused the failure? What will it do next? What can we do so it does How did it get us into this state? not happen again? How do I get it to do what I want? Or don t get feedback to update mental models or disbelieve it

64 Mental Models

65 Relationship Between Safety and Human Mental Models (2) Also explains developer errors. May have incorrect model of Required system or software behavior for safety Development process Physical laws Etc.

66 STAMP Systems-Theoretic Accident Model and Processes Accidents are not simply an event or chain of events but involve a complex, dynamic process Based on systems and control theory Accidents arise from interactions among humans, machines, and the environment (not just component failures) prevent failures _ enforce safety constraints on system behavior

67 STAMP (2) View accidents as a control problem O-ring did not control propellant gas release by sealing gap in field joint Software did not adequately control descent speed of Mars Polar Lander Events are the result of the inadequate control Result from lack of enforcement of safety constraints

68 A Broad View of Control Does not imply need for a controller Component failures and dysfunctional interactions may be controlled through design (e.g., redundancy, interlocks, fail-safe design) or through process Manufacturing processes and procedures Maintenance processes Operations Does imply the need to enforce the safety constraints in some way New model includes what do now and more

69 STAMP (3) Safety is an emergent property that arises when system components interact with each other within a larger environment A set of constraints related to behavior of system components enforces that property Accidents occur when interactions violate those constraints (a lack of appropriate constraints on the interactions) Controllers embody or enforce those constraints

70 Example Safety Constraints Build safety in by enforcing safety constraints on behavior Controllers contribute to accidents not by failing but by: 1. Not enforcing safety-related constraints on behavior 2. Commanding behavior that violates safety constraints System Safety Constraint: Water must be flowing into reflux condenser whenever catalyst is added to reactor Software Safety Constraint: Software must always open water valve before catalyst valve

71 STAMP (4) Systems are not treated as a static design A socio-technical system is a dynamic process continually adapting to achieve its ends and to react to changes in itself and its environment Migration toward states of high risk Preventing accidents requires designing a control structure to enforce constraints on system behavior and adaptation

72 Example Control Structure

73 Intelligent Cruise Control

74

75 Accident Causality Accidents occur when Control structure or control actions do not enforce safety constraints Unhandled environmental disturbances or conditions Unhandled or uncontrolled component failures Dysfunctional (unsafe) interactions among components Control structure degrades over time (asynchronous evolution) Control actions inadequately coordinated among multiple controllers

76 Dysfunctional Controller Interactions Boundary areas Controller 1 Controller 2 Process 1 Process 2 Overlap areas (side effects of decisions and control actions) Controller 1 Controller 2 Process

77 Uncoordinated Control Agents UNSAFE SAFE STATE BOTH TCAS TCAS ATC and ATC provides provide uncoordinated instructions & independent to to both planes instructions Control Agent (TCAS) Instructions Instructions No Coordination Instructions Instructions Control Agent (ATC)

78 Root Cause Analysis Example: Exxon Valdez Congress For each component identify: Responsibility (Safety Requirements and constraints Inadequate control actions Context in which decisions made Mental model flaws Exxon Tanker Captain And Crew Tanker Coast Guard

79 Applying STAMP To understand accidents, need to examine safety control structure itself to determine why inadequate to maintain safety constraints and why events occurred To prevent accidents, need to create an effective safety control structure to enforce the system safety constraints Not a blame model but a why model

80 Modeling Accidents Using STAMP Three types of models are used: 1. Static safety control structure 2. Dynamic safety control structure Shows how control structure changed over time 3. Behavioral dynamics Dynamic processes behind changes, i.e., why the system changed over time

81 Simplified System Dynamics Model of Columbia Accident

82 STAMP vs. Traditional Accident Models Examines inter-relationships rather than linear cause-effect chains. Looks at the processes behind the events Includes entire socio-technical system Includes behavioral dynamics (changes over time) Want to not just react to accidents and impose controls for a while, but understand why controls drift toward ineffectiveness over time and Change those factors if possible Detect the drift before accidents occur

83 Uses for STAMP Basis for new, more powerful hazard analysis techniques (STPA) Inform early architectural trade studies Identify and prioritize hazards and risks Identify system and component safety requirements and constraints (to be used in design) Perform hazard analyses on physical and social systems Safety-driven design (physical, operational, organizational) More comprehensive accident/incident investigation and root cause analysis

84 Uses for STAMP (2) Organizational and cultural risk analysis Identifying physical and project risks Defining safety metrics and performance audits Designing and evaluating potential policy and structural improvements Identifying leading indicators of increasing risk ( canary in the coal mine ) New holistic approaches to security

85 Does it Work? Is it Practical? MDA risk assessment of inadvertent launch (technical) Architectural trade studies for the space exploration initiative (technical) Safety driven design of a NASA JPL spacecraft (technical) NASA Space Shuttle Operations (risk analysis of a new management structure) NASA Exploration Systems (risk management tradeoffs among safety, budget, schedule, performance in development of replacement for Shuttle)

86 Does it Work? Is it Practical? (2) Accident analysis (spacecraft losses, bacterial contamination of water supply, aircraft collision, oil refinery explosion, train accident, etc.) Pharmaceutical safety Hospital safety (risks of outpatient surgery at Beth Israel MC) Corporate fraud (are controls adequate?, Sarbanes- Oxley) Food safety Train safety (Japan)

87 Accident/Incident Investigation and Causal Analysis

88 Using STAMP in Root Cause Analysis Identify system hazard violated and the system safety design constraints Construct the safety control structure as it was designed to work Component responsibilities (requirements) Control actions and feedback loops For each component, determine if it fulfilled its responsibilities or provided inadequate control. If inadequate control, why? (including changes over time) Determine the changes that could eliminate the inadequate control (lack of enforcement of system safety constraints) in the future.

89 Components surrounding Controller in Zurich

90 Links degraded due to poor and unsafe practices

91 Links lost due to sectorization work

92 Links lost due to unusual situations

93 Safety Requirements and Constraints Must follow TCAS mandate Tupulov Crew Context in which decisions were made Flying over Western Europe ( TCAS is mandatory) TU Crew doesn t have radio communication with Boeing Crew Flight Crew has no simulator experience with TCAS Flight Training is unclear on what to do in case of conflict between ATC/TCAS Flying at night Inadequate Decisions and Control Actions Reliance on optical contact Ignores minority report from spare member of the crew Follows controller instructions rather than TCAS Mental Model Flaws Optical illusion of distance Belief that ATC is aware of everything that is happening Belief that pilot, not TCAS has the last said in the evasion action Understanding of TCAS as a backup system rather than a final resort

94 Zurich ATC Operations Safety Requirements and Constraints Maintain safe separation between planes in airspace Context in which decisions were made Phone system prevented communication from other ATCs Inadequate radar coverage Insufficient personnel (only one controller) Unaware of TCAS and/or impact of TCAS during a RA Etc. Inadequate Decision and Control Actions Failure to communicate with DHL plane Failure to adequately monitor situation Mental Model Flaws Unaware of conflicting TCAS procedures between Russian and European pilots Etc.

95 Regulatory Agencies (FAA, CAA, Eurocontrol) (No significant influence on accident according to the report) Safety Requirements (Responsibilities) Clearly articulate procedures for compliance with TCAS RA s. Clearly articulate right of way rules in airspace. Define the role of air traffic controllers and pilots in resolving conflicts in the presence of TCAS. Flawed Control Actions AIP Germany regulations not up to date for current version of TCAS. Procedural instruction for the actions to be taken by the pilots (from AIP Germany) in case of an RA not worded clearly enough. LuftVO Air Traffic Order Pilots are granted a freedom of decision which is not compatible with the system philosophy of TCAS II, Version 7 use of term recommendation is inadequate. Reasons for Flawed Control Actions, Dysfunctional Interactions Overlapping control authority by several nations & organizations. Asynchronous evolution between regulatory guidance documents and adopted technology.

96 Björn Koberstein Uberlingen, Operators: STAMP s Static Control Structure (Action Feedback) Filnamn: Uberlingen enl STAMP (C) FMV 2007 utg 9, Radar CoC Missing feedback ATCoperators EuroControl Flight operators ICAO Aircraft authorities (incl FAA) Missing feedback Aircraft manufacturer TCAS manufacturer ICAO (a UN agency) -Cooperative aviation regulation -Standards and recommendations.. (including rules of the air,. responsibilities of ATCOs & pilots,. conflicts btw ATCO & TCAS) EUROCONTROL (European org) -Standardization and guidance -Certification, education, information JAA (Joint Aviation Authorities) -Guidance and training (TCAS-RA s. precedence over ATCO instructions) TCAS manufacturer -TCAS 2000 Pilots Guide. (including TCAS-ATC conflicts) Flight operators -TCAS training (B /TU154M) * (obey RA unless/after conflict contact) Missing feedback ATC Station (ATC Zurich) Radar display Pilots No direct feedback Missing feedback Aircraft TCAS Missing feedback -TU154M Flight Operations (ATC has. precedence over TCAS) Swiss Air Navigation Services. (ATC Zürich management) -Air traffic control in Swiss airspace + in..delegated airspaces of adjoining states -ATC Safety policy (not fully implemented,. also SMOP in practice not prevented) CoC -safety, quality and risk management -the implementation of the safety & risk. management procedures were delayed. due to their in-house development -unaware of the sectorisation work ATC operator -overloaded -insufficiently trained -unofficial practice deviating from the. regular roster (night shift SMOP) * BFU Investigation Report pp 62, 65

97 STPA A new hazard analysis technique based on the STAMP model of accident causation

98 STAMP-Based Hazard Analysis (STPA) Supports a safety-driven design process where Hazard analysis influences and shapes early design decisions Hazard analysis iterated and refined as design evolves Goals (same as any hazard analysis) Identification of system hazards and related safety constraints necessary to ensure acceptable risk Accumulation of information about how hazards can be violated, which is used to eliminate, reduce and control hazards in system design, development, manufacturing, and operations

99 Safety-Driven Design Define initial control structure, refining system safety constraints and design in parallel. Identify potentially hazardous control actions by each of system components that would violate system design constraints. Restate as component safety design requirements and constraints. Perform hazard analysis using STPA to identify how safetyrelated requirements and constraints could be violated (the potential causes of inadequate control and enforcement of safety-related constraints). Augment the basic design to eliminate, mitigate, or control potential unsafe control actions and behaviors. Iterate over the process, i.e. perform STPA on the new augmented design and continue to refine the design until all hazardous scenarios are eliminate, mitigated, or controlled. Document design rationale and trace requirements and constraints to the related design decisions.

100 Step 1: Identify hazards and translate into highlevel requirements and constraints on behavior TCAS Hazards: 1. A near mid-air collision (NMAC): Two controlled aircraft violate minimum separation standards) 2. A controlled maneuver into ground 3. Loss of control of aircraft 4. Interference with other safety-related aircraft systems 5. Interference with the ground-based ATC system 6. Interference with ATC safety-related advisory System Safety Design Constraints: TCAS must not cause or contribute to an NMAC TCAS must not cause or contribute to a controlled maneuver into the ground

101 Step 2: Define basic control structure

102 TCAS: Component Responsibilities Receive and update information about its own and other aircraft Analyze information received and provide pilot with Pilot Information about where other aircraft in the vicinity are located An escape maneuver to avoid potential NMAC threats Maintain separation between own and other aircraft using visual scanning Monitor TCAS displays and implement TCAS escape maneuvers Follow ATC advisories Air Traffic Controller Maintain separation between aircraft in controlled airspace by providing advisories (control action) for pilot to follow

103 Aircraft components (e.g., transponders, antennas) Execute control maneuvers Receive and send messages to/from aircraft Etc. Airline Operations Management Provide procedures for using TCAS and following TCAS advisories Train pilots Audit pilot performance Air Traffic Control Operations Management Provide procedures Train controllers, Audit performance of controllers Audit performance of overall collision avoidance system

104 Step 3a: Identify potential inadequate control actions that could lead to a hazardous state. In general: 1. A required control action is not provided or not followed 2. An incorrect or unsafe control action is provided 3. A potentially correct or inadequate control action is provided too late or too early (at the wrong time) 4. A correct control action is stopped too soon.

105 For the NMAC hazard: TCAS: Pilot: 1. The aircraft are on a near collision course and TCAS does not provide an RA 2. The aircraft are in close proximity and TCAS provides an RA that degrades vertical separation. 3. The aircraft are on a near collision course and TCAS provides an RA too late to avoid an NMAC 4. TCAS removes an RA too soon 1. The pilot does not follow the resolution advisory provided by TCAS (does not respond to the RA) 2. The pilot incorrectly executes the TCAS resolution advisory. 3. The pilot applies the RA but too late to avoid the NMAC 4. The pilot stops the RA maneuver too soon.

106 Step 3b: Use identified inadequate control actions to refine system safety design constraints When two aircraft are on a collision course, TCAS must always provide an RA to avoid the collision TCAS must not provide RAs that degrades vertical separation The pilot must always follow the RA provided by TCAS

107 Step 4: Determine how potentially hazardous control actions could occur (scenarios of how constraints can be violated). Eliminate from design or control in design or operations. Step4a: Augment control structure with process models for each control component. Step4b: For each of inadequate control actions, examine parts of control loop to see if could cause it. Guided by a set of generic control flaws Step 4c: Design controls and mitigation measures Step4d: Consider how designed controls could degrade over time.

108

109

110 Generic Control Loop Flaws 1. Inadequate Enforcement of Constraints (inadequate Control Actions) - Design of control algorithm (process) does not enforce constraints Flaws in creation process Process changes without appropriate change in control algorithm (asynchronous evolution) Incorrect modification or adaptation - Inadequate coordination among controllers and decision makers

111 - Process models inconsistent, incomplete, or incorrect Flaws in creation process Flaws in updating (inadequate or missing feedback) Not provided in system design Communication flaw Time lag Inadequate sensor operation (incorrect or no information provided) Time lags and measurement inaccuracies not accounted for Expected process inputs are wrong or missing Expected control inputs are wrong or missing Disturbance model is wrong Amplitude, frequency, or period is out of range Unidentified disturbance

112 2. Inadequate Execution of Control Actions Communication flaw Inadequate actuator operation Time lag

113 Comparison with Traditional HA Techniques Top-down (vs bottom-up like FMECA) Considers more than just component failure and failure events (includes these but more general) Guidance in doing analysis (vs. FTA) Handles dysfunctional interactions and system accidents, software, management, etc.

114 Comparisons (2) Concrete model (not just in head) Not physical structure (HAZOP) but control (functional) structure General model of inadequate control (based on control theory) HAZOP guidewords based on model of accidents being caused by deviations in system variables Includes HAZOP model but more general Compared with TCAS II Fault Tree (MITRE) STPA results more comprehensive Included Ueberlingen accident

115 Thermal Tile Robot Example 1. Identify high-level functional requirements and environmental constraints. e.g. size of physical space, crowded area 2. Identify high-level hazards a. Violation of minimum separation between mobile base and objects (including orbiter and humans) b. Mobile robot becomes unstable (e.g., could fall over) c. Manipulator arm hits something d. Fire or explosion e. Contact of human with DMES f. Inadequate thermal control (e.g., damaged tiles not detected, DMES not applied correctly) g. Damage to robot

116 3. Try to eliminate hazards from system conceptual design. If not possible, then identify controls and new design constraints. For unstable base hazard System Safety Constraint: Mobile base must not be capable of falling over under worst case operational conditions

117 First try to eliminate: 1. Make base heavy Could increase damage if hits someone or something. Difficult to move out of way manually in emergency 2. Make base long and wide Eliminates hazard but violates environmental constraints 3. Use lateral stability legs that are deployed when manipulator arm extended but must be retracted when mobile base moves. Two new design constraints: Manipulator arm must move only when stabilizer legs are fully deployed Stabilizer legs must not be retracted until manipulator arm is fully stowed.

118 Define preliminary control structure and refine constraints and design in parallel.

119 Identify potentially hazardous control actions by each of system components 1. A required control action is not provided or not followed 2. An incorrect or unsafe control action is provided 3. A potentially correct or inadequate control action is provided too late or too early (at the wrong time) 4. A correct control action is stopped too soon. Hazardous control of stabilizer legs: Legs not deployed before arm movement enabled Legs retracted when manipulator arm extended Legs retracted after arm movements are enabled or retracted before manipulator arm fully stowed Leg extension stopped before they are fully extended

120 Restate as safety design constraints on components 1. Controller must ensure stabilizer legs are extended whenever arm movement is enabled 2. Controller must not command a retraction of stabilizer legs when manipulator arm extended 3. Controller must not command deployment of stabilizer legs before arm movements are enabled. Controller must not command retraction of legs before manipulator arm fully stowed 4. Controller must not stop leg deployment before they are fully extended

121 Do same for all hazardous commands: e.g., Arm controller must not enable manipulator arm movement before stabilizer legs are completely extended. At this point, may decide to have arm controller and leg controller in same component

122 To produce detailed scenarios for violation of safety constraints, augment control structure with process models Arm Movement Enabled Disabled Unknown Stabilizer Legs Extended Retracted Unknown Manipulator Arm Stowed Extended Unknown How could become inconsistent with real state? e.g. issue command to extend stabilizer legs but external object could block extension or extension motor could fail

123 Problems often in startup or shutdown: e.g., Emergency shutdown while servicing tiles. Stability legs manually retracted to move robot out of way. When restart, assume stabilizer legs still extended and arm movement could be commanded. So use unknown state when starting up Do not need to know all causes, only safety constraints: May decide to turn off arm motors when legs extended or when arm extended. Could use interlock or tell computer to power it off. Must not move when legs extended? Power down wheel motors while legs extended. Coordination problems

124 Some Examples and References to Papers on Them

125 Example: Early System Architecture Trades for Space Exploration Part of an MIT/Draper Labs contract with NASA Wanted to include risk, but little information available Not possible to evaluate likelihood when no design information available Can consider severity by using worst-case analysis associated with specific hazards. Developed three step process: Identify system-level hazards and associated severities Identify mitigation strategies and associated impact Calculate safety/risk metrics for each architecture

126 First identify system hazards and severities Sample

127 Identified Hazards and their Severities Severity ID# Phase Hazard H M Eq G1 General Flamable substance in presence of ignition source (Fire) G2 General Flamable substance in presnece of ignition source in confined space (Explosion) G3 General Loss of life support (includes power, temperature, oxygen, air pressure, CO2, food, water, etc.) G4 General Crew injury or illness G5 General Solar or nuclear radiation exceeding safe levels G6 General Collision (Micrometeroids, debris, with modules during rendevous or separation maneuver, etc.) G7 General Loss of attitude control G8 General Engines do not ignite PL1 Pre-Launch Damage to Payload PL2 Pre-Launch Launch delay (due to weather, pre-launch test failures, etc.) L1 Launch Incorrect propulsion/trajectory/control during ascent L2 Launch Loss of structural integrity (due to aerodynamic loads, vibrations, etc) L3 Launch Incorrect stage separation E1 EVA in Space Lost in space A1 Assembly Incorrect propulsion/control during rendevous A2 Assembly Inability to dock A3 Assembly Inability to achieve airlock during docking A4 Assembly Inability to undock T1 In-Space Transfer Incorrect propulsion/trajectory/control during course change burn D1 Descent Inability to undock D2 Descent Incorrect propulsion/trajectory/control during descent D3 Descent Loss of structural integrity (due to inadequate thermal control, aerodynamic loads, vibrations, etc) A1 Ascent Incorrect stage separation (including ascent module disconnecting from descent stage) A2 Ascent Incorrect propulsion/trajectory/control during ascent A3 Ascent Loss of structural integrity (due to aerodynamic loads, vibrations, etc) S1 Surface Operations Crew members stranded on M surface during EVA S2 Surface Operations Crew members lost on M surface during EVA S3 Surface Operations Equipment damage (including related to lunar dust) NP1 Nuclear Power Nuclear fuel released on earth surface NP2 Nuclear Power Insufficient power generation (reactor doesn't work) NP3 Nuclear Power Insufficient reactor cooling (leading to reactor meltdown) RE1 Re-Entry Inability to undock RE2 Re-Entry Incorrect propulsion/trajectory/control during descent RE3 Re-Entry Loss of structural integrity (due to inadequate thermal control, aerodynamic loads, vibrations, etc) RE4 Re-Entry Inclement weather 4 2 2

128 For example, not performing a rendezvous in transit reduces hazard of being unable to dock

129 Evaluate Each Architecture and Calculate Safety/Risk Metrics Create an architecture vector with all parameters for that architecture (column C of spreadsheet) Compute metric on architecture vector: Calculate a Relative Hazard Mitigation Index Calculate a Relative Severity Index Combine into an Overall Safety/Risk Metric Details in final.pdf

130 Sample Results

131 Ballistic Missile Defense System (BMDS) Non-Advocate Safety Assessment using STPA A layered defense to defeat all ranges of threats in all phases of flight (boost, mid-course, and terminal) Made up of many existing systems (BMDS Element) Early warning radars Aegis Ground-Based Midcourse Defense (GMD) Command and Control Battle Management and Communications (C2BMC) Others MDA used STPA to evaluate the residual safety risk of inadvertent launch prior to deployment and test

132 Safety Control Structure Diagram for FMIS Command Authority Early Warning System Radar Exercise Results Readiness Status Request Status Wargame Results Doctrine Engagement Criteria Training TTP Workarounds Engage Target Operational Mode Change Readiness State Change Weapons Free / Weapons Hold Launch Report Status Report Heartbeat Radar Tasking Readiness Mode Change Status Request Status Track Data Operators Fire Control Operational Mode Readiness State System Status Track Data Weapon and System Status Abort Arm BIT Command Task Load Launch Operating Mode Power Safe Software Updates Launch Position Stow Position Perform BIT Launcher BIT Results Launcher Position Fire DIsable Fire Enable Operational Mode Change Readiness State Change Interceptor Tasking Task Cancellation Command Responses System Status Launch Report Interceptor Simulator Launch Station Abort Acknowledgements BIT Results Health & Status Acknowledgements BIT Results Health & Status Flight Computer Arm BIT Command Task Load Launch Operating Mode Power Safe Software Updates Breakwires Safe & Arm Status Voltages BIT Info Safe & Arm Status Arm Safe Ignite Interceptor H/W 8/2/

133 Results Deployment and testing held up for 6 months because so many scenarios identified for inadvertent launch (the only hazard considered so far). In many of these scenarios: All components were operating exactly as intended Complexity of component interactions led to unanticipated system behavior STPA also identified component failures that could cause inadequate control (most analysis techniques consider only these failure events) As changes are made to the system, the differences are assessed by updating the control structure diagrams and assessment analysis templates. Adopted as primary safety approach for BMDS

134 Safety-driven Design of an Outer Planets Explorer Spacecraft for JPL Demonstration of approach on the design of a deep space exploration mission spacecraft (Europa). Defined mission hazards Generated mission safety requirements and design constraints Created spacecraft control structure and system design Performed STPA and generated component safety requirements and design features to control hazards (complete specifications also available)

135 Organizational and Cultural Risk Analysis

136 Cultural and Organizational Risk Analysis and Performance Monitoring Apply STAMP and STPA at organizational level plus system dynamics modeling and analysis Goals: Evaluating and analyzing risk Designing and validating improvements Monitoring risk ( canary in the coal mine ) Identifying leading indicators of increasing or unacceptable risk

137 System Dynamics Created at MIT in 1950s by Forrester Used a lot in Sloan School (management) Grounded in non-linear dynamics and feedback control Also draws on Cognitive and social psychology Organization theory Economics Other social sciences Use to understand changes over time (dynamics of a system

138

139 People who don't know rate of sharing the news People who know + + Probability of Contact with those in the know + Contacts between people who know and people who don't people Time (Month) People who know People who don't know Rate of sharing the news

140

141

142

143

144

145 Risk Analysis Process for Independent Technical Authority 1. Preliminary Hazard Analysis? System hazards? System safety requirements and constraints 2. Modeling the ITA Safety Control Structure? Roles and responsibilities? Feedback mechanisms 3. Mapping Requirements to Responsibilities 4. Detailed Hazard Analysis using STPA? Gap analysis? System risks (inadequate controls) 5. Categorizing & Analyzing Risks? Immediate and longer term risks 6. System Dynamics Modeling and Analysis? Sensitivity? Leading indicators? Risk Factors 7. Findings and Recommendations? Policy? Structure? Leading indicators and measures of effectiveness

146 1. Preliminary Hazard Analysis System Hazard: Poor engineering and management decisionmaking leading to an accident (loss). System Safety Requirements and Constraints: 1. Safety considerations must be first and foremost in technical decision-making. 2. Safety-related technical decision-making must be done by eminently qualified experts with broad participation of the full workforce. 3. Safety analyses must be available and used starting in the early acquisition, requirements development, and design processes and continuing through the system lifecycle. 4. The Agency must provide avenues for full expression of technical conscience and a process for full and adequate resolution of technical conflicts as well as conflicts between programmatic and technical concerns.

147 Each of these was refined, e.g., 1. Safety considerations must be first and foremost in technical decision-making. a. State-of-the art safety standards and requirements for NASA missions must be established, implemented, enforced, and maintained that protect the astronauts, the workforce, and the public. b. Safety-related technical decision-making must be independent from programmatic considerations, including cost and schedule c. Safety-related decision-making must be based on correct, complete, and up-to-date information. d. Overall (final) decision-making must include transparent consideration of both safety and programmatic concerns. e. The Agency must provide for effective assessment and improvement in safety-related decision-making. To create a set of system safety requirements and constraints sufficient to eliminate or mitigate the hazard

148 2. Model the ITA Control Structure

149 For each component specified: Inputs, outputs Overall role and detailed responsibilities (requirements) Potential inadequate control actions Feedback requirements For most added: Environmental and behavior-shaping factors (context) Mental model requirements Controls

150 Example from System Technical Warrant Holder 1. Establish and maintain technical policy, technical standards, requirements, and processes for a particular system or systems. a. STWH shall ensure program identifies and imposes appropriate technical requirements at program/project formulation to ensure safe and reliable operations. b. STWH shall ensure inclusion of the consideration of risk, failure, and hazards in technical requirements. c. STWH shall approve the set of technical requirements and any changes to them d. STWH shall approve verification plans for the system(s)

151 3. Map System Requirements to Component Responsibilities Took each of system safety requirements and traced to component responsibilities (requirements) Identified omissions, conflicts, potential issues Recommended additions and changes Added responsibilities when missing in order for risk analysis to be complete.

152 4. Hazard Analysis using STPA General types of risks for ITA: 1. Unsafe decisions are made by or approved by ITA 2. Safe decisions are disallowed (overly conservative decisionmaking that undermines the goals of NASA and long-term support for ITA) 3. Decision-making takes too long, minimizing impact and also reducing support for ITA 4. Good decisions are made by ITA, but do not have adequate impact on system design, construction, and operation Applied to each of component responsibilities Identified basic and coordination risks

153 Example from Risks List CE Responsibility: Develop, monitor, and maintain technical standards and policy Risks: 1. General technical and safety standards and requirements are not created (IC) 2. Inadequate standards and requirements are created (IC) 3. Standards degrade as changed over time due to external pressures to weaken them. Process for approving changes is flawed (LT). 4. Standards not changed or updated over time as the environment changes (LT).

154 5. Categorize and Analyze Risks Large number resulted so: Categorized risks as Immediate concern Longer-term concern Standard Process Used system dynamics models to identify which risks were most important to assess and measure Provide most important assessment of current level of risk Most likely to detect increasing risk early enough to prevent significant losses (leading indicators)

155 Launch Rate System Safety Resource Allocation Perceived Success by Administration ITA System Safety Knowledge, Skills & Staffing Shuttle Aging and Maintenance Risk Incident Learning & Corrective Action System Safety Efforts & Efficacy

156 6. System Dynamics Modeling Modified our NASA manned space program model to include Independent Technical Authority (ITA) Independently tested and validated the nine models, then connected them Ran analyses: Sensitivity analyses to investigate impact of various parameters on system dynamics and risk System behavior mode investigation Metrics evaluations Additional scenarios and insights

157 Example Result ITA has potential to significantly reduce risk and to sustain an acceptable risk level But also found significant risk of unsuccessful implementation of ITA that needs to be monitored 200-run Monte-Carlo sensitivity analysis Random variations of +/- 30% of baseline exogenous parameter values

158 1 Successful vs. Unsuccessful ITA Implementation Indicator of Effectiveness and Credibility of ITA System Technical Risk Time Time

159 Successful Scenarios Self-sustaining for short period of time if conditions in place for early acceptance. Provides foundation for a solid, sustainable ITA program implementation under right conditions. Successful scenarios: After period of high success, effectiveness slowly declines Complacency Safety seen as solved problem Resources allocated to more urgent matters But risk still at acceptable levels and extended period of nearly steady-state equilibrium with risk at low levels

160 Unsuccessful Implementation Scenarios Effectiveness quickly starts to decline and reaches unacceptable levels Limited ability of ITA to have sustained effect on system Hazardous events start to occur, safety increasingly perceived as urgent problem More resources allocated to safety but TA and TWHs have lost so much credibility they cannot effectively contribute to risk mitigation anymore. Risk increases dramatically ITA and safety staff overwhelmed with safety problems Start to approve an increasing number of waivers so can continue to fly.

161 Unsuccessful Scenario Factors As effectiveness of ITA decreases, number of problems increase Investigation requirements increase Corners may be cut to compensate Results in lower-quality investigation resolutions and corrective actions TWHs and Trusted Agents become saturated and cannot attend to each investigation in timely manner Bottleneck created by requiring TWHs to authorize all safetyrelated decisions, making things worse Want to detect this reinforcing loop while interventions still possible and not overly costly (resources, downtime)

162 Identification of Lagging vs. Leading Indicators Number of waivers issued good indicator but lags rapid increase in risk System Technical Risk Outstanding Accumulated Waivers Time Risk Units Incidents Incidents under investigation is a better leading indicator System Technical Risk Incidents Under Investigation Time Risk Units Incidents

163 Modeling Exploration Enterprise (ESMD) Built a large STAMP plus systems dynamics model of the Project Constellation Development-oriented vs. operations oriented Space Shuttle model Demonstrating how it can be used for risk management decision-making

164 Risk Management in NASA s New Exploration Systems Mission Directorate Created an executable model, using input from the NASA workforce, to analyze relative effects of management strategies on schedule, cost, safety and performance Developed scenarios to analyze risks identified by the Agency s workforce Performed preliminary analysis on the effects of hiring constraints, management reserves, independence of safety decision-making, requirements changes, etc. Derived preliminary recommendations to mitigate and monitor program-level risks

165 Structure of System Dynamics Model Congress and White House Decision -Making OSMA NASA Administration and ESMD Decision -Making OCE NESC Exploration Systems Program /Project Management Task Completion and Schedule Pressure Resource Allocation Safety and Mission Assurance SMA Status, Efficacy, Knowledge and Skills Exploration Systems Engineering Management Technical Personnel Resources and Experience Efforts and Efficacy of Other Technical Personnel System Development and Safety Analysis Completion Engineering Procurement

166 Engineering - System Development Completion and Safety Analyses Technology Development Task Allocation Rate (from P/P Management) Capacity for PerformingTechnology Development Work 0 System Performance Design Schedule Pressure from Management Capacity for Performing System Design Work 0 System Design Overwork Progress Report to Management Abandoned Technologies Technology Abandonment Rate Pending Technology Development Tasks Incoming Technology Technology Development Development Tasks Task Completion Rate Completed Technology Development Tasks Technology Utilization Rate Technologies used in Design Additional Incoming Work from Changes (from P/P Management) Technology Available to be used in Design Desired Design Task Completion Rate Design Task Allocation Rate (from P/P Management) Additional Incoming Design Work Incoming Program Design Work Design Work Remaining Design Task Completion Rate Design Work Completed Total Design Work Completion Rate Apparent Work Completed Integrated Product Design Work Completion Rate with Safety and Integration Flaws Design Work Completed with Undiscovered Safety and Integration Problems Rework Cycle Fraction of Design Tasks Completed with Safety and Integration Flaws Fraction of Design Tasks with Associated Hazard Analysis Safety of Operational System Ability to Perform Contractor Safety Oversight 2 Fraction of HAs Too Late to Influence Design Flaw Discovery Rate Incentives to Report Flaws Safety Assurance Resources Time to Discover Flaws Unplanned Rework Decision Rate Work Discovered with Safety and Integration Problems Acceptance Rate Desired Safety Analysis Completion Rate Maximum System Safety Analysis Completion Rate Quality of Safety Analyses 0 Efficacy of Safety Assurance (SMA) Efficacy of System Integration Design Work with Accepted Problems or Unsatisfied Requirements Average Hazard Analysis Quality Average Quality of Hazard Analyses used in Design Incoming Hazard Analysis Tasks Pending Hazard Analyses Completed Hazard Analyses HA Completion Rate HA Utilization Rate Hazard Analyses used in Design HA Discard Rate Additional Operations Cost for Safety and Integration Workaround Safety Rework Hazard Analyses unused in Design Decisions

A System s Approach to Safety. Prof. Nancy Leveson Aeronautics and Astronautics MIT

A System s Approach to Safety. Prof. Nancy Leveson Aeronautics and Astronautics MIT A System s Approach to Safety Prof. Nancy Leveson Aeronautics and Astronautics MIT The Problem Why do we need a new approach? New causes of accidents in complex, softwareintensive systems Software does

More information

Assuring Safety of NextGen Procedures

Assuring Safety of NextGen Procedures Assuring Safety of NextGen Procedures Prof. Nancy Leveson Cody H. Fleming M. Seth Placke 1 Outline Motivation Propose Accident Model Hazard Analysis Technique Current and Future Work 2 Motivation Air Traffic

More information

STPA: A New Hazard Analysis Technique. Presented by Sanghyun Yoon

STPA: A New Hazard Analysis Technique. Presented by Sanghyun Yoon STPA: A New Hazard Analysis Technique Presented by Sanghyun Yoon Introduction Hazard analysis can be described as investigating an accident before it occurs. Potential causes of accidents can be eliminated

More information

Incorporating Safety in Early System Architecture Trade Studies

Incorporating Safety in Early System Architecture Trade Studies Incorporating Safety in Early System Architecture Trade Studies Nicolas Dulac, Nancy Leveson, Ph.D.; Massachusetts Institute of Technology, Cambridge, Massachusetts, USA Keywords: Design for Safety, Hazard

More information

Software Safety Testing Based on STPA

Software Safety Testing Based on STPA Available online at www.sciencedirect.com ScienceDirect Procedia Engineering 80 (2014 ) 399 406 3 rd International Symposium on Aircraft Airworthiness, ISAA 2013 Software Safety Testing Based on STPA Changyong

More information

The Path to More Cost-Effective System Safety

The Path to More Cost-Effective System Safety The Path to More Cost-Effective System Safety Nancy Leveson Aeronautics and Astronautics Dept. MIT Changes in the Last 50 Years New causes of accidents created by use of software Role of humans in systems

More information

Using System Theoretic Process Analysis (STPA) for a Safety Trade Study

Using System Theoretic Process Analysis (STPA) for a Safety Trade Study Using System Theoretic Process Analysis (STPA) for a Safety Trade Study David Horney MIT/U.S. Air Force Distribution Statement A: Approved for public release; distribution unlimited Safety-Guided Design

More information

STAMP Experienced Users Tutorial. John Thomas Blandine Antoine Cody Fleming Melissa Spencer Qi Hommes Tak Ishimatsu John Helferich

STAMP Experienced Users Tutorial. John Thomas Blandine Antoine Cody Fleming Melissa Spencer Qi Hommes Tak Ishimatsu John Helferich STAMP Experienced Users Tutorial John Thomas Blandine Antoine Cody Fleming Melissa Spencer Qi Hommes Tak Ishimatsu John Helferich Systems approach to safety engineering (STAMP) STAMP Model (Leveson, 2003);

More information

A Systems Approach to Risk Management Through Leading Indicators

A Systems Approach to Risk Management Through Leading Indicators A Systems Approach to Risk Management Through Leading Indicators Nancy Leveson MIT Goal To identify potential for an accident before it occurs Underlying assumption: Major accidents not due to a unique

More information

DEVELOPING SAFETY-CRITICAL SOFTWARE REQUIREMENTS FOR COMMERCIAL REUSABLE LAUNCH VEHICLES

DEVELOPING SAFETY-CRITICAL SOFTWARE REQUIREMENTS FOR COMMERCIAL REUSABLE LAUNCH VEHICLES DEVELOPING SAFETY-CRITICAL SOFTWARE REQUIREMENTS FOR COMMERCIAL REUSABLE LAUNCH VEHICLES Daniel P. Murray (1) and Terry L. Hardy (2) (1) Federal Aviation Administration, Office of Commercial Space Transportation,

More information

Engineering a Safer and More Secure World

Engineering a Safer and More Secure World Seminar Series Engineering a Safer and More Secure World Nancy Leveson MIT You ve carefully thought out all the angles You ve done it a thousand times It comes naturally to you You know what you re doing,

More information

CORE TOPICS Core topic 3: Identifying human failures. Introduction

CORE TOPICS Core topic 3: Identifying human failures. Introduction CORE TOPICS Core topic 3: Identifying human failures Introduction Human failures are often recognised as being a contributor to incidents and accidents, and therefore this section has strong links to the

More information

STAMP Applied to Workplace Safety

STAMP Applied to Workplace Safety STAMP Applied to Workplace Safety Emily Howard, Ph.D., Senior Technical Fellow Lori Smith, EHS Deputy Chief Engineer March 21, 2016 The Team Dr. Emily Howard, Human Factors Engineering, Boeing Senior Technical

More information

Intelligent-Controller Extensions to STPA. Dan Mirf Montes

Intelligent-Controller Extensions to STPA. Dan Mirf Montes Intelligent-Controller Extensions to STPA Dan Mirf Montes Disclaimer The views expressed in this document are those of the author and do not reflect the official position or policies of the United States

More information

To understand the importance of defining a mission or project s scope.

To understand the importance of defining a mission or project s scope. Scoping & CONOPS 1 Agenda To understand the importance of defining a mission or project s scope. To explain the contents of scope, including needs, goals, objectives, assumptions, authority and responsibility,

More information

A systems-theoretic approach

A systems-theoretic approach A systems-theoretic approach to analyze humanautomation interactions Dr. John Thomas MIT Human Factors in Control April 2018 Halden, Norway Outline Safety Engineering Modern engineering challenges Modern

More information

Engineering Safety-Critical Systems in the 21 st Century

Engineering Safety-Critical Systems in the 21 st Century Engineering Safety-Critical Systems in the 21 st Century Tom Ferrell, Principal FAA Consulting, Inc. Slide 1 A Loose Outline What is a Safety? Safety Levels Architecture and Process Organizational Focus

More information

Lothar Winzer Head of Software Product Assurance Section ESA/ESTEC Product Assurance and Safety Department. Apr-17-09

Lothar Winzer Head of Software Product Assurance Section ESA/ESTEC Product Assurance and Safety Department. Apr-17-09 Lothar Winzer Head of Software Product Assurance Section ESA/ESTEC Product Assurance and Safety Department Apr-17-09 List of past new challenges or new promises Which have been overcome Which are work

More information

Scoping & Concept of Operations (ConOps) Module

Scoping & Concept of Operations (ConOps) Module Scoping & Concept of Operations (ConOps) Module Space Systems Engineering, version 1.0 Space Systems Engineering: Scoping & ConOps Module Module Purpose: Scoping & ConOps To understand the importance of

More information

Risk. Risk Categories. Project Risk (aka Development Risk) Technical Risks. Business Risk. Example: Project Risk. Lecture 5, Part 1: Risk

Risk. Risk Categories. Project Risk (aka Development Risk) Technical Risks. Business Risk. Example: Project Risk. Lecture 5, Part 1: Risk Risk Lecture 5, Part 1: Risk Jennifer Campbell CSC340 - Winter 2007 The possibility of suffering loss Risk involves uncertainty and loss: Uncertainty: The degree of certainty about whether the risk will

More information

A STAMP-based approach for designing maritime safety management systems

A STAMP-based approach for designing maritime safety management systems A STAMP-based approach for designing maritime safety management systems Osiris A. Valdez Banda and Floris Goerlandt Research Group on Maritime Risk and Safety, Marine Technology, Aalto University, Espoo,

More information

Risk analysis of serious air traffic incident based on STAMP HFACS and fuzzy sets

Risk analysis of serious air traffic incident based on STAMP HFACS and fuzzy sets Risk analysis of serious air traffic incident based on STAMP HFACS and fuzzy sets Michał LOWER, Jan MAGOTT Wroclaw University of Technology Jacek SKORUPSKI Warsaw University of Technology Plan 1. The goal

More information

Not all or nothing, not all the same: classifying automation in practice

Not all or nothing, not all the same: classifying automation in practice Not all or nothing, not all the same: classifying automation in practice by Dr Luca Save Different Levels of Automation Since the seminal work of Sheridan & Verplanck 39 it has become apparent that automation

More information

Scoping & Concept of Operations (ConOps) Module Exploration Systems Engineering, version 1.0

Scoping & Concept of Operations (ConOps) Module Exploration Systems Engineering, version 1.0 Scoping & Concept of Operations (ConOps) Module Exploration Systems Engineering, version 1.0 Exploration Systems Engineering: Scoping & ConOps Module Module Purpose: Scoping & ConOps To understand the

More information

Introduction to Software Testing

Introduction to Software Testing Introduction to Software Testing Introduction Chapter 1 introduces software testing by : describing the activities of a test engineer defining a number of key terms explaining the central notion of test

More information

Contextual note SESAR Solution description form for deployment planning

Contextual note SESAR Solution description form for deployment planning Purpose: Release 5 SESAR Solution #101 Contextual note SESAR Solution description form for deployment planning This contextual note introduces a SESAR Solution (for which maturity has been assessed as

More information

ISO : Rustam Rakhimov (DMS Lab)

ISO : Rustam Rakhimov (DMS Lab) ISO 26262 : 2011 Rustam Rakhimov (DMS Lab) Introduction Adaptation of IEC 61508 to road vehicles Influenced by ISO 16949 Quality Management System The first comprehensive standard that addresses safety

More information

STATE NUCLEAR POWER SAFETY INSPECTORATE Of THE REPUBLIC OF LITHUANIA (VATESI) REGULATIONS

STATE NUCLEAR POWER SAFETY INSPECTORATE Of THE REPUBLIC OF LITHUANIA (VATESI) REGULATIONS 1 Translation from Russian STATE NUCLEAR POWER SAFETY INSPECTORATE Of THE REPUBLIC OF LITHUANIA (VATESI) REGULATIONS GENERAL REQUIREMENTS FOR THE EVENT REPORTING SYSTEM AT A NUCLEAR POWER PLANT VD-E-04-98

More information

Iterative Application of STPA for an Automotive System

Iterative Application of STPA for an Automotive System Iterative Application of STPA for an Automotive System GM Team Joe D Ambrosio Rami Debouk Dave Hartfelder Padma Sundaram Mark Vernacchia Sigrid Wagner MIT Team John Thomas Table of Contents Introduction/Background

More information

Comparison of Hazard Analysis Requirements for Instrumentation and Control System of Nuclear Power Plants

Comparison of Hazard Analysis Requirements for Instrumentation and Control System of Nuclear Power Plants of Hazard Analysis Requirements for Instrumentation and Control System of Nuclear Power Plants Jang Soo Lee and Jun Beom Yoo 2. I&C.HF Division, KAERI, Daejeon, Korea (jslee@kaeri.re.kr) 2. Department

More information

Software Design Decision Vulnerability Analysis

Software Design Decision Vulnerability Analysis Software Vulnerability Analysis P G Avery*, R D Hawkins *Thales UK, UK, email: phil.avery@uk.thalesgroup.com, The University of York, UK, email: richard.hawkins@york.ac.uk Keywords: software, safety, design,

More information

Leadership and the Normalization of Deviance: Step off the Ladder

Leadership and the Normalization of Deviance: Step off the Ladder Leadership and the Normalization of Deviance: Step off the Ladder Organizational Traps Theories-In-Use vs Espoused Theories The Ladder of Inference I will follow up questions to confirm

More information

IAPA Project Final Report Synthesis and guidelines. Implications on ACAS Performances due to ASAS implementation IAPA Project

IAPA Project Final Report Synthesis and guidelines. Implications on ACAS Performances due to ASAS implementation IAPA Project IAPA Project Final Report Synthesis and guidelines Implications on ACAS Performances due to ASAS implementation IAPA Project Drafted by: Béatrice Raynaud & Thierry Arino Authorised by: Thierry Arino on

More information

Transforming Risk Management

Transforming Risk Management Transforming Risk Management 1 Transforming Risk Management Understanding the Challenges of Safety Risk Measurement 2 Transforming Risk Management The challenge of safety risk measurement is threefold:

More information

Lecture 10: Managing Risk. Risk Management

Lecture 10: Managing Risk. Risk Management General ideas about Risk Risk Management Identifying Risks Assessing Risks Case Study: Mars Polar Lander Lecture 10: Managing Risk 2008 Steve Easterbrook. This presentation is available free for non-commercial

More information

JASON 1 : LESSONS LEARNED FROM THE DEVELOPMENT AND 1 YEAR IN ORBIT

JASON 1 : LESSONS LEARNED FROM THE DEVELOPMENT AND 1 YEAR IN ORBIT JASON 1 : LESSONS LEARNED FROM THE DEVELOPMENT AND 1 YEAR IN ORBIT Thierry Lafon JASON 1 Satellite Manager, Centre National d Etudes Spatiales 18 avenue Edouard Belin - 31401 Toulouse cedex 4 France thierry.lafon@cnes.fr

More information

A New Approach to Hazard Analysis for Rotorcraft

A New Approach to Hazard Analysis for Rotorcraft A New Approach to Hazard Analysis for Rotorcraft Blake Abrecht Massachusetts Institute of Technology Engineering Systems Division Master s Student 2 nd Lieutenant, United States Air Force Cambridge, MA,

More information

Boeing Engineering. Overview of

Boeing Engineering. Overview of Boeing Engineering Overview of Systems Theoretical Analysis, Modeling and Processes (STAMP) and Systems Theoretical Process Analysis (STPA) for Product and Production Systems Engineering Marc Nance Director,

More information

Systems-Based Approaches for Effective Problem Solving

Systems-Based Approaches for Effective Problem Solving Systems-Based Approaches for Effective Problem Solving James P. Bagian, MD, PE Director Center for Healthcare Engineering and Patient Safety Professor, Department of Anesthesiology and Engineering University

More information

Challenge H: For an even safer and more secure railway

Challenge H: For an even safer and more secure railway The application of risk based safety analysis has been introduced to the Railway system with the publication of the dedicated standard EN 50 126 in 1999. In the railway sector the application of these

More information

The SMS Table. Kent V. Hollinger. December 29, 2006

The SMS Table. Kent V. Hollinger. December 29, 2006 The SMS Table Kent V. Hollinger December 29, 2006 This presentation introduces the concepts contained in a Safety Management System (SMS) by using the analogy of an SMS being a four-legged glass-top table,

More information

An Application of Causal Analysis to the Software Modification Process

An Application of Causal Analysis to the Software Modification Process SOFTWARE PRACTICE AND EXPERIENCE, VOL. 23(10), 1095 1105 (OCTOBER 1993) An Application of Causal Analysis to the Software Modification Process james s. collofello Computer Science Department, Arizona State

More information

Techniques and benefits of incorporating Safety and Security analysis into a Model Based System Engineering Environment

Techniques and benefits of incorporating Safety and Security analysis into a Model Based System Engineering Environment Techniques and benefits of incorporating Safety and Security analysis into a Model Based System Engineering Environment Gavin Arthurs P.E Solution Architect Systems Engineering IBM Software, Rational Common

More information

Designing to Reduce Human Error

Designing to Reduce Human Error Designing to Reduce Human Error 1 Advantages of Humans Human operators are adaptable and flexible Able to adapt both goals and means to achieve them Able to use problem solving and creativity to cope with

More information

INTEGRATED modular avionics, or IMA, is a shared set of flexible, reusable, and interoperable hardware and software resources that, when

INTEGRATED modular avionics, or IMA, is a shared set of flexible, reusable, and interoperable hardware and software resources that, when JOURNAL OF AEROSPACE INFORMATION SYSTEMS Vol. 11, No. 6, June 2014 Improving Hazard Analysis and Certification of Integrated Modular Avionics Cody Harrison Fleming and Nancy G. Leveson Massachusetts Institute

More information

Introduction to Software Engineering

Introduction to Software Engineering CHAPTER 1 Introduction to Software Engineering Structure 1.1 Introduction Objectives 1.2 Basics of Software Engineering 1.3 Principles of Software Engineering 1.4 Software Characteristics 1.5 Software

More information

Designed-in Logic to Ensure Safety of Integration and Field Engineering of Large Scale CBTC Systems

Designed-in Logic to Ensure Safety of Integration and Field Engineering of Large Scale CBTC Systems Designed-in Logic to Ensure Safety of Integration and Field Engineering of Large Scale CBTC Systems Fenggang Shi, PhD; Thales Canada Transportation Solutions; Toronto, Canada Keywords: safety engineering,

More information

Autonomous Control for Generation IV Nuclear Plants

Autonomous Control for Generation IV Nuclear Plants Autonomous Control for Generation IV Nuclear Plants R. T. Wood E-mail: woodrt@ornl.gov C. Ray Brittain E-mail: brittaincr@ornl.gov Jose March-Leuba E-mail: marchleubaja@ornl.gov James A. Mullens E-mail:

More information

Using STPA in Compliance with ISO26262 for developing a Safe Architecture for Fully Automated Vehicles

Using STPA in Compliance with ISO26262 for developing a Safe Architecture for Fully Automated Vehicles Bitte decken Sie die schraffierte Fläche mit einem Bild ab. Please cover the shaded area with a picture. (24,4 x 11,0 cm) Using STPA in Compliance with ISO26262 for developing a Safe Architecture for Fully

More information

PACIFIC STATES/BRITISH COLUMBIA OIL SPILL TASK FORCE SPILL and INCIDENT DATA COLLECTION PROJECT REPORT July, 1997

PACIFIC STATES/BRITISH COLUMBIA OIL SPILL TASK FORCE SPILL and INCIDENT DATA COLLECTION PROJECT REPORT July, 1997 PACIFIC STATES/BRITISH COLUMBIA OIL SPILL TASK FORCE SPILL and INCIDENT DATA COLLECTION PROJECT REPORT July, 1997 A. Background on the Task Force The Pacific States/British Columbia Oil Spill Task Force

More information

Roadblocks to Approving SIS Equipment by Prior Use. Joseph F. Siebert. exida. Prepared For. ISA EXPO 2006/Texas A&M Instrumentation Symposium

Roadblocks to Approving SIS Equipment by Prior Use. Joseph F. Siebert. exida. Prepared For. ISA EXPO 2006/Texas A&M Instrumentation Symposium Roadblocks to Approving SIS Equipment by Prior Use Joseph F. Siebert exida Prepared For ISA EXPO 2006/Texas A&M Instrumentation Symposium Houston, TX/College Station, TX October 18, 2006/ January 24, 2007

More information

Outline. Concept of Development Stage. Engineering Development Stage. Post Development Stage. Systems Engineering Decision Tool

Outline. Concept of Development Stage. Engineering Development Stage. Post Development Stage. Systems Engineering Decision Tool Outline Course Outline Introduction to the Course Introduction to Systems Engineering Structure of Complex System System Development Process Needs Analysis Concept Exploration Concept Definition Advanced

More information

CLASS/YEAR: II MCA SUB.CODE&NAME: MC7303, SOFTWARE ENGINEERING. 1. Define Software Engineering. Software Engineering: 2. What is a process Framework? Process Framework: UNIT-I 2MARKS QUESTIONS AND ANSWERS

More information

Advisory Circular. Date: DRAFT Initiated by: AIR-110

Advisory Circular. Date: DRAFT Initiated by: AIR-110 U.S. Department of Transportation Federal Aviation Administration Advisory Circular Subject: DETERMINING THE CLASSIFICATION OF A CHANGE TO TYPE DESIGN. Date: DRAFT Initiated by: AIR-110 AC No: 21.93-1

More information

A Conflict Probe to Provide Early Benefits for Airspace Users and Controllers

A Conflict Probe to Provide Early Benefits for Airspace Users and Controllers A Conflict Probe to Provide Early Benefits for Airspace Users and Controllers Alvin L. McFarland Center for Advanced Aviation System Development, The MITRE Corporation, USA 08/97 The MITRE Corporation

More information

Investigating and Analysing Human and Organizational Factors

Investigating and Analysing Human and Organizational Factors Investigating and Analysing Human and Organizational Factors Heather Parker Transport Canada 2006-11-09 1 Outline Meaning of Human Factors Collecting Data Meaning of Human Error Investigating and Analysing

More information

SULAYMANIYAH INTERNATIONAL AIRPORT MATS CHAPTER 13. April 2012

SULAYMANIYAH INTERNATIONAL AIRPORT MATS CHAPTER 13. April 2012 KURDISTAN REGIONAL GOVERNMENT SULAYMANIYAH INTERNATIONAL AIRPORT MATS CHAPTER 13 ATS SAFETY MANAGEMENT International and Local Procedures ( First Edition ) April 2012 Ff Prepared By Fakhir.F. Mohammed

More information

Custom Small UAV Lab. To: Dr. Lewis Ntaimo ISEN

Custom Small UAV Lab. To: Dr. Lewis Ntaimo ISEN Custom Small UAV Lab To: Dr. Lewis Ntaimo ISEN 689-601 From: Gerardo Iglesias, Sugiri Halim, William, Zane Singleton March 20, 2008 I. Statement of Problem Current development and research in the field

More information

RESILIENCE IN RISK ANALYSIS AND RISK ASSESSMENT

RESILIENCE IN RISK ANALYSIS AND RISK ASSESSMENT Chapter 15 RESILIENCE IN RISK ANALYSIS AND RISK ASSESSMENT Stig Johnsen Abstract Resilience is the ability of a system to react to and recover from disturbances with minimal effects on dynamic stability.

More information

DO-178B 김영승 이선아

DO-178B 김영승 이선아 DO-178B 201372235 김영승 201372237 이선아 Introduction Standard Contents SECTION 1 INTRODUCTION SECTION 2 SYSTEM ASPECTS RELATING TO SOFTWARE DEVELOPMENT SECTION 3 SOFTWARE LIFE CYCLE SECTION 4 SOFTWARE PLANNING

More information

Guidance Material. SMS Manual Format - Scalable. Notice to Users

Guidance Material. SMS Manual Format - Scalable. Notice to Users Guidance Material SMS Manual Format - Scalable Notice to Users This document is an advanced version of a draft CAA publication (proposed appendix to draft Advisory Circular AC137-1 Agricultural Aircraft

More information

COMPARISON OF PROCESS HAZARD ANALYSIS (PHA) METHODS

COMPARISON OF PROCESS HAZARD ANALYSIS (PHA) METHODS COMPARISON OF PROCESS HAZARD ANALYSIS (PHA) METHODS by Primatech Inc. The hazard and operability (HAZOP) study is the most commonly used process hazard analysis (PHA) method. However, there are many other

More information

Collaboration For Better Human Performance : Aviation Industry Success Story

Collaboration For Better Human Performance : Aviation Industry Success Story Collaboration For Better Human Performance : Presentation to: NERC 2015 Human Performance Conference Name: Christopher A. Hart Date: March 18, 2015 Aviation Industry Success Story The Contrast - Conventional

More information

A holistic approach to Automation Safety

A holistic approach to Automation Safety A holistic approach to Automation Safety Mark Eitzman - Manager, Safety Business Development How technology, global standards and open systems help increase productivity and overall equipment effectiveness.

More information

Delivering Safety Through Design Using Early Analysis Methods. Mark A. Vernacchia, MSES, PE General Motors Company; Milford, Michigan, USA

Delivering Safety Through Design Using Early Analysis Methods. Mark A. Vernacchia, MSES, PE General Motors Company; Milford, Michigan, USA Delivering Safety Through Design Using Early Analysis Methods Mark A. Vernacchia, MSES, PE General Motors Company; Milford, Michigan, USA Keywords: systems engineering, SEFA, STPA, interactions, safety,

More information

taking automation to the next level Symptomatic 20st century joke in the world of Process Automation The operator is there to feed the dog

taking automation to the next level Symptomatic 20st century joke in the world of Process Automation The operator is there to feed the dog Operator reliability taking automation to the next level Automation and Control for Energy Manchester, 10-11 may 2011 Luc.de-wilde@total.com Symptomatic 20st century joke in the world of Process Automation

More information

Safety Culture in Modern Aviation Systems Civil and Military

Safety Culture in Modern Aviation Systems Civil and Military Safety Culture in Modern Aviation Systems Civil and Military Valentin-Marian IORDACHE 1, Casandra Venera BALAN (PIETREANU)*,1 *Corresponding author 1 POLITEHNICA University of Bucharest, Aerospace Engineering

More information

International Space Safety Foundation ISSF. Vision for Voluntary Space Safety Standards Development. J. Steven Newman D.Sc.

International Space Safety Foundation ISSF. Vision for Voluntary Space Safety Standards Development. J. Steven Newman D.Sc. International Space Safety Foundation ISSF Vision for Voluntary Space Safety Standards Development By J. Steven Newman D.Sc. November 2007 Safety Risk in Space Regardless of the energy, expertise and excitement

More information

Software verification and validation. Introduction

Software verification and validation. Introduction Software verification and validation. Introduction Marius Minea September 27, 2017 Topics be discussed Black-box testing (no source access) Glass-box/white-box testing (with source access) Generating unit

More information

SOFTWARE DEVELOPMENT STANDARD

SOFTWARE DEVELOPMENT STANDARD SFTWARE DEVELPMENT STANDARD Mar. 23, 2016 Japan Aerospace Exploration Agency The official version of this standard is written in Japanese. This English version is issued for convenience of English speakers.

More information

Instructions. AIRBUS A3XX: Developing the World s Largest Commercial Aircraft

Instructions. AIRBUS A3XX: Developing the World s Largest Commercial Aircraft Instructions AIRBUS A3XX: Developing the World s Largest Commercial Aircraft In this case, you will be analyzing the strategic interaction between Airbus and Boeing, the two leading producers of large

More information

PRACTICE NO. PD-ED-1273 PAGE 1 OF 7 QUANTITATIVE RELIABILITY REQUIREMENTS USED AS PERFORMANCE-BASED REQUIREMENTS FOR SPACE SYSTEMS.

PRACTICE NO. PD-ED-1273 PAGE 1 OF 7 QUANTITATIVE RELIABILITY REQUIREMENTS USED AS PERFORMANCE-BASED REQUIREMENTS FOR SPACE SYSTEMS. PAGE 1 OF 7 PREFERRED RELIABILITY PRACTICES PERFORMANCE-BASED REQUIREMENTS FOR SPACE SYSTEMS Practice: Develop performance-based reliability requirements by considering elements of system performance in

More information

Validation, Verification and MER Case Study

Validation, Verification and MER Case Study Validation, Verification and MER Case Study Prof. Chris Johnson, School of Computing Science, University of Glasgow. johnson@dcs.gla.ac.uk http://www.dcs.gla.ac.uk/~johnson Introduction. Definitions and

More information

Lecture 9 Dependability; safety-critical systems

Lecture 9 Dependability; safety-critical systems Lecture 9 Dependability; safety-critical systems Kari Systä 17.3.2014 17.3.2014 TIE-21100/21101; K.Systä 1 Week Lecture Exercise 10.3 Quality in general; Patterns Quality management systems 17.3 Dependable

More information

Agile Projects 7. Agile Project Management 21

Agile Projects 7. Agile Project Management 21 Contents Contents 1 2 3 4 Agile Projects 7 Introduction 8 About the Book 9 The Problems 10 The Agile Manifesto 12 Agile Approach 14 The Benefits 16 Project Components 18 Summary 20 Agile Project Management

More information

Resilience Engineering as an approach to safety for industry and society. Erik Hollnagel, Professor, Ph.D.

Resilience Engineering as an approach to safety for industry and society. Erik Hollnagel, Professor, Ph.D. Resilience Engineering as an approach to safety for industry and society Erik Hollnagel, Professor, Ph.D. www.erikhollnagel.com The meaning of safety From French Sauf = unharmed / except How can it be

More information

OJT INSTRUCTOR. Learning outcomes. Why study this course? Aim. ICAO Code 212

OJT INSTRUCTOR. Learning outcomes. Why study this course? Aim. ICAO Code 212 OJT INSTRUCTOR ICAO Code 212 ATC Operational Training 10 days Air Traffic Controllers The younger generation expects to know how they are achieving and they want to engage and take responsibility for their

More information

A Holistic Approach to Safety Automation

A Holistic Approach to Safety Automation Word count: 2,974 Conference paper: XVIII World Congress on Safety and Health at Work For more information: Andrea Hazard Padilla Speer Beardsley 612-455-1733 Fax 612-455-1060 ahazard@psbpr.com A Holistic

More information

Design Your Safety System for Improved Uptime

Design Your Safety System for Improved Uptime Design Your Safety System for Improved Uptime Chris Brogli - Manager, Safety Business Development Incorporating integrated safety technologies in the design stage can increase machinery availability, reduce

More information

SAFETYWIRE IN THIS EDITION PAGE 1

SAFETYWIRE IN THIS EDITION PAGE 1 SAFETYWIRE YOUR MONTHLY SOURCE FOR SAFETY INFORMATION MAY 2018 VOLUME XVIII ISSUE V IN THIS EDITION PAGE 1 The Risks of Circling Approaches PAGE 3 Managing Change Safely PAGE 8 Safety Manager s Corner

More information

Spaceflight vs. Human Spaceflight

Spaceflight vs. Human Spaceflight Spaceflight vs. Human Spaceflight The Aerospace Corporation 2013 Stephanie Barr The Aerospace Corporation Civil and Commercial Division stephanie.e.barr@aero.org 6 th Conference International Association

More information

Human Factors: Safety and Automation. Beth Lyall, Ph.D. Research Integrations Inc.

Human Factors: Safety and Automation. Beth Lyall, Ph.D. Research Integrations Inc. Human Factors: Safety and Automation Beth Lyall, Ph.D. Research Integrations Inc. Overview Definition and approach Examples of automation and safety improvements Flight deck automation issues Human factors

More information

AIRBORNE SOFTWARE VERIFICATION FRAMEWORK AIMED AT AIRWORTHINESS

AIRBORNE SOFTWARE VERIFICATION FRAMEWORK AIMED AT AIRWORTHINESS 27 TH INTERNATIONAL CONGRESS OF THE AERONAUTICAL SCIENCES AIRBORNE SOFTWARE VERIFICATION FRAMEWORK AIMED AT AIRWORTHINESS Yumei Wu*, Bin Liu* *Beihang University Keywords: software airworthiness, software

More information

Validation, Verification and MER Case Study

Validation, Verification and MER Case Study Validation, Verification and MER Case Study Prof. Chris Johnson, School of Computing Science, University of Glasgow. johnson@dcs.gla.ac.uk http://www.dcs.gla.ac.uk/~johnson Introduction. Definitions and

More information

Functional Safety: ISO26262

Functional Safety: ISO26262 Functional Safety: ISO26262 Seminar Paper Embedded systems group Aniket Kolhapurkar, University of Kaiserslautern, Germany kolhapur@rhrk.uni kl.de September 8, 2015 1 Abstract Functions in car, such as

More information

Aircraft Systems Mechanical, Electrical and Avionics.pdf Chap System Design and Development

Aircraft Systems Mechanical, Electrical and Avionics.pdf Chap System Design and Development UNIVERSITY OF SALENTO SCHOOL OF INDUSTRIAL ENGINEERING DEPT. OF ENGINEERING FOR INNOVATION Lecce-Brindisi (Italy) MASTER OF SCIENCE IN AEROSPACE ENGINEERING PROPULSION AND COMBUSTION Aircraft Systems Mechanical,

More information

EVALUATION OF SAFETY CULTURE IN WANO PRE-STARTUP REVIEWS

EVALUATION OF SAFETY CULTURE IN WANO PRE-STARTUP REVIEWS EVALUATION OF SAFETY CULTURE IN WANO PRE-STARTUP REVIEWS Todd Brumfield World Association of Nuclear Operators (WANO) Atlanta, Georgia, USA ABSTRACT: The requirements for performance of pre-startup reviews

More information

version NDIA CMMI Conf 3.5 SE Tutorial RE - 1

version NDIA CMMI Conf 3.5 SE Tutorial RE - 1 Requirements Engineering SE Tutorial RE - 1 What Are Requirements? Customer s needs, expectations, and measures of effectiveness Items that are necessary, needed, or demanded Implicit or explicit criteria

More information

ATC BASIC. Learning outcomes. Why study this course? Aim. ICAO Code 051

ATC BASIC. Learning outcomes. Why study this course? Aim. ICAO Code 051 ATC BASIC ICAO Code 051 6 weeks Trainee ATCs (Maximum 12 per course) This course forms the prerequisite for all other Air Traffic Control Courses. trainee will have demonstrated competency with regards

More information

Integrating Legacy Software: Lessons and Hurdles

Integrating Legacy Software: Lessons and Hurdles Integrating Legacy Software: Lessons and Hurdles John Chobany, Associate Director Vehicle Concepts Department Architecture & Design Subdivision Systems Engineering Division The Aerospace Corporation 2

More information

Preventing Fatal & Life Changing Injury Events Frank Baker, CSP, CFPS, ALCM

Preventing Fatal & Life Changing Injury Events Frank Baker, CSP, CFPS, ALCM Preventing Fatal & Life Changing Injury Events Frank Baker, CSP, CFPS, ALCM Regional Manager, Risk Management Greg Clone, ASP Sr. Risk Management Consultant Page 1 N3L3 Is about changing the way we think

More information

Just Culture. Leading Through Shared Values and Expectations

Just Culture. Leading Through Shared Values and Expectations Just Culture Leading Through Shared Values and Expectations Objectives Understand the concepts of Just Culture Identify three predictable behaviors Understand a Just Culture investigation Describe the

More information

Introduction to software testing and quality process

Introduction to software testing and quality process Introduction to software testing and quality process Automated testing and verification J.P. Galeotti - Alessandra Gorla Engineering processes Engineering disciplines pair construction activities activities

More information

On various testing topics: Integration, large systems, shifting to left, current test ideas, DevOps

On various testing topics: Integration, large systems, shifting to left, current test ideas, DevOps On various testing topics: Integration, large systems, shifting to left, current test ideas, DevOps Matti Vuori www.mattivuori.net matti.vuori@mattivuori.net @Matti_Vuori TUT lecture series of SW Technologies:

More information

Use of PSA to Support the Safety Management of Nuclear Power Plants

Use of PSA to Support the Safety Management of Nuclear Power Plants S ON IMPLEMENTATION OF THE LEGAL REQUIREMENTS Use of PSA to Support the Safety Management of Nuclear Power Plants РР - 6/2010 ÀÃÅÍÖÈß ÇÀ ßÄÐÅÍÎ ÐÅÃÓËÈÐÀÍÅ BULGARIAN NUCLEAR REGULATORY AGENCY TABLE OF CONTENTS

More information

International System Safety Training Symposium

International System Safety Training Symposium International Safety Training Symposium Functional Hazard Analysis (FHA) Tutorial 5 August 2014 Mr. Adam Scharl NSWCDD, 540-653-7940 adam.scharl@navy.mil Mr. Kevin Stottlar NSWCDD, 540-653-7301 kevin.stottlar@navy.mil

More information

International Railway Safety Council Berlin. Using Event Data in German Railways for a Systemic Perspective of System Design and Optimization

International Railway Safety Council Berlin. Using Event Data in German Railways for a Systemic Perspective of System Design and Optimization International Railway Safety Council 2014 - Berlin Using Event Data in German Railways for a Systemic Perspective of System Design and Optimization O. Sträter & M. Arenius Department of Work and Organisational

More information

04/04/2018. Definition. Computers everywhere. Evolution of Embedded Systems. Art & Entertainment. Typical applications

04/04/2018. Definition. Computers everywhere. Evolution of Embedded Systems. Art & Entertainment. Typical applications Definition Giorgio Buttazzo E-mail: buttazzo@sssup.it Real-Time Systems are computing systems that must perform computation within given timing constraints. Scuola Superiore Sant Anna http://retis.sssup.it/~giorgio/rts-le.html

More information

9. Verification, Validation, Testing

9. Verification, Validation, Testing 9. Verification, Validation, Testing (a) Basic Notions (b) Dynamic testing. (c) Static analysis. (d) Modelling. (e) Environmental Simulation. (f) Test Strategies. (g) Tool support. (h) Independent Verification

More information

Chapter 1: Introduction

Chapter 1: Introduction Using UML, Patterns, and Java Object-Oriented Software Engineering Chapter 1: Introduction What is a computer program? A list of instructions, written in a specific programming language (Java, C, Fortran,

More information