System Reliability Theory: Models and Statistical Method> Marvin Rausand,Arnljot Hoylanc Cowriaht bv John Wilev & Sons. Inc.

Size: px
Start display at page:

Download "System Reliability Theory: Models and Statistical Method> Marvin Rausand,Arnljot Hoylanc Cowriaht bv John Wilev & Sons. Inc."

Transcription

1 System Reliability Theory: Models and Statistical Method> Marvin Rausand,Arnljot Hoylanc Cowriaht bv John Wilev & Sons. Inc Glossary Accelerated test A test in which the applied stress level is chosen to exceed that stated in the reference conditions in order to shorten the time required to observe the stress response of the item, or magnify the responses in a given time. To be valid, an accelerated test shall not alter the basic modes and mechanisms of failure or their relative prevalence (BS 4778). Accident An unintended event or sequence of events that causes death, injury, environmental, or material damage (DEF-STAN 00-56) Active redundancy That redundancy wherein all means for performing a required function are intended to operate simultaneously [IEC 50( 191)]. Availability The ability of an item (under combined aspects of its reliability, maintainability, and maintenance support) to perform its required function at a stated instant of time or over a stated period of time (BS 4778). Basic event The bottom or "leaf" events of a fault tree. The limit of resolution of the fault tree. Examples of basic events are component failures and human errors (NASA2002). Coherent system A system whose structure function is nondecreasing as a function of all state variables. Note: A system is coherent when: if all the components are in a failed state, the system is in a failed state, if all the components are functioning, the system is functioning, * when the system is in a failed state, no additional component failures will cause the system to function, when the system is functioning, no component repair will cause the system to fail. * 599

2 600 GLOSSARY Common cause failure Multiple component faults that occur at the same time or that occur in a relatively small time window and that are due to a common cause (NASA 2002). Failure, which is the result of one or more events, causing coincident failures of two or more separate channels in a multiple channel system, leading to system failure (IEC61508, Part 4). Corrective maintenance The actions performed, as a result of failure, to restore an item to a specified condition [MIL-STD-2173(AS)]. The maintenance carried out after a failure has occurred and intended to restore an item to a state in which it can perform its required function (BS 4778). Dependability The collective term used to describe the availability performance and its influencing factors: reliability performance, maintainability performance, and maintenance support performance (IEC 60300). Design review A formal, documented, comprehensive, and systematic examination of a design to evaluate the design requirements and the capability of the design to meet these requirements and to identify problems and propose solutions (IS0 8402). Distribution function Consider a random variable X. The distribution function of X is Fx(x) = Pr(X I x) Downtime The period of time during which an item is not in a condition to perform its required function (BS 4778). Equipment under control (EUC) Equipment, machinery, apparatus, or plant used for manufacturing, process, transportation, medical, or other activities (IEC 61508, Part 4). Fail safe A design property of an item that prevents its failures being critical failures (BS 4778). A design feature that ensures the system remains safe or, in the event of a failure, causes the system to revert to a state that will not cause a mishap (MIL-STD 882D). Failure The termination of its ability to perform a required function (BS 4778). An unacceptable deviation from the design tolerance or in the anticipated delivered service, an incorrect output, the incapacity to perform the desired function (NASA2002). A cessation of proper function or performance; inability to meet a standard; nonperformance of what is requested or expected (NASA 2000). Failure cause The physical or chemical processes, design defects, quality defects, part misapplication, or other processes which are the basic reason for failure or which initiate the physical process by which deterioration proceeds to failure (MIL- STD- 1629A). The circumstances during design, manufacture, or use which have led to a failure [IEC 50(191)].

3 GLOSSARY 601 Failure effect The consequence(s) a failure mode has on the operation, function, or status of an item (MIL-STD-1629). Failure mode The effect by which a failure is observed on the failed item (EuRe- DatA, 1983). Failure mode and effect analysis (FMEA) A procedure by which each potential failure mode in a system is analyzed to determine the results or effects thereof on the system and to classify each potential failure mode according to its severity (MIL-STD- 1629A). Failure rate The rate at which failures occur as a function of time. If T denotes the time to failure of an item, the failure rate z(t) is defined as z(t) = lim Af-iCC Pr(t < T I t + At I T > t) At The failure rate is sometimes called force of mortality (FOM). Failure symptom An identifiable physical condition by which a potential failure can be recognized [MIL-STD-2173(A)]. Fatigue Reduction in resistance to failure of a material over time, as a result of repeated or cyclic applied loads [MIL-STD-2173(AS)]. Fatigue life For an item subject to fatigue, the total to functional failure failure of the item [MIL-STD-2173(AS)]. Fault A defect, imperfection, mistake, or flaw of varying severity that occurs within some hardware or software component or system. Fault is a general term and can range from a minor defect to a failure (NASA 2002). Abnormal condition that may cause a reduction in, or loss of, the capability of a functional unit to perform a required function (IEC61508, Part 4). Fault tolerance Ability of a functional unit to continue to perform a required function in the presence of faults or errors (IEC 61508, Part 4). Force of mortality Same as Failure rate. Functional unit Entity of hardware or software, or both, capable of accomplishing a specified purpose (IEC61508, Part 4). Functioning state The state when an item is performing a required function (see also Operating state ). Gradual failure Failure that could be anticipated by prior examination or monitoring (BS 4778). Hazard rate Same as Failure rate. Hidden failure A failure not evident to the crew or operator during the performance of normal duties [MIL-STD-2173(AS)]. Infant mortality The relatively high conditional probability of failure during the period immediately after an item enters service. Such failures are due to defects in manufacturing not detected by quality control [MIL-STD-2173(AS)].

4 602 GLOSSARY Inspection Activities such as measuring, examining, testing, gauging one or more characteristics of a product or service, and comparing these with specified requirements to determine conformity (IS08402). Intermittent failure Failure of an item for a limited period of time, following which the item restores its required function without being subjected to any external corrective action (BS 4778). Life cycle cost (LCC) The cost of acquisition and ownership of a product over a defined period of its life cycle. It may include the cost of development, acquisition, operation, support, and disposal of the product (IEC 60300). Maintainability The ability of an item, under stated conditions of use, to be retained in, or restored to, a state in which it can perform its required functions, when maintenance is performed under stated conditions and using prescribed procedures and resources (BS 4778). Maintenance The combinations of all technical and corresponding administrative actions, including supervision actions, intended to retain an entity in, or restore it to, a state in which it can perform its required function [IEC50(191)]. Maintenance support performance The ability of a maintenance organization, under given conditions, to provide upon demand, the resources required to maintain an entity, under a given maintenance policy [IEC 50(191)]. Mean time to failure (MTTF) Let T denote the time to failure of an item, with probability density f (t) and survivor function R(t). The mean time to failure is the mean (expected) value of T which is given by Mean time to repair (MTTR) Let D denote the down time (or repair time) after a failure of an item. Let f~(d) denote the probability density of D, and let Fo(d) denote the distribution function of D. The mean time to repair is the mean (expected) value of D which is given by roo MTTR = lo t. f a f~(t)dt = (1 - F~(t))dl 10 MTTR is sometimes called the mean down time (MDT) of the item. In some situations MTTR is used to denote the mean active repair time instead of the mean down time of the item. Operating state The state when an entity is performing arequired function [IEC 50( 191)l. Percentile Let X be a random variable with distribution function F(x). The upper 1006% percentile x, of the distribution F(x) is defined such that Pr(X > x,) = 6 Preventive maintenance The maintenance carried out at predetermined intervals or corresponding to prescribed criteria and intended to reduce the probability of failure or the performance degradation of an item (BS 4778).

5 GLOSSARY 603 Probability density tion fx(x) of X is Consider a random variable X. The probability density func- where FX (x) denotes the distribution function of X. Quality The totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs (IS0 8402). Redundancy In an entity, the existence of more than one means for performing a required function [IEC 50( 191)]. Existence of means, in addition to the means which would be sufficient for a functional unit to perform a required function or for data to represent information (IEC61508, Part 4). Reliability The ability of an item to perform a required function, under given environmental and operational conditions and for a stated period of time (IS0 8402). Reliability centered maintenance (RCM) A disciplined logic or methodology used to identify preventive maintenance tasks to realize the inherent reliability of equipment at least expenditure of resources [MIL-STD-2173(AS)]. Repair The part of corrective maintenance in which manual actions are performed on the entity [IEC 50(191)]. Required function A function, or a combination of functions, of an entity, which is considered necessary to provide a given service [IEC 50(191)]. Safety Freedom from those conditions that can cause death, injury, occupational illness, or damage to or loss of equipment or property (MIL-STD-882D). The expectation that a system does not, under defined conditions, lead to a state in which human life is endangered (DEF-STAN 00-56). Safety integrity Probability of a safety-related system satisfactorily performing the required safety functions under all the stated conditions within a specified period of time (IEC61508, Part 4). Safety integrity level (SIL) Discrete level (one out of a possible four) for specifying the safety integrity requirement of the safety functions to be allocated to the E/E/PE safety-related systems, where safety integrity level 4 has the highest level of safety integrity and safety integrity level 1 has the lowest (IEC 61508, Part 4). Security Dependability with respect to the prevention of unauthorized access andor handling of information (Laprie, 1992). Severity The consequences of a failure mode. Severity considers the worst potential consequence of a failure, determined by the degree of injury, property damage, or system damage that could ultimately occur (MIL-STD- 1629A) Single failure point The failure of an item which would result in failure of the system and is not compensated for by redundancy or alternative operational procedure (MIL-STD- 1629).

6 604 GLOSSARY State variable A variable X (t) associated with an item such that: 1 if the item is functioning at time t 0 if the item is in a failed state at time t State vector A vector X(t) = (Xl(t), X:!(t),..., X,(t)) of the state variables of the n components comprising the system. Step stress test A test consisting of several stress levels applied sequentially for periods of equal duration to one sample. During each period a stated stress level is applied and the stress level is increased from one period to the next (BS 4778). Structure function A variable #(X(t)) associated with a system (with state vector X(t)) such that: #(X(t)) = { 1 if the system is functioning at time t 0 if the system is in a failed state at time t Sudden failure A failure that could not be anticipated by prior examination or monitoring [IEC 50(191)]. Survivor function Let T denote the time to failure of an item. The survivor function R(t) of the item is R(t) = Pr(T > t) fort 2 0 R(t) is sometimes called the reliability function or the survival probability at time t of the item. System A bounded physical entity that achieves in its domain a defined objective through interaction of its parts (DEF-STAN 00-56). Set of elements which interact according to a design, where an element of a system can be another system, called subsystem, which may be a controlling system or a controlled system and may include hardware, software and human interaction (IEC61508, Part 4). Systematic failure Failure related in a deterministic way to a certain cause, which can only be eliminated by a modification of the design or of the manufacturing process, operational procedures, documentation, or other factors (EC 61508, Part 4). Test frequency The number of tests of the same type per unit time interval; the inverse of the test interval (IEEE Std. 352). Test interval The elapsed time between the initiation of identical tests on the same sensor, channel, etc. (IEEE Std. 352). Wear-out failure A failure whose probability of occurrence increases with the passage of time, as a result of processes inherent in the entity [IEC 50( 191)].