Designing for Extreme Reliability


White Paper, May 2012, Version 1.1

Designing for Extreme Reliability
HALT and HASS Testing Helps Improve Product Reliability and Lower Total Cost

By: David Coffin, Corporate Quality and Reliability Engineer

Overview

Very high reliability is a preeminent requirement in many embedded market segments, such as military, aerospace and medical. Long-established HALT and HASS processes make designs more robust by quickly identifying functional and physical weaknesses in a product. This white paper discusses how Radisys applies HALT/HASS testing during product design and manufacturing in order to improve life cycle reliability (e.g., MTBF).

Contents

Executive Summary
Rugged Applications
HALT/HASS Overview
HALT Testing
Next: HASS Screening
Measuring MTBF
Ruggedized COM Express Modules
Partner with HALT/HASS Specialists
References

Executive Summary

Very high reliability is a preeminent requirement in many embedded market segments, such as military, aerospace and medical. When human lives are at stake, it's impossible to quantify the value of safe and reliable systems.

One proven approach that improves product reliability is to design in additional margin. This is akin to demonstrating that a product will operate at 10,000 revolutions per minute (RPM) when it will only be used in the field at up to 5,000 RPM. The resulting design is less susceptible to component variation in the manufacturing process or to performance degradation from typical wear and tear in the field, either of which could otherwise cause a failure. Margin translates into long-term reliability.

Long-established HALT (Highly Accelerated Life Testing) and HASS (Highly Accelerated Stress Screening) processes are used to assess and monitor design margin. The purpose of HALT testing is to identify a product's functional and physical limitations, which can then be improved until the fundamental limit of the underlying technologies is reached, thereby achieving the maximum attainable margin before the product is released to manufacturing. HASS screening, performed on 100 percent of manufactured products, verifies that those margins are maintained.

This white paper discusses how Radisys applies HALT/HASS testing during product design and manufacturing in order to improve reliability (e.g., MTBF, availability and failure rate over time).

Rugged Applications

Systems such as unmanned ground vehicles (Figure 1) or man-wearable computers must be ruggedized to stand up to extreme temperatures, thermal and dynamic shock, vibration and G-forces. Used on land, at sea or in the sky, these systems have to perform reliably in the field under extreme temperature and vibration conditions. COM Express extended temperature designs are regularly used to satisfy these requirements because they can stand up to the rigors of the toughest environmental conditions. Running mission-critical applications, these systems are deployed in vehicles, aircraft and ships that require maximum computing performance in thermally challenged and space-constrained settings.

Figure 1. Talon unmanned ground vehicles made by Foster-Miller are rugged, lightweight tracked vehicles widely used for explosive ordnance disposal, reconnaissance, communications, sensing, security, defense and rescue.

HALT/HASS Overview

The roots of HALT and HASS processes can be traced back to 1969 [1], when Dr. Hobbs developed advanced practices for increasing equipment reliability and ruggedness. The premise is to subject products to extreme environmental conditions up to the point of failure in order to determine the weakest aspects of the design. Subsequently, primary, secondary and tertiary failures are removed by reengineering and redesigning the product until the fundamental limits of technology are reached (i.e., the physics of semiconductor devices). As a result, the product has an increased functional and physical design margin, which safeguards against failure.

HALT and HASS processes are intended to augment, not replace, other processes used to improve product quality and reliability, such as design verification, manufacturing test and life testing. Many Mil/Aero customers require their suppliers, who are original equipment manufacturers (OEMs), to adopt HALT/HASS testing methodologies so they attain higher levels of product quality and reliability.

HALT Testing

HALT (Highly Accelerated Life Testing) is performed during the product design phase, since the outcome often calls for design modifications. The product undergoes multiple stresses that far exceed its environmental specifications, with the goal of overstressing the product and inducing failures. Unlike other tests, HALT is not a pass/fail test, because the product is expected to fail as thermal and vibration test conditions increase in severity, well beyond use conditions. The outcome is the determination of the product's functional and destruct limits, and if the identified failure modes are corrected, these limits can be pushed out. The main objective of HALT testing is to maximize the limits of a product design. When applied correctly, HALT can improve reliability, identify issues before the product ships, and reduce warranty costs, as indicated in Table 1.

Table 1. Benefits Derived From HALT Testing

OEM Perspective:
- Improves product reliability by increasing design margin
- Identifies issues that can be addressed before the product ships
- Reduces warranty cost because there are fewer field failures

Component Manufacturer Perspective:
- Identifies design and process limitations quickly
- Allows engineers to evaluate and improve design margins
- Reduces product development and certification time when critical product flaws are discovered early
- Enables design problems to be eliminated before product manufacturing commences

Product testing is conducted in a HALT chamber, as shown in Figure 2, typically on a few units. Testing consists of stepped vibration and thermal stresses, which determine the actual limits of the design and board components. The vibration test covers six degrees of freedom: both directional and rotational movements along the X, Y and Z axes, as illustrated in Figure 3. The shock and vibration, measured in g's, is applied in increments, usually 5-10 Grms (i.e., root-mean-square acceleration). Each stress level is held for a minimum dwell time of ten minutes; there must be enough time to run a full functional test on the product.

Figure 2. A HALT Chamber Used to Test a Chassis with Boards

Figure 3. Unlike Traditional Vibration Tests, HALT Covers Six Degrees of Vibration Movement (three directional and three rotational, on the X, Y and Z axes)
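The stepped vibration test described above can be sketched as a simple schedule generator. This is only an illustration: the step size and dwell time follow the text (5-10 Grms increments, ten-minute minimum dwell), while `max_grms` is a hypothetical chamber ceiling introduced here so the example terminates. In an actual HALT run the steps continue until the product fails.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VibrationStep:
    level_grms: float   # vibration level held during this step
    start_min: float    # minutes elapsed when the step begins
    dwell_min: float    # time held at this level


def vibration_step_schedule(step_grms=5.0, max_grms=50.0, dwell_min=10.0):
    """Build a stepped vibration schedule.

    step_grms and dwell_min follow the text. max_grms is a hypothetical
    ceiling (an assumption, not a value from the paper).
    """
    if step_grms <= 0 or dwell_min <= 0 or max_grms < step_grms:
        raise ValueError("step, dwell and ceiling must be positive and consistent")
    steps = []
    n = 1
    while n * step_grms <= max_grms:
        steps.append(VibrationStep(n * step_grms, (n - 1) * dwell_min, dwell_min))
        n += 1
    return steps


schedule = vibration_step_schedule()
for s in schedule:
    print(f"t={s.start_min:5.0f} min  hold {s.level_grms:4.1f} Grms "
          f"for {s.dwell_min:.0f} min, then run the functional test")
```

Each entry marks a point where the full functional test would be run before moving to the next level.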

During thermal testing, the temperature is stepped as quickly as possible in 10 °C increments. Starting at ambient, the temperature is lowered for the cold test and raised for the hot test, as shown in Figure 4. As in the vibration test, the dwell time at each step is a minimum of ten minutes, and the product must pass a full functional test. Finally, a profile of both thermal and vibration stresses is executed to assess robustness in a combined stress environment.

Figure 4. Hot and Cold Step Stresses. Temperature (°C, from -50 to 90) versus time (minutes), showing the hot step test and the cold step test starting from ambient temperature. Every 10 minutes, the temperature is increased or decreased by 10 °C, ramping as quickly as possible.

There is no standard specification for HALT tests; however, the common practice is to use multiple stresses:

- Maximum thermal limits: high and low temperatures
- Extreme thermal rates of change
- Maximum vibration
- Combinations of vibration and thermal stresses

Both thermal and vibration stress are fatigue tests that lead to mechanical and physical failures, examples of which are shown in Table 2. Thermal stresses are particularly effective at identifying electrical and semiconductor component marginality. As failure modes are discovered, they are corrected by design or component improvement until further improvement is either impractical or constrained by the fundamental limit of the underlying technologies.

Table 2. The Types of Defects Identified Using HALT

Vibration and Thermal Stress:
- Bad solder joints (e.g., metal composition)
- Surface mount issues (bonding strength)
- Low mechanical tolerance (expansion coefficients)
- Raw board problems (e.g., delamination)
- Poor component placement (excessive loads, noise)

Thermal Stress:
- IC quality problems (excessive leakage, electrostatic discharge (ESD))
- Insufficient electrical tolerance (signal quality)
- Timing issues (signal skew, setup/hold time)

By establishing that the design and components are capable of operating not only to the extended temperature specification but well beyond it, HALT demonstrates the true operational limits of the product. The concept and execution of maximizing the design margin is critical to successfully producing reliable, extended temperature products.

When HALT testing is completed, the product's functional and destruct limits are determined, and the factors dictating design and manufacturing margin are identified. At this point, the product development team has incorporated the changes it deemed necessary and has repeated HALT testing on reengineered units to verify the efficacy of the modifications. Now there is enough data and understanding of the design margin to construct a HASS test used to screen products coming off the manufacturing line.
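The thermal step test and the resulting margin can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the 20 °C ambient, the -50 °C and 90 °C end points (taken from the axis range of Figure 4), and the `unit_passes` function, which stands in for a real functional test on a hypothetical unit. The specification range of -25 °C to +70 °C is the extended temperature range quoted later in this paper. The sketch finds operating limits only; destruct limits would require stepping further.

```python
def step_levels(start_c, stop_c, step_c=10):
    """Temperatures visited after leaving start_c, stepping toward stop_c."""
    if step_c <= 0:
        raise ValueError("step_c must be positive")
    direction = 1 if stop_c >= start_c else -1
    levels = []
    t = start_c + direction * step_c
    while (t - stop_c) * direction <= 0:
        levels.append(t)
        t += direction * step_c
    return levels


def last_passing_level(levels, passes_functional_test):
    """Return the last level at which the functional test passed, or None."""
    last_ok = None
    for level in levels:
        if not passes_functional_test(level):
            break
        last_ok = level
    return last_ok


def unit_passes(temp_c):
    # Hypothetical unit: stands in for a full functional test after the dwell.
    return -40 <= temp_c <= 80


AMBIENT_C = 20                    # assumed ambient temperature
SPEC_LOW_C, SPEC_HIGH_C = -25, 70  # extended temperature specification

upper_limit = last_passing_level(step_levels(AMBIENT_C, 90), unit_passes)
lower_limit = last_passing_level(step_levels(AMBIENT_C, -50), unit_passes)

hot_margin = upper_limit - SPEC_HIGH_C
cold_margin = SPEC_LOW_C - lower_limit
print(f"operating limits: {lower_limit} C to {upper_limit} C")
print(f"margin beyond spec: {cold_margin} C cold, {hot_margin} C hot")
```

After a design fix, rerunning the same steps on the reengineered unit shows whether the limits, and therefore the margins, have moved out.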

Next: HASS Screening

HASS (Highly Accelerated Stress Screening) ensures the performance of manufactured products exceeds the product specification by a certain level, called operating margin, as shown in Figure 5. The premise of the screen is that a product with greater operating margin will, on average, have higher long-term reliability than a product with a lower margin.

Figure 5. Relative Stress: Manufacturing Test, HASS and HALT. In order of increasing stress: use conditions, product specification (manufacturing test), HASS screen, and HALT test (functional and destruct limits). The operating margin is the region beyond the product specification that the HASS screen covers.

The test conditions of the screen are established by backing off from the functional and destruct limits found through HALT testing to a level that will not take excessive life out of the product. This is verified with a proof-of-screen test, which also serves to confirm the screen is effectively identifying defective products.

The proof-of-screen test repeats the proposed HASS profiles 20 times under continuous diagnostic observation. At the high and low test points, the product is powered off and allowed to settle before it is rebooted and retested. A full functional test is executed at the completion of the process to ensure no failures of any kind have occurred. The success requirement for the proof-of-screen test is that all units pass (no failures). An example is given in Table 3.

Table 3. HASS Proof-of-Screen Test Example

Stress: Thermal and Vibration

HASS Screen:
1) Starting at ambient temperature, the temperature is increased (20 °C per minute) to the high test point. During the dwell time, the product is powered down, rebooted and tested.
2) The temperature is ramped from the high to low test point while diagnostic tests are run. During dwell time, the product is powered down, rebooted and tested.
3) The temperature is ramped from the low test point to ambient temperature while functional. Comprehensive product testing is conducted.

Proof-of-Screen Test: Repeat HASS screen 20 times.

Typically, ten products are thoroughly characterized before and after the proof-of-screen test to look for any significant performance degradation; if degradation is found, the screen conditions are revisited. This exercise may be followed by life testing on the sample to check whether any infant mortality failure mechanisms were induced by the screen.

In order for HASS to be truly effective, it must be performed on 100 percent of the products manufactured. In this way, an OEM can guarantee that each product not only meets the extended temperature requirement, but that it does so with additional operating margin, ensuring long-term performance and reliability. By driving operating margin into the design, an OEM can become immune to material and process variation in manufacturing, attain high yields during all production phases, and sustain reliability and low warranty costs.

To recap, HALT drives the design to its fullest capability, and HASS ensures manufactured products have a certain level of operating margin.
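As a rough sketch of how screen conditions relate to the other stress levels in Figure 5, the example below places each HASS test point between the product specification and the HALT operating limit, then applies the proof-of-screen acceptance rule (20 repeats, no failures on any unit). The back-off `fraction`, the HALT limits of +105 °C and -60 °C, and the unit identifiers are all hypothetical; the paper does not state how far Radisys backs off from the HALT limits.

```python
def screen_point(spec_limit_c, halt_operating_limit_c, fraction=0.5):
    """Place a screen test point between the spec and the HALT operating limit.

    fraction=0 sits at the specification, fraction=1 at the HALT limit.
    The 0.5 default is an illustrative assumption, not a value from the paper.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return spec_limit_c + fraction * (halt_operating_limit_c - spec_limit_c)


def proof_of_screen_passed(results, required_cycles=20):
    """results maps a unit id to its list of per-cycle pass/fail booleans.

    The screen is accepted only if every unit completed all cycles
    without a single failure.
    """
    return bool(results) and all(
        len(cycles) == required_cycles and all(cycles)
        for cycles in results.values()
    )


# Specification of -40 C to +85 C with hypothetical HALT operating limits.
high_test_point = screen_point(85, 105)
low_test_point = screen_point(-40, -60)
print(f"HASS test points: {low_test_point} C to {high_test_point} C")

# Hypothetical proof-of-screen records for three units.
clean_run = {f"unit-{i}": [True] * 20 for i in range(3)}
failed_run = dict(clean_run, **{"unit-1": [True] * 19 + [False]})
print(proof_of_screen_passed(clean_run))
print(proof_of_screen_passed(failed_run))
```

If the proof-of-screen check fails, or units show degradation afterwards, the natural adjustment in this sketch is a smaller `fraction`, which moves the screen closer to the specification.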

Measuring MTBF

One of the most widely used measures of reliability is the mean time between failures (MTBF). The greater the MTBF, the higher the reliability is at a given point in time. The objective of the HALT and HASS processes previously discussed is to increase the inherent reliability of the product. This is achieved by identifying ways to increase the product's design margin (HALT) and ensuring the expected level of operating margin (HASS) in manufactured products.

It is important to note that HALT and HASS processes do not yield MTBF data. This is because they are fatigue tests using conditions outside the normal operating range and specification of the product. MTBF is best demonstrated using life tests and actual field data. Radisys uses two methods to arrive at MTBF calculations:

- Sum of Failure Rates: The product failure rate is obtained by adding up the failure rates of every individual board component, and the product MTBF is derived from that total. Component failure rates, based on millions of operational hours in the field, are found in the MIL-HDBK-217 and Telcordia (Bellcore) SR-332 databases (subscriptions required).
- Demonstrated Reliability: Radisys tracks every unit it ships, and all field failures, in order to calculate a Radisys-specific MTBF based on cumulative data collected over twenty years.

Ruggedized COM Express Modules

Radisys extended and industrial temperature products are designed with higher capability components and are subjected to an extensive suite of environmental tests to demonstrate capability of operation in the -25 °C to +70 °C and -40 °C to +85 °C temperature ranges. One example is the Radisys CEQM57XT module, which combines Intel Core i7 and i5 processors and the mobile Intel QM57 Express chipset, and supports ruggedized vibration specifications. The module is designed for Mil/Aero, medical and industrial applications such as ruggedized computers/tablets, unmanned vehicles/robots and in-vehicle computers. With thorough design verification and 100 percent HASS screening, OEMs can depend on the sustained reliability of the Procelerant COM Express in harsh, rugged environments.

Radisys designs ruggedness into the CEQM57XT by using its proprietary implementation of HALT testing during the design process to identify and correct product weaknesses prior to production. HASS is then used during production to continually monitor supplier quality and component robustness. Through rigorous HALT/HASS testing, Radisys ensures products are designed to satisfy the extended temperature range limits required by its customers. This saves customers months of system development time previously dedicated to ensuring that a module can meet their extended temperature or vibration specifications. Radisys' proprietary design and test processes set the standard for COM Express ruggedness compared to the simple screening processes employed in the industry.

Partner with HALT/HASS Specialists

Military programs are very eager to use the HALT/HASS methodologies for both cost and schedule reasons, according to John E. Starr, consultant at CirVibe, and Douglas D. Walker, Principal Test Engineer at ATK [2]. However, spending time and resources performing these tests, not to mention investing in the test equipment itself, is often a heavy burden. Alternatively, OEMs can select suppliers, like Radisys, who perform HALT/HASS testing upfront. By partnering with HALT/HASS testing specialists at Radisys, OEMs can reduce development time as well as deploy systems with extreme reliability quickly and cost-effectively.
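The two MTBF methods listed under Measuring MTBF reduce to simple arithmetic, sketched below. The sketch assumes a constant failure rate and a series model (any component failure fails the product), which is the usual basis for a parts-count prediction. The FIT unit (failures per billion device-hours), the component names and all numeric values are hypothetical and are not taken from MIL-HDBK-217, SR-332 or Radisys field data.

```python
def mtbf_from_failure_rates(component_fits):
    """Sum-of-failure-rates MTBF in hours.

    component_fits maps a component name to its failure rate in FIT
    (failures per 1e9 hours). Under a series model the rates add, and
    the MTBF is the reciprocal of the total rate.
    """
    total_fit = sum(component_fits.values())
    if total_fit <= 0:
        raise ValueError("total failure rate must be positive")
    return 1e9 / total_fit


def demonstrated_mtbf(cumulative_unit_hours, field_failures):
    """Point estimate of MTBF from field data: operating hours per failure."""
    if field_failures <= 0:
        raise ValueError("a point estimate needs at least one recorded failure")
    return cumulative_unit_hours / field_failures


# Hypothetical bill of materials with illustrative FIT values.
board = {"processor": 400, "chipset": 250, "memory": 200, "passives": 150}
predicted = mtbf_from_failure_rates(board)

# Hypothetical fleet: 5,000,000 cumulative unit-hours and 4 field failures.
observed = demonstrated_mtbf(5_000_000, 4)

print(f"predicted MTBF:    {predicted:,.0f} hours")
print(f"demonstrated MTBF: {observed:,.0f} hours")
```

Comparing the two figures is one way to check a parts-count prediction against what the shipped units actually do in the field.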

References

[1] http://www.haltandhass.com
[2] "A Look Under the Hood of HALT and HASS," COTS Journal, January 2008, page 16: http://upload.rtcgroup.com/cotsjournal/digital/pdf/cots0801.pdf

Corporate Headquarters: 5435 NE Dawson Creek Drive, Hillsboro, OR 97124 USA. Phone 503-615-1100, Fax 503-615-1121, Toll-Free 800-950-0044. www.radisys.com, info@radisys.com

© 2012 Radisys Corporation. Radisys, Trillium, Continuous Computing and Convedia are registered trademarks of Radisys Corporation. All other trademarks are the properties of their respective owners.