Conducting a successful highly accelerated life test (HALT) is not a complex task. However, some of the unique characteristics of the testing can make it challenging to execute correctly. Here is a list of 10 key elements of HALT that can help in the planning and implementation of the testing.

1. Be prepared to stress beyond specification

The failure modes exposed in HALT could take months or years to appear in the product’s normal use environment, and they will take just as long to appear in test if the stresses are limited to the product’s specifications. To achieve the dramatic fatigue acceleration experienced in HALT and find these failure modes in just a few days, it is essential to stress the product outside of its specifications. During HALT, it is expected that the failures found will be outside of the product specifications, and often outside of the specifications of the individual components.

Source: ESPEC

When first exposed to the idea of testing a product beyond specification, engineers often express a fear that the failure modes found won’t be relevant. It can seem likely that the failure modes found will simply represent the design limits of the product. A thorough failure analysis is the tool that will determine the validity of a failure mode. In practice, the controlled overstressing that is implemented in HALT will first force out the failure modes that are most susceptible to the given stress, and so are most likely to fail early in the product’s life. Testing beyond specification is an integral component of the effectiveness of HALT.

2. Perform failure analysis on all failure modes found

At the conclusion of a HALT, users will typically have several failure modes identified. The next challenge is failure analysis. There is always a temptation to think “Well, of course it failed, it was outside of specification. What did you expect?” and ignore a failure mode without further investigation. This type of thinking can dramatically reduce the value realized from the testing. It is true that in some cases a failure mode may be due to a limit of the technology and so can be safely ignored. A first step in failure analysis can be to ask a simple question: do I understand this failure mode well enough to say whether its distribution could cause it to show up at lower stress levels? If the failure mode is well understood and would clearly always require an extreme stress to induce it, then users may be able to safely ignore it. Otherwise, and this is the typical case, failure analysis is in order.

The good news is that this failure analysis is often not that difficult. HALT is showing users the design issues that would otherwise show up in the early life of the product in the field. These issues typically don’t require major redesigns to correct. HALT just gives testers the chance to do the failure analysis and implement the corrective action before the product is released.

3. Understand the impact of repetitive shock vibration on the test configuration

HALT uses repetitive shock (RS) vibration for mechanical stressing. This type of vibration is dramatically different from the vibration produced by the much more common electrodynamic (ED) shaker systems. One of the more significant differences is that the RS table is not rigid like the ED shaker table. An RS table flexes during operation, resulting in differences in the energy delivered to a product fixtured at different locations on the table. This needs to be taken into consideration when configuring multiple units to be tested simultaneously because it is desirable to have the units subjected to similar levels of mechanical stress during testing.

The resonant characteristics of the RS table include a primary mode that is an ‘oilcan’ mode, with the center of the table moving up and down in opposition to the sides and corners of the table. This results in excitation that is roughly symmetric about the center of the table. Additional modes of the tabletop result in excitation that is symmetric in quadrants of the table. This means that products that are fixtured in the center of the four quadrants will experience approximately equal excitation at a given vibration setpoint. So, when testing multiple products, four products tested simultaneously is a good number, with each product fixtured at the center of each of the four quadrants of the table.
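The four-unit layout described above can be sketched in a few lines. This is only an illustration under stated assumptions: the table is treated as a simple rectangle, coordinates are measured from the table center, and the function name and the dimensions used in the example are hypothetical. Actual excitation at each location should still be confirmed by measurement on the product.

```python
from typing import List, Tuple


def quadrant_centers(width: float, depth: float) -> List[Tuple[float, float]]:
    """Return the (x, y) center of each of the four table quadrants.

    Coordinates are measured from the center of the table, in the same
    units as width and depth. The table is assumed to be rectangular.
    """
    if width <= 0 or depth <= 0:
        raise ValueError("table dimensions must be positive")
    # Each quadrant spans half the table in each direction, so its center
    # sits a quarter of the table dimension away from the table center.
    dx, dy = width / 4, depth / 4
    return [(-dx, -dy), (dx, -dy), (-dx, dy), (dx, dy)]


# Hypothetical table size, used for illustration only.
positions = quadrant_centers(36.0, 24.0)
for unit, (x, y) in enumerate(positions, start=1):
    print(f"unit {unit}: x={x:+.1f}, y={y:+.1f}")
```

Because the four positions are mirror images of one another about the table center, they are the natural candidates for receiving approximately equal excitation at a given setpoint.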

4. Fixture the product correctly

HALT fixturing is typically straightforward and uncomplicated. However, it is also dramatically different from fixturing for an ED shaker. Here are a few basic guidelines for the design of HALT vibration fixtures.

Put rails under the product to keep the product up off the table. This allows air flow under the product and keeps the product from contacting the thermal mass of the table, significantly improving the thermal change rates on the product.

In general, think in terms of clamping the product tightly down to the table to get a good path for the shock pulses from the table to the areas of the product where failures are expected. If the product’s basic structure is a frame or enclosure, then fixturing with rails across the top of the product, clamping it down to the table, is a good way to start. If testing a board level product, fixturing will often consist of using existing mounting holes in the board to secure it to standoffs that are bolted to fixturing elements that are in turn bolted to the table top. If the boards don’t have sufficient mounting holes for this approach, then board edge clamping is a good alternative. In any case, avoid the use of large, heavy plates, since these can damp the table’s natural responses and reduce the vibration getting into the product.

5. Monitor the product’s response

There are two primary reasons for monitoring the thermal and vibration stresses on the product itself, ideally at the points on subassemblies where failures are expected to occur. First, the RS vibration used in HALT can induce mechanical responses in the product that are very different from the characteristics of the table. In fact, monitoring the table in an RS machine has little value. The product response is the key. To understand the actual stresses that induced a given failure mode in a product, stress measurements taken from near the failure location are necessary. Similarly, the mechanical configuration of a product can cause the thermal stresses seen at the point of failure to differ from the actual temperature of the air in the chamber. Again, measurements from near the point of failure will provide the most valid data for understanding the failure mode.

A second reason to monitor the product is that it is very common to repeat a HALT on a product after corrective actions are implemented. Seemingly minor changes in vibration fixturing or product location on the table can dramatically affect the relationship between the vibration and thermal setpoints of the system and the actual stresses seen at the failure location. Simply reproducing the chamber setpoints when repeating a test does not guarantee that the stresses at the point of failure have been duplicated. Measurement on the product provides the data necessary to effectively repeat a HALT.
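As a rough sketch of how product-level measurements might be compared between an original HALT and a repeat, the example below summarizes accelerometer samples as a root-mean-square value in g and checks whether the repeat falls within a tolerance of the baseline. The choice of RMS as the summary metric, the function names, the sample values and the 10% tolerance are all assumptions made for illustration, not values from the source.

```python
import math
from typing import Sequence


def grms(samples: Sequence[float]) -> float:
    """Root-mean-square of acceleration samples, all expressed in g."""
    if not samples:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(a * a for a in samples) / len(samples))


def stress_reproduced(baseline: float, repeat: float, tolerance: float = 0.10) -> bool:
    """Return True if the repeat measurement is within a fractional
    tolerance of the baseline measurement.

    The default tolerance of 10% is a hypothetical choice.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return abs(repeat - baseline) <= tolerance * baseline


# Hypothetical samples taken near the failure location in each run.
original_run = grms([3.0, -3.0, 3.0, -3.0])
repeat_run = grms([4.0, -4.0, 4.0, -4.0])
print(stress_reproduced(original_run, repeat_run))
```

The same comparison could be applied to a thermocouple reading taken near the point of failure. The point is that the check is made on product measurements, not on chamber setpoints.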

6. Avoid a pass/fail mentality

One of the more unusual characteristics of a HALT is that it isn’t a test to pass or fail. HALT is a failure-oriented test. The goal is to find failure modes, not to demonstrate product reliability. Stress levels are increased until failure modes are induced. The output of the test is a list of failure modes. In fact, the only real way to ‘fail’ a HALT is to not find any valid failure modes. The value of the test is in the failure modes found.

All these concepts are basic to HALT, and when a company first implements HALT they are typically well accepted. After a few tests have been completed, however, there can be a tendency for the ‘Pass/Fail’ mentality to creep in. The logic tends to be “Well, we tested Product A and all the hot thermal failures happened below x degrees. If we don’t find any hot failures below x degrees with Product B, then it’s just as good and we don’t have to test any higher.” The risk here is obvious: Product B may have a critical failure mode above x degrees that could go undetected. The level at which a failure is induced generally isn’t nearly as important as the nature of the failure. Remember the goal of the testing: finding failure modes!
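The difference between a failure-oriented test and a pass/fail test can be sketched as a stepping loop whose output is a list of failure modes instead of a verdict. This is a minimal illustration, not a HALT procedure: the function names, step size, limits and the stand-in product model are all hypothetical, and a real test would also involve workarounds and failure analysis between steps.

```python
from typing import Callable, List, NamedTuple


class FailureRecord(NamedTuple):
    stress_level: float
    description: str


def step_stress(
    start: float,
    step: float,
    limit: float,
    functional_test: Callable[[float], List[str]],
) -> List[FailureRecord]:
    """Step the stress from start up to limit and log each new failure mode.

    functional_test(level) returns descriptions of the failures observed at
    that level (an empty list if none). Stepping continues after a failure
    is found, so the result is a list of failure modes, not a verdict.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    failures: List[FailureRecord] = []
    seen = set()
    level = start
    while level <= limit:
        for description in functional_test(level):
            if description not in seen:  # log each mode at first occurrence
                seen.add(description)
                failures.append(FailureRecord(level, description))
        level += step
    return failures


# Hypothetical stand-in for a product: each mode appears at or above a level.
thresholds = {"failure mode A": 90, "failure mode B": 120}


def fake_functional_test(level: float) -> List[str]:
    return [name for name, limit in thresholds.items() if level >= limit]


failure_log = step_stress(60, 10, 130, fake_functional_test)
for record in failure_log:
    print(record)
```

Stopping the loop at the level where the first mode appeared would have missed the second mode entirely, which is the risk described above.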

7. Stimulate, not simulate

The phrase “stimulate, not simulate” is often heard during HALT training. It serves as a quick way to remember that the purpose of HALT is to rapidly stimulate failure modes in a product without necessarily attempting to simulate the normal use conditions of the product. The end use configuration or environment for a product may very well not be the best conditions for rapidly stimulating failures in the product. For example, when testing a product that consists of a circuit card or cards mounted inside of an enclosure, it is typical to remove the cards from the enclosure for the test. This generally will improve air flow, and so thermal change rates, on the product. Also, when the boards are tested inside of the enclosure, the enclosure essentially becomes part of the vibration fixture, and it may very well be a poor vibration fixture, damping the vibration that the boards experience during the testing.

Modification of the product, such as removing the top cover from an enclosure or cutting holes in an enclosure to improve air flow across internal components, is certainly a reasonable thing to do in HALT if it will improve the effectiveness of the test without compromising the functionality of the product. In general, when deciding on the final configuration for testing the product, remember the purpose and think in terms of maximizing the thermal and vibration stresses on the product and not on mimicking the end use configuration.

8. HALT is a multi-disciplinary test

Typically, reliability testing is started near the end of the product development cycle and requires little support outside of reliability engineering to complete. HALT is very different in this respect. It is done much earlier in the design cycle, as early as feasible, and is very much a joint effort between reliability, design, test and other engineering disciplines. It is important to have the necessary personnel committed to the support of the testing before testing starts.

During HALT, failure modes are induced that often require immediate analysis to determine which way to go in the next steps of testing. ‘Workarounds’ need to be developed to allow testing to continue after failure modes have been found. Failure modes can be exposed that may be related to process rather than design issues. Often, failure modes are related to how the software is responding to unexpected hardware performance. Understanding what is happening during functional test and how these failure modes are identified is also critical to the success of the test. Addressing these issues requires input from a range of engineering disciplines that are not typically intimately involved in reliability testing. Understanding this, and making sure that the necessary personnel are available to support the test in a timely manner, is important to the success of the testing.

9. Remember the importance of functional testing

Remember that the goal of HALT is to induce failure modes. If some of these failure modes are not detected, then much of the value of the testing can be lost. Consequently, a thorough functional test routine is very important to the success of the testing. It can be valuable to carefully evaluate the functional blocks of the product and determine how failures in each of those functional blocks will be detected and reported. It is not unusual for HALT to require a custom functional test suite to get the coverage and failure detail necessary to quickly identify and understand the failure modes found.

A common mistake made in functional test design is to not include a power cycle. There is often a significant amount of hardware and software in a product that is dedicated to simply making sure that the product powers up successfully. A power cycle is the only way to test these design components.

When designing the functional test, it is also important to remember that this test will be run while the product is under stress inside the chamber. During HALT many of the failure modes found are ‘soft’ failures — a failure occurs at a certain level of stress, such as a temperature level, but then goes away when the stress is reduced. It is extremely valuable to detect these soft failures. If left uncorrected, they can turn into ‘no defect found’ types of field issues that are very difficult to understand and resolve. And the only way to detect these soft failures is by functionally testing and monitoring the product while it is under stress.
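One way to picture the points above is a functional test that includes a power cycle, run once under stress and once after the stress is reduced, with each failure labeled by whether it cleared. The sketch below is an assumption-laden illustration: the check names, the temperature values and the simulated product behavior are hypothetical, and the term "hard failure" for a failure that persists is borrowed from common usage and is not in the text above.

```python
from enum import Enum
from typing import Callable, Dict, List


class Outcome(Enum):
    SOFT_FAILURE = "soft failure"  # cleared when the stress was reduced
    HARD_FAILURE = "hard failure"  # persisted after the stress was reduced


def run_functional_test(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of the checks that failed."""
    return [name for name, check in checks.items() if not check()]


def classify(
    failed_under_stress: List[str], failed_after_relief: List[str]
) -> Dict[str, Outcome]:
    """Label each failure seen under stress as soft or hard."""
    return {
        name: Outcome.HARD_FAILURE
        if name in failed_after_relief
        else Outcome.SOFT_FAILURE
        for name in failed_under_stress
    }


# Hypothetical product model: it fails to power up above a certain temperature.
state = {"temperature_c": 120}
checks = {
    "power_cycle": lambda: state["temperature_c"] < 110,
    "self_test": lambda: True,
}

under_stress = run_functional_test(checks)  # test while under stress
state["temperature_c"] = 80                 # reduce the stress
after_relief = run_functional_test(checks)  # test again
result = classify(under_stress, after_relief)
print(result)
```

A test run only after the product was removed from the chamber would have seen no failure at all in this example, which is how a soft failure ends up as a ‘no defect found’ issue in the field.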

10. Educate all personnel involved

When compared to traditional reliability testing, such as accelerated life testing or reliability growth testing, HALT is unique in many respects. HALT is a failure-oriented test, with no pass/fail criteria, that stresses a product far beyond its intended use environment. The product may be configured in ways completely unlike normal use. Without a thorough understanding of the theory behind HALT, engineers who are involved in the testing may find these aspects of the test to be confusing and counterintuitive. This can lead to a reluctance to follow the prescribed HALT protocol for fear of overstressing the product, or to the dismissal of failure modes because they were found outside of the product’s specifications, among other problems. The same ideas can be held by management as well, leading to changes in the test protocol that dramatically reduce the effectiveness of the test.

The key to dealing with these issues is to make sure that all personnel involved in the testing, as well as the managers of these personnel, have had training in the basics of HALT theory. While it is unique, the theory is not complex, and once it is understood and accepted by those involved, testing can go much more smoothly, yielding the expected valuable results.

While this list is certainly not exhaustive, it does cover some of the major items to consider as HALT is planned and executed. Going through this list before starting a HALT can help users avoid some major pitfalls and increase their chances of a successful test campaign. For more HALT support, contact the ESPEC Solutions Group through their website.