MTBF | Thomas Reiter
  Managing technical uncertainties

Statistics, RAMS & Quality Management

A general Introduction into Reliability

Standards, Awareness, Management, Uncertainty

This is a high-level description of basic aspects of reliability.
This introduction does not require any prior knowledge of or experience with reliability.

It is sufficient to just understand the following simplistic definition of reliability:

Reliability = The probability that an item performs a required function without failure.

Please note that due to its simplicity, this definition is incomplete.
However, it still suggests "reliable = failure free", and that's precise enough in order to understand this paragraph.


1. Reliability Standards


There are quite a lot of standards addressing different aspects of reliability. The first institution with an approach to reliability was the US Department of Defense (DOD) in the early 1950s.
Since then, the DOD and related institutions (RAC, RiAC, ...) have issued hundreds of documents dealing with various aspects of reliability at different levels of detail. These DOD-issued documents can be divided into document types such as MIL-HDBK-XXXX, MIL-STD-YYYY and MIL-PRF-ZZZZ, where XXXX, YYYY and ZZZZ are 3 to 5 digit numbers.
The following table gives an impression of the vast scope and the various levels of detail of these documents. Please note that this list is only a tiny sample of what are actually many hundreds of documents:

Document number    Title
MIL-HDBK-217       Reliability prediction of electronic equipment.
                   This is actually the No. 1 MTBF calculation standard.
MIL-HDBK-470       Designing and developing maintainable products and systems.
MIL-HDBK-472       Maintainability prediction.
MIL-STD-1629       Procedures for performing a failure mode, effects and criticality analysis.
                   This is actually the No. 1 FMEA standard.
MIL-HDBK-189       Reliability growth management.
                   Procedures for improving (product) reliability.
MIL-STD-1388       DOD requirements for a logistics support analysis record.
                   In essence: data exchange format for product reliability related data.
MIL-STD-105        Sampling procedures and tables for inspection by attributes.
                   In essence: statistical methods to determine failure rates based on attributive information (good / bad, does fit / does not fit, ...).
MIL-PRF-38534      General specification for hybrid microcircuits.
MIL-STD-1472       Human engineering design criteria for systems, equipment and facilities.
MIL-HDBK-338       Electronic reliability design handbook.
                   A comprehensive guideline for electronic, mechanical and quality engineers.

As you can see, these documents cover many aspects of reliability, with emphasis on electronic equipment. 

During the last few decades, the DOD has declared many of these documents obsolete, but they still serve as guidelines, references and look-up material for civil industry. Many obsolete documents are available on the internet free of charge.
The reasons for the obsolescence are:
Today, the military industry relies almost entirely on "civil" reliability techniques.


2. Reliability Awareness


Apart from the military industries, today's civil industries with the highest level of reliability awareness include, but are not limited to:
Significant indicators for reliability awareness are:
The strongest drivers of awareness are requirements coming from government and other authorities.
Most people would associate the above list with safety rather than reliability. In short: safety comprises reliability. A more comprehensive description can be found on the page Reliability vs. Safety.
Basically, a safe system may be unreliable with respect to the functions not directly related to safety. On the other hand, and for the same reason, a reliable system may be unsafe.

3. Reliability Management

However, the reality is that proactive reliability management is quite rare. Only a few big companies practice it, while the vast majority of companies address reliability as an afterthought ("we just completed the project milestone, now let's engage a subcontractor to make an FMECA").


4. Reliability: Uncertainty


A further, essential characteristic of reliability appears when we look at numbers.
Reliability analysis results are typically highly imprecise and contain quite a lot of uncertainty.
This includes not only statistical uncertainty (which can be quantified), but also uncertainty regarding assumptions and conclusions, which very often can hardly be quantified.

In a provocative manner we could say that the accuracy of reliability analysis results can be compared with the accuracy of weather forecasts, nuclear physics, stock price prediction, etc. Even more provocative: Reliability analysis results are unreliable by default. There are some reasons for this:

Typically there is not enough information and experience available, so the reliability analyst has to rely on expert guesses, plausibility, and common sense. What makes this "worse" is the fact that customers expect reliability to be determined in advance, even if the product incorporates new technologies for which no experience exists at all.

On the statistical side, since the majority of engineers perceive confidence intervals as "high math", even reliability analysts themselves may not be aware of the statistical uncertainty of their results.
As an example, MIL-HDBK-217 results appear to be "exact" because they are reported with six or more digits, for example 1.05229 failures per million hours (fpmh). But the truth is that even the first digit (here 1) is highly speculative in most cases.
This is true not only for MIL-HDBK-217, but also for every other MTBF calculation standard.
Telcordia SR 332 is the only MTBF calculation standard offering standard deviations for failure rates. This makes Telcordia SR 332 "honest", because the standard deviations reveal the level of uncertainty for every failure rate. But this covers only the uncertainty in the data that were used to develop the Telcordia SR 332 standard in the past. Of course, it does NOT cover the uncertainty of predicted MTBF vs. real MTBF for new equipment.
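
To give a feeling for the purely statistical part of this uncertainty, the following Python sketch converts a failure rate in fpmh into an MTBF and computes a two-sided confidence interval for a failure rate estimated from field data. It is only an illustration under stated assumptions: a constant failure rate, failures following a Poisson process, and an invented data set (3 failures in one million device-hours). The function names are hypothetical and are not taken from any of the standards mentioned above.

```python
import math


def mtbf_hours(fpmh: float) -> float:
    """Convert failures per million hours to MTBF in hours (constant failure rate assumed)."""
    return 1e6 / fpmh


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu), summed in log space for numerical stability."""
    if k < 0:
        return 0.0
    if mu <= 0.0:
        return 1.0
    total = sum(math.exp(i * math.log(mu) - mu - math.lgamma(i + 1)) for i in range(k + 1))
    return min(total, 1.0)


def solve_decreasing(f, lo: float, hi: float, iterations: int = 200) -> float:
    """Bisection root finder for a function that decreases from f(lo) > 0 to f(hi) < 0."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def failure_rate_interval(failures: int, hours: float, confidence: float = 0.90):
    """Two-sided exact Poisson interval for the failure rate, returned in fpmh."""
    alpha = 1.0 - confidence
    bracket = 10.0 * (failures + 10)  # generous upper limit for the expected failure count
    # Upper limit: expected count mu for which P(X <= failures) drops to alpha / 2.
    upper = solve_decreasing(lambda mu: poisson_cdf(failures, mu) - alpha / 2, 0.0, bracket)
    if failures == 0:
        lower = 0.0  # with zero observed failures only an upper limit exists
    else:
        # Lower limit: expected count mu for which P(X >= failures) equals alpha / 2.
        lower = solve_decreasing(
            lambda mu: poisson_cdf(failures - 1, mu) - (1.0 - alpha / 2), 0.0, bracket
        )
    scale = 1e6 / hours  # expected count per observed hours -> failures per million hours
    return lower * scale, upper * scale


if __name__ == "__main__":
    print(f"1.05229 fpmh corresponds to an MTBF of {mtbf_hours(1.05229):,.0f} hours")
    low, high = failure_rate_interval(failures=3, hours=1e6)
    print(f"3 failures in 1e6 hours: 90% interval {low:.2f} to {high:.2f} fpmh")
```

With the invented data, the point estimate is 3 fpmh, but the 90% interval runs from roughly 0.8 to 7.8 fpmh, so the limits differ by almost a factor of ten. This is statistical uncertainty only; uncertainty in assumptions comes on top of it.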

Just to make the level of uncertainty usually encountered in reliability analysis really clear:
If the real (yet unknown) failure rate was 10 fpmh, every reported figure between 5 and 20 fpmh would be a very good reliability prediction, and every figure between 2 and 50 fpmh would still be considered acceptable.
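
The bands in this example amount to a factor of 2 (5 to 20 fpmh) and a factor of 5 (2 to 50 fpmh) around the real failure rate of 10 fpmh. A minimal Python sketch of this rule of thumb follows. The function name and the label for figures outside both bands are assumptions made for illustration; the text above only names the "very good" and "acceptable" bands.

```python
def prediction_quality(predicted_fpmh: float, actual_fpmh: float) -> str:
    """Rate a predicted failure rate by the factor separating it from the real one."""
    if predicted_fpmh <= 0 or actual_fpmh <= 0:
        raise ValueError("failure rates must be positive")
    # Symmetric factor: 5 vs. 10 fpmh and 20 vs. 10 fpmh both give a factor of 2.
    factor = max(predicted_fpmh, actual_fpmh) / min(predicted_fpmh, actual_fpmh)
    if factor <= 2:
        return "very good"
    if factor <= 5:
        return "acceptable"
    return "outside the acceptable band"  # label assumed for this sketch


if __name__ == "__main__":
    for predicted in (5, 20, 2, 50, 60):
        print(f"{predicted} fpmh vs. 10 fpmh: {prediction_quality(predicted, 10)}")
```

The comparison uses a ratio rather than a difference, because the bands in the example are symmetric on a multiplicative scale, not on an additive one.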

It is a typical situation in reliability workshops that the audience expects unambiguous statements from the trainer on how to perform a reliability analysis, and on which way is right and which is wrong.
Needless to say, these participants would take a big step forward as soon as they understood the general uncertainty pervading the reliability universe.
Managing technical uncertainties, and in particular dealing with them professionally, is a strong requirement for the reliability engineer.
