General Introduction to Reliability
This is a high-level description of basic aspects of reliability.
This introduction does not require any prior knowledge of or experience with reliability.
It is sufficient to understand the following simplistic definition of reliability:
Reliability = The probability that
an item performs a required function without failure.
Please note that, due to its simplicity, this definition is incomplete.
However, it still suggests "reliable = failure free", and that is precise enough to understand this introduction.
1. Reliability Standards
There are quite a lot of standards addressing different aspects of reliability. The first institution with an approach to reliability was the US Department of Defense (DOD) in the early 1950s.
Since then, the DOD and related institutions (RAC, RiAC, ...) have issued hundreds of documents dealing with various aspects of reliability at different levels of detail. These DOD-issued documents can be divided into
- stringent and
- Handbooks (Mil-HDBK-XXXX)
- Standards (Mil-STD-YYYY)
- Performance Specifications
XXXX, YYYY and ZZZZ are 3 to 5 digit numbers. The following table gives an impression of the vast scope and the various levels of detail of these documents. Please note that this list is only a tiny sample of what are actually many hundreds of documents:
- Reliability prediction of electronic equipment. This is actually the No. 1 MTBF standard.
- Designing and developing maintainable products and systems
- Procedures for performing a failure mode, effects and criticality analysis. This is actually the No. 1 FMEA standard.
- Reliability Growth Management. Procedures for improving (product) reliability
- DOD requirements for a Logistics Support Analysis record. In essence: Data exchange format for product reliability related data.
- Sampling procedures and tables for inspection by attributes. In essence: Statistical methods to determine failure rates based on attributive information (good / bad, does fit / does not fit, ...)
- General specification for hybrid
- Human Engineering Design Criteria for Systems, Equipment
- Electronic reliability design. A comprehensive guideline for electronic, mechanic and quality
As you can see, these documents cover many aspects of reliability, with emphasis on electronic
During the last few decades, many documents have been made obsolete by the DOD, but they still serve as guidelines, references and look-up material for the civil industry. Many obsolete documents are available on the internet at no charge.
The reasons for the obsolescence are:
- decreasing military budget,
- better awareness of reliability in civil industry
Today, military industry relies almost entirely on "civil" reliability techniques.
Functional safety, for example, is by now a widely established toolset for managing safety in civil industries. Simply put, safety = reliability + further
2. Reliability Awareness
Apart from the military industries, today's civil industries with the highest level of reliability awareness include, but are not limited to:
- (nuclear) power
- others like elevator
The strongest drivers of awareness are requirements coming from governments and other authorities.
Most people may associate the above list with safety rather than reliability. The short answer is: Safety comprises reliability. A more comprehensive description can be found on the page Reliability vs.
Basically, a safe system may be unreliable with respect to the functions not directly related to safety. On the other hand, and for the same reason, a reliable system may be unsafe.
Significant indicators of reliability awareness are:
- company quality policy contains
- methods are established, understood and carried out by personnel,
- industry-specific standards
- written and binding reliability
- warranty database with corrective action process
- reliability engineers exist and have influence on R&D
- reliability tests are carried out and have influence on R&D
3. Reliability Management
- It's no big deal to explain reliability methods and metrics to engineers.
- The author has given reliability-related trainings many times in various industries ... it's really not a big deal.
- Furthermore, almost everybody would agree that producing reliable products is a success factor.
- And finally, almost nobody would deny that established reliability methods and techniques are a success factor for a company.
However, the reality is that proactive reliability management is quite rare. Only a few big companies have it, while the vast majority of companies address reliability in a kind of hindsight manner ("we just completed the project milestone, now let's engage a subcontractor to make an FMECA").
A further but essential
characteristic of reliability appears when we look at numbers.
4. Reliability: Uncertainty
Reliability analysis results are typically highly imprecise and contain quite a lot of uncertainty: not only statistical uncertainty (which could be quantified), but also uncertainty regarding assumptions and conclusions, which very often can hardly be quantified.
Put provocatively, the accuracy of reliability analysis results can be compared with the accuracy of weather forecasts, nuclear physics, stock price prediction, etc. Even more provocative: Reliability analysis results are unreliable by default.
Here are some reasons for this:
Typically there is not enough information and experience available, so the reliability analyst has to rely on expert guesses, plausibility, and common sense. What makes this "worse" is the fact that customers expect reliability to be determined in advance, even if the product incorporates new technologies for which really no experience exists.
On the statistical side, since confidence intervals are perceived as "high math" by the majority of engineers, even reliability analysts themselves may not be aware of the statistical uncertainty of their results.
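To give a feeling for the size of this statistical uncertainty, the following minimal sketch computes a one-sided upper confidence bound on a failure rate for a test in which no failure was observed. It assumes a constant failure rate (exponential model), which this page does not state explicitly; the function name and the test figure of 1 million device hours are hypothetical and serve for illustration only.

```python
import math


def zero_failure_upper_bound_fpmh(device_hours: float, confidence: float) -> float:
    """Upper confidence bound on a constant failure rate, in failures per
    million hours (fpmh), for a test with zero observed failures.

    Assumption: exponential model, so P(no failure in T hours) = exp(-rate * T).
    The bound is the rate at which that probability drops to 1 - confidence.
    """
    if device_hours <= 0:
        raise ValueError("device_hours must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    rate_per_hour = -math.log(1.0 - confidence) / device_hours
    return rate_per_hour * 1e6


# Hypothetical test: 1 million accumulated device hours, no failure observed.
for confidence in (0.60, 0.90, 0.95):
    bound = zero_failure_upper_bound_fpmh(1_000_000, confidence)
    print(f"{confidence:.0%} confidence: failure rate <= {bound:.2f} fpmh")
```

Under these assumptions, the same test data support quite different upper bounds depending on the chosen confidence level, which is the kind of spread that a single reported number hides.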
As an example, Mil-HDBK-217 results appear to be "exact", because they are reported with six or more digits, for example 1.05229 failures per million hours (fpmh). But the truth is that even the first digit (here 1) is highly speculative in most cases.
This is not only true for Mil-HDBK-217, but also for every other MTBF calculation standard.
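The effect on a reported MTBF can be sketched in a few lines. The conversion below assumes a constant failure rate, so that MTBF is simply the reciprocal of the failure rate; the helper name and the range of 0.5 to 2 fpmh used to represent a "speculative first digit" are illustrative assumptions, not values taken from any standard.

```python
def mtbf_hours(failure_rate_fpmh: float) -> float:
    """Convert a failure rate in failures per million hours (fpmh) to MTBF in hours.

    Assumption: constant failure rate, so MTBF = 1 / failure rate.
    """
    if failure_rate_fpmh <= 0:
        raise ValueError("failure rate must be positive")
    return 1_000_000 / failure_rate_fpmh


reported = 1.05229  # the "exact looking" figure from the text, in fpmh
print(f"reported: {reported} fpmh -> MTBF {mtbf_hours(reported):,.0f} h")

# Hypothetical spread if even the first digit is uncertain.
for rate in (0.5, 1.0, 2.0):
    print(f"if real rate is {rate} fpmh -> MTBF {mtbf_hours(rate):,.0f} h")
```

The trailing digits of the reported figure shift the MTBF by a few percent, while the assumed uncertainty in the first digit shifts it by a factor of two in either direction.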
Telcordia SR 332 is the only MTBF calculation standard offering standard deviations for failure rates. This makes Telcordia SR 332 "honest", because the standard deviations reveal the level of uncertainty for every failure rate. But this covers only the uncertainty in the data that were used to develop the Telcordia SR 332 standard in the past. Of course, it does NOT cover the uncertainty of predicted MTBF vs. real MTBF for new equipment.
Just to make the level of uncertainty usually encountered in reliability analysis really clear:
If the real (yet unknown) failure rate was 10 fpmh, every reported figure between 5 and 20 fpmh would be a very good reliability prediction, and every figure between 2 and 50 fpmh would still be considered acceptable.
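The example above amounts to judging a prediction by its ratio to the real failure rate: within a factor of 2 is very good, within a factor of 5 is still acceptable. A minimal sketch of that rule of thumb follows; the function name and the label "poor" for anything outside a factor of 5 are assumptions added for illustration.

```python
def rate_prediction(predicted_fpmh: float, real_fpmh: float) -> str:
    """Classify a predicted failure rate by its ratio to the real failure rate.

    Thresholds follow the example in the text (factor 2 and factor 5);
    the label "poor" is an assumption for everything beyond factor 5.
    """
    if predicted_fpmh <= 0 or real_fpmh <= 0:
        raise ValueError("failure rates must be positive")
    # Use the ratio in whichever direction is >= 1, so over- and
    # under-prediction are treated symmetrically.
    ratio = max(predicted_fpmh / real_fpmh, real_fpmh / predicted_fpmh)
    if ratio <= 2:
        return "very good"
    if ratio <= 5:
        return "acceptable"
    return "poor"


real = 10.0  # real (in practice unknown) failure rate in fpmh
for predicted in (5.0, 12.0, 20.0, 2.0, 50.0, 80.0):
    print(f"predicted {predicted:>5} fpmh: {rate_prediction(predicted, real)}")
```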
It is a typical situation in reliability workshops that the audience expects unambiguous statements from the trainer on how to perform a reliability analysis, and on which way is right and which is wrong.
Needless to say, these people would take a big step forward as soon as they understand the general uncertainty prevailing the
Managing technical uncertainties, and in particular dealing with these uncertainties professionally, is a strong requirement for the