|
Little r and Big R |
Reliability
is in the vocabulary of almost everyone.
Ask an ordinary person to define reliability. You’ll get a wide variety of non-responses
ending in “I know what it is, I just
can’t define it right now”.
The
Encarta® Dictionary from Microsoft defines reliability as:
1) Dependable: able to be trusted to do what is expected or has
been promised, and
2) Likely to
be accurate: able to be trusted to
be accurate or correct or to provide a correct result.
Here is my preferred definition of reliability, which is particularly appropriate for the
chemical and petroleum industries:
|
Reliability is the probability that a device,
system, or process will perform its prescribed
duty without failure for a given time when
operated correctly in a specified environment. |
Notice the emphasis on the word process, which is king in most production
facilities and operations, and on the key words without failure. The sweet side of the coin is
reliability. The sour side of the coin is a failure that terminates reliability.
Happiness is operating without failures. You don’t
get reliability by wishing, hoping, or waiting for miracles. You cannot repair your way to
happiness. You achieve reliability by:
1) Planning for reliability,
2) Controlling for reliability, and
3) Improving reliability.
No miracles in this effort, just hard work using the correct tools for reliability and
using the tools correctly. An example of correct tool use is a chain saw.
A chain saw is a wonderful device for cutting wood, but only if you use
it with the engine running; otherwise it’s a pain!
Dr. Joe Juran, a modern founder of the quality movement,
died on Feb 28, 2008 at age 103 after working up to the day before his
death. Juran promoted planning and prevention
as a complement to control. Juran asked us
to think with a large scope about quality.
Juran emphasized the difference between
“little q” and “big Q” for quality (and parallels exist
for reliability issues).
1) Little q has to do with problems of production and the
tactical tools that lead to control and improvement of quality.
2) Big Q relates to quality management issues, which
are more comprehensive and system wide and pertain to strategic issues.
Of course, little q and big Q are
complementary. Every engineer should own and study a copy of Juran’s Quality Handbook, 5th edition.
On page 2.4, consider Juran’s Table 2.1, where Big Q and
Little q for quality are contrasted:
Table 2.1: Contrast, Big Q and Little q (where Q and q refer to quality)

| Topic | Content of little q | Content of big Q |
|---|---|---|
| Products | Manufactured goods | All products, goods, and services, whether for sale or not |
| Processes | Processes directly related to manufacture of goods | All processes: manufacturing support, business, etc. |
| Industries | Manufacturing | All industries, manufacturing, service, government, etc., whether for profit or not |
| Quality viewed as: | A technological problem | A business problem |
| Customer | Clients who buy the products | All who are affected, external and internal |
| How to think about quality | Based on culture of functional departments | Based on the universal trilogy [Juran’s trilogy involves planning, control, and improvement] |
| Quality goals are included: | Among factory goals | In company business plan |
| Cost of poor quality | Costs associated with deficient manufactured goods | All costs that would disappear if everything were perfect |
| Evaluation of quality is based mainly on: | Conformance to factory specifications, procedures, standards | Responsiveness to customer needs |
| Improvement is directed at: | Departmental performance | Company performance |
| Training in managing for quality is: | Concentrated in the quality department | Company wide |
| Coordination is by: | The quality managers | A quality council of upper managers |

Source: Juran’s Quality Handbook, 5th edition, ISBN 0-07-034003-X
Parallels
exist between quality and reliability.
Quality is static.
Reliability is dynamic.
Reliability’s
dynamic of time makes it a more difficult subject to study, calculate, and
explain.
1) Little r has to do with problems of production and the
tactical tools that lead to control and improvement of reliability.
2) Big R relates to reliability management issues, which are more comprehensive and
system wide and pertain to strategic issues.
Nevertheless, parallels
exist between Little q and Big Q on one side and
Little r and Big R on the other, as described in the following table.
Barringer’s Little r and Big R Contrast (Where r, R=Reliability)

| Topic | Content of Little r | Content of Big R |
|---|---|---|
| Little r and Big R means | Little r refers to a narrow view of reliability involving lower level events/actions associated with things | Big R refers to a very broad view of reliability involving higher level and broader concepts |
| Distinction | Little r relates to product problems and tactical tools leading to reliability control and improvements | Big R is concerned with comprehensive and system wide concepts which are strategic in nature for reliability issues |
| Products | Generally considered for manufactured goods but also applies to the production process | All products, goods, and services, whether for sale or not |
| Processes | Processes directly related to manufacture of goods | All processes for manufacturing support: business, etc. |
| Industries | Usually considered a manufacturing issue but includes design, construction, installation, etc. | All industries, manufacturing, service, government, banking, etc., whether for profit or not |
| Reliability is viewed as | A technological problem involved with failures | A business problem with cost and services |
| Customer is viewed as | Clients who buy or receive the products | All who are affected including external and internal customers |
| How to think about reliability | Based on culture of functional departments | Based on the universal trilogy for reliability [planning, control, improvement] |
| Reliability goals are included | Among factory goals and engineering specifications | Stated in company business plans and advertisements |
| Cost of poor reliability | Costs associated with deficient manufactured goods or processes | All costs that would disappear if everything were perfect as summarized by the cost of unreliability (COUR) |
| Evaluation of reliability is based mainly on | Conformance to factory specifications, procedures, and standards | Responsiveness to customer needs, expectations, and advertised statements |
| Improvement is directed at: | Departmental performance for reducing failures/costs | Company performance |
| Training in managing for reliability is | Concentrated in the reliability department | Company wide driven by a policy statement for reliability |
| Coordination is by | The reliability manager | A reliability council of upper managers |
| Program thrust | Tactical: Many small issues drive procedures and numerous rules | Strategic: Few large issues based on a reliability policy of intent for the organization |
| Audit thrust | Numerous checklists and tools employed for control and improvement of reliability | Asks whether management is performing against the reliability objectives and whether the program is successful both financially and with customers, as opposed to a procedural audit |

Inspired by: Juran’s Quality Handbook, 5th edition
How do we approach the
program? Consider Juran’s universal trilogy of planning, control, and improvement, which
applies to Accounting, Banking, Engineering, Manufacturing, Quality,
Reliability, etc.
Details for each element are
provided below for reliability issues.
Planning-
|
Reliability planning is a structured process for
developing products and processes that ensure customers’ needs and process
needs are met by a final result which is devoid of failures. |
Planning is required because
we have gaps between what the customer/owner needs/wants for reliability and what
the product/process delivers for reliability.
The plan must also consider the price that fits the lowest long term
cost of ownership. Cheapest first cost
is often a misleading criterion, as sustaining cost is usually 2-20 times greater
than the acquisition cost. Many
sustaining costs are incurred as unintended consequences of costly events never
detailed upfront as part of the plan because owners/designers never play all
the cards face-up for clarity and understanding.
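The first-cost trap can be illustrated with a little arithmetic. The sketch below is only an illustration: the two alternatives, their prices, and their sustaining-cost multipliers are hypothetical numbers chosen from within the 2-20 times range quoted above, and the function name is made up for this example.

```python
def long_term_cost_of_ownership(acquisition_cost, sustaining_multiplier):
    """Total ownership cost when sustaining cost is expressed as a
    multiple of the acquisition cost (an assumed simplification)."""
    if acquisition_cost < 0 or sustaining_multiplier < 0:
        raise ValueError("cost and multiplier must be non-negative")
    return acquisition_cost * (1.0 + sustaining_multiplier)


# Hypothetical alternatives: (name, acquisition cost, sustaining multiplier).
alternatives = [
    ("cheap_first_cost", 100_000.0, 12.0),
    ("higher_first_cost", 150_000.0, 4.0),
]

totals = {
    name: long_term_cost_of_ownership(cost, multiplier)
    for name, cost, multiplier in alternatives
}

# Pick the alternative with the lowest long term cost of ownership.
best = min(totals, key=totals.get)

for name, total in totals.items():
    print(f"{name}: {total:,.0f}")
print("lowest long term cost:", best)
```

With these assumed numbers the alternative with the higher purchase price is the cheaper one to own, which is the point of planning for the lowest long term cost rather than the cheapest first cost.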
The end-user/owner of the
product/process must perceive the product/process gives them the beneficial
results they expect: The value must be
perceived in the eye of the end-user/owner and not in the eye of the
designer/manufacturer.
Key metrics must be spelled
out clearly. A good guide for planning
is SAE’s Reliability and Maintainability Guideline for Manufacturing Machinery and Equipment,
second edition, publication M-110.2,
ISBN 0-7680-0473-X, 1999. For example,
spell out requirements upfront for:
Availability (probability the system can perform when called upon),
Reliability (probability of failure free operation in the mission interval),
Maintainability (probability of being repaired in the allowed interval),
Failure definitions (define critical, non-critical, and benign failures),
Environmental usage (used in what environments and conditions),
Lessons learned from previous projects (avoid past errors), and
numerous other details and figures of merit.
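The first three metrics in the list above can be put into numbers with a short sketch. It assumes constant failure and repair rates (exponential distributions) and uses the steady-state ratio MTBF/(MTBF + MTTR) for availability; these are common textbook simplifications, not formulas taken from this page, and the MTBF, MTTR, and interval values are hypothetical.

```python
import math


def reliability(mission_time, mtbf):
    """Probability of failure free operation for the mission interval,
    assuming a constant failure rate of 1/MTBF."""
    return math.exp(-mission_time / mtbf)


def maintainability(allowed_time, mttr):
    """Probability a repair is completed within the allowed interval,
    assuming a constant repair rate of 1/MTTR."""
    return 1.0 - math.exp(-allowed_time / mttr)


def availability(mtbf, mttr):
    """Steady-state probability the system can perform when called upon."""
    return mtbf / (mtbf + mttr)


# Hypothetical requirement values, in hours.
mtbf_hours = 8760.0   # mean time between failures
mttr_hours = 24.0     # mean time to repair

r = reliability(mission_time=8760.0, mtbf=mtbf_hours)
m = maintainability(allowed_time=48.0, mttr=mttr_hours)
a = availability(mtbf_hours, mttr_hours)

print(f"Reliability for a one-year mission: {r:.3f}")
print(f"Maintainability within 48 hours:    {m:.3f}")
print(f"Availability:                       {a:.4f}")
```

Note how a system can show high availability while its one-year reliability is modest; that is why each metric needs its own stated requirement.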
Reliability goals often
shift because of market conditions, what competitors are doing, new technology,
sales prices, etc. This requires knowing
the benchmark and demonstrating the flexibility to meet shifting targets
for new goals. The measurement systems
have these universal characteristics:
Specific,
Measurable,
Agreed upon by the teams,
Realistic but feasible, and
Time specific for meeting project goals.
Experience shows the three major
categories for destroying the inherent reliability of a system are (going from
greatest to least, with comparisons for good/bad plants):

| Category | Good Plants | Bad Plants |
|---|---|---|
| People | ~38% | up to ~80% |
| Processes/Procedures | ~34% | up to ~70% |
| Equipment | ~28% | up to ~40% |
Planning for error proofing is
important for these critical factors which submarine the reliability of
systems. Critical factors are details
which represent danger to human life, health, or the environment, or risks of losing
people, money, reputation, etc., and they are often defined with a risk matrix. Too often engineers think the major
reliability problems are with the equipment; however, experience says humans are
often the weakest link because of technique errors, errors made worse by lack
of timely feedback (think of the Three Mile
Island and Chernobyl
nuclear reactor catastrophes), and errors because humans cannot sustain an
indefinite high state of emergency attention.
Here are some typical methods of error proofing:
Eliminate the error prone operations,
Replace the human with nonhuman operations,
Facilitate by assisting the human operator with simple tools or training,
Detect the error at the earliest stage, such as with automation, and
Mitigate by reducing serious damage through physical means.
Also the book Hostages
To Each Other: The
Transformation Of Nuclear Safety Since Three Mile Island shows how
the nuclear industry has attacked the human error problems in
Audit the planning efforts
to validate that the plans are effective and implemented. Where deficiencies are discovered, shore up
the weak areas and where practical, implement new goals and bring the correct
skill sets into play to prevent future weaknesses.
Control-
|
Reliability control is a universal managerial
process for conducting the operation so as to provide stability while
preventing adverse changes, and maintain the status quo for failure free
products and processes. |
Control actions are required
to achieve stable reliability at an expensive level or stable reliability at a
less expensive level driven by the cost
of unreliability. The Juran trilogy diagram, modified for reliability, is shown in Figure 1.
|
|
|
Figure 1 The
Juran Trilogy Diagram Modified For Reliability |
We plan for reliability
improvements. We implement improvements
to move from the high cost of unreliability to a lower zone of unreliability. We accomplish better control of unreliability
costs by implementing lessons learned and by use of failure data, failure
modes, and the tools of reliability. We
enable the workforce to make numerous improvements by empowering the workforce to
take individual actions for reliability improvements to reduce the high costs
of unreliability as explained in “Reliability Programs: Successful or Failures?”.
Reducing the cost of
unreliability is better achieved by designing for reliability from the
beginning of the project. Unfortunately,
we have too many systems and operating plants that were designed without
thought for reliability, which results in the need to make the
reliability improvements described in Figure 1.
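To make the cost of unreliability concrete, here is a minimal sketch of one way to tally it per year. The breakdown into lost gross margin plus repair cost is an assumption made for illustration, not a formula stated on this page, and the equipment names, failure rates, downtime, and dollar figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureRecord:
    """Hypothetical yearly failure summary for one piece of equipment."""
    equipment: str
    failures_per_year: float
    downtime_hours_per_failure: float
    repair_cost_per_failure: float


def annual_cost_of_unreliability(records, gross_margin_per_hour):
    """Return {equipment: yearly cost}, where the cost is the lost gross
    margin during downtime plus the repair cost (assumed breakdown)."""
    costs = {}
    for rec in records:
        lost_margin = (rec.failures_per_year
                       * rec.downtime_hours_per_failure
                       * gross_margin_per_hour)
        repair = rec.failures_per_year * rec.repair_cost_per_failure
        costs[rec.equipment] = lost_margin + repair
    return costs


records = [
    FailureRecord("pump_A", 4.0, 8.0, 5_000.0),
    FailureRecord("compressor_B", 0.5, 72.0, 60_000.0),
]

cour = annual_cost_of_unreliability(records, gross_margin_per_hour=2_000.0)
total_cour = sum(cour.values())

# List the biggest money problems first so improvement effort goes there.
for name, cost in sorted(cour.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {cost:,.0f} per year")
print(f"total: {total_cour:,.0f} per year")
```

Sorting by cost rather than by failure count is a deliberate choice in this sketch: the item that fails most often is not necessarily the one that costs the most.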
The least cost, highest
return component is to improve our people.
By the actions
of our people we avoid the inadvertent high cost of the politically incorrect
term of MTBSE. Every engineer and every manager knows MTBSE
but does not use the politically incorrect term. Improving our people to let them control the
processes is vitally important for controlling and reducing errors which crash
our processes and plants.
The second most fertile area
for improving control is by our processes and procedures which our people must
follow rigorously to achieve control.
These are usually low cost improvements to aid our most fragile component—our
people.
Of course most engineers do
not have interest in the people component or the process/procedure component
and they zero-in on the equipment. So we
cross wire the improvement effort from the start. Concentrating primarily on the equipment
usually guarantees high capital costs and delays in implementation, whereas
concentrating on the people and processes/procedures is both more cost conscious
and more effective for achieving control which eliminates failures.
We control to move to new
levels of performance by improvements.
Improvement-
|
Reliability improvement means creating an
organization dedicated to beneficial change for the purpose of achieving
unprecedented levels of product and process performance often with emphasis
on achieving a breakthrough to be devoid of failures to achieve the lowest
long term cost of ownership. |
Beneficial change is a
control feature applicable to two major reliability improvements by:
Product/process
improvements to better satisfy customers/owners
Freedom from
deficiencies which generate failures and cost money
These techniques result in
better reliability, improve satisfaction from customers/owners, and reduce
costs from restoring the product/process to operable condition such as occurs
with elimination of the early infant mortality issues.
Beneficial change requires
discovery of the causes for failure.
Then remedies must be applied to prevent the failures and the costly
waste of failures. In short: make
improvements by better control.
Lessons learned libraries
are important sources of details that must be controlled to prevent
failures—see the NASA
lessons learned library for an example.
Use of lessons learned libraries empowers and enables the work force to
make improvements; see how the British
Navy ruled the waves by use of:
Empowered teams -- management authorizes individual initiative and experience to be used
continuously in an effective and timely manner; in short, individuals in the
organization have authority to take corrective action, and
Enabled teams -- employees are trained and
drilled for proficiency using best practices that are continuously improved by
feedback from the working teams for making improvements; in short, management
must turn on the organization for improvement action rather than disabling and
denying positive action by individuals.
The rate of improvement must
be measured and demonstrated with Crow-AMSAA reliability
growth plots to ensure improvements are demonstrated, and demonstrated
quickly, to achieve competitive advantages in the market. These are show me,
don’t tell me plots of our progress that summarize the results of our
improvements. Many examples of
reliability growth plots are shown in the technical paper “Predict Future Failures From Your Maintenance Records”.
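A Crow-AMSAA plot puts cumulative failures against cumulative time on log-log scales, where the model N(t) = lambda * t^beta appears as a straight line with slope beta. The sketch below fits that line by ordinary least squares on the logarithms; this is a simple regression approach assumed for illustration (maximum likelihood estimates are another common choice), and the failure times shown are hypothetical.

```python
import math


def fit_crow_amsaa(cumulative_times):
    """Fit N(t) = lam * t**beta by least squares on log-log data.

    cumulative_times: strictly increasing cumulative time at the
    1st, 2nd, 3rd, ... failure. Returns (lam, beta).
    """
    if len(cumulative_times) < 2:
        raise ValueError("need at least two failures to fit a line")
    if cumulative_times[0] <= 0 or any(
        later <= earlier
        for earlier, later in zip(cumulative_times, cumulative_times[1:])
    ):
        raise ValueError("times must be positive and strictly increasing")

    xs = [math.log(t) for t in cumulative_times]
    ys = [math.log(n) for n in range(1, len(cumulative_times) + 1)]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    beta = sxy / sxx                          # slope of the log-log line
    lam = math.exp(mean_y - beta * mean_x)    # intercept at t = 1
    return lam, beta


# Hypothetical cumulative operating hours at each successive failure.
failure_times = [50.0, 180.0, 420.0, 800.0, 1400.0]
lam_hat, beta_hat = fit_crow_amsaa(failure_times)

trend = "improving" if beta_hat < 1.0 else "not improving"
print(f"lambda = {lam_hat:.4f}, beta = {beta_hat:.3f} ({trend})")
```

A slope beta below 1 means failures are arriving more slowly as time accumulates, which is the show me evidence of improvement; a slope above 1 means reliability is deteriorating.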
Reliability Policy-
The motivation for a reliability policy
statement is to reduce the high cost of unreliability. This simple statement of intent for the
organization is rarely made by management.
It provides the guiding light for galvanizing the organization toward
improvement actions.
The reliability policy
(usually made in two short paragraphs) can be reduced to one simple concept as
shown in this example:
|
We will build an
economical and failure-free process that will operate for 5 years
between planned turnarounds. |
Management fails to make a
reliability policy statement and then wonders why they incur the high cost of
unreliability! Say what you want and
want what you say so the organization knows what to do. Start with big-R, implement with little-r and
reduce the cost of unreliability.
Revised March 9, 2010
© Barringer & Associates, Inc., 2008