Instructions for using the Frestimate System Reliability Model Module.

1.       The Frestimate System Reliability Model Module is used with either the Software Reliability Toolkit or the Frestimate software.  You must have purchased one of these products before using this module.

2.       Using one of the above tools, create a prediction for each software LRU in the system.  Make sure to include all in-house developed software, Commercial Off the Shelf (COTS) software, Government Furnished Software (GFS), Free Open Source Software (FOSS), and firmware.  Ideally the software should be designed so that each software LRU (typically called a Computer Software Configuration Item) is associated cohesively with a particular target hardware.  For example, if the system is an automobile, the software components might include the transmission software, GPS software, camera software, security software, convertible top control software, entertainment software, temperature control software, etc.  Unless the software in the system is very small, there should be more than one software LRU.  The LRU for software is the Dynamic Link Library (DLL) file, the Executable (EXE) file, or the application file.

3.       Once the predictions are defined for each software LRU, launch the System Reliability Model Module. 

4.       Select the File->Open menu option.  Select the Frestimate or Software Reliability Toolkit file that you just created.  It will usually be in the c:\SWRT folder.

5.       View the Failure Mode Tab.  There is a checkbox and a pulldown menu.  All other columns are read only.

 

         Filter results for critical failures only checkbox

The Frestimate software generates two sets of reliability figures of merit.  The first set includes failures of any criticality, and the second set includes only the failures predicted to affect availability.  The “Filter results for critical failures only” checkbox selects which set is used when the reliability figures are output.

Checked

Uses the predicted number of failures that are expected to affect availability.  On average these are about 2-8% of all software failures.

Unchecked

Uses failures of all severities that are serious enough to be noticeable.

 

Model Type

The “Model type” field is a pulldown menu.  Select either “Failure Rate” or “MTBF”.  Both results are exported, but only the selected model type is displayed on the Reliability Blocks when the predictions are imported into the Isograph Reliability Workbench™ software.
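The two model types express the same prediction in reciprocal form.  As a minimal sketch, assuming a constant failure rate over the period of interest (an assumption made here for illustration, not a statement about how Frestimate computes its figures), the conversion could look like this.  The function names are hypothetical and are not part of any Frestimate or Isograph interface.

```python
def mtbf_from_failure_rate(failure_rate: float) -> float:
    """Return MTBF for a constant failure rate (failures per unit time).

    Assumption: the failure rate is constant, so MTBF = 1 / failure rate.
    """
    if failure_rate <= 0:
        raise ValueError("failure rate must be positive")
    return 1.0 / failure_rate


def failure_rate_from_mtbf(mtbf: float) -> float:
    """Return the constant failure rate that corresponds to a given MTBF."""
    if mtbf <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 / mtbf


# Hypothetical example value: 0.001 failures per unit time.
example_mtbf = mtbf_from_failure_rate(0.001)
```

Whichever form is displayed, the other can be recovered from it under the same constant-rate assumption.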

 

 

The other information on this page is read only and is retrieved from the Frestimate prediction file.  Review each column prior to exporting.  Any changes to the inputs below must be made in the Frestimate Standard or Manager’s edition or in the Software Reliability Toolkit.

Column Header

Description

Software component name

The name of the software LRU that you defined in the In-house and COTS worksheets of the Software Reliability Toolkit or Frestimate software.

Type of software component

This is either Application (developed by your organization) or COTS (commercially developed by another organization).  In-house software components may exhibit different failure modes than COTS components.  For example, COTS components are generally more likely to have interface or update problems.

Model type

Select either “Failure Rate” or “MTBF”.  Both results are exported.  This field determines which results are shown on the Reliability Blocks.

Month of Interest

You defined this in the Frestimate or Software Reliability Toolkit software.  It is the number of months of operational growth that will transpire before the milestone of interest.  This is rarely more than 12 months because most software systems receive new feature releases at least yearly.

Unavailability, failure rate and MTBF

These are all computed at the “Month of Interest”. If you wish to change the month of interest, you will need to do so in the Software Reliability Toolkit or Frestimate software.

Restore time

There is no “MTTR” for software because software does not wear out.  Instead, there is a Mean Time To Software Restore, which is a weighted average of restart time, reboot time, and workaround time.  Some operational failures can only be fixed by changing the software product code, so the administrator time needed to obtain a new software release or to downgrade to a previous version is also considered.

Unavailability, Failure rate and MTBF for the first 10 months of operational usage.

Isograph Reliability Workbench allows for 10 different milestones for these predictions.  We associate the 10 milestones with the first 10 months of operational growth that is predicted to occur.
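The restore time entry above describes a weighted average.  The sketch below shows one way such an average could be formed, and how it could be combined with an MTBF into a steady-state unavailability of restore time / (MTBF + restore time).  The activity weights, durations, and the unavailability formula are illustrative assumptions, not values or formulas taken from the Frestimate software.

```python
def mean_time_to_software_restore(activities):
    """Weighted average restore time.

    activities: list of (weight, restore_time) pairs, where weight is the
    assumed share of failures resolved by that activity.
    """
    total_weight = sum(weight for weight, _ in activities)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weight * time for weight, time in activities) / total_weight


def steady_state_unavailability(mtbf, restore_time):
    """Assumed textbook relation: downtime fraction over one failure cycle."""
    return restore_time / (mtbf + restore_time)


# Hypothetical shares and durations (hours), for illustration only.
example_activities = [
    (0.60, 0.05),  # restart the software
    (0.25, 0.25),  # reboot
    (0.10, 1.00),  # apply a workaround
    (0.05, 8.00),  # install a new release or downgrade
]
example_restore_time = mean_time_to_software_restore(example_activities)
example_unavailability = steady_state_unavailability(1000.0, example_restore_time)
```

The rare but slow activities (a new release or a downgrade) dominate the average in this example, which is why they are included in the restore time even though they apply to few failures.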

6.       Select the Fault Tree Tab. 

This tab shows 28 different failure mode/root cause pairs associated with software.  Edit the columns for each software LRU so that the failure modes most likely for that LRU are given a higher weighting than those that are less likely.  Some tips for determining which failure modes are more likely than others can be found in “Effective Application of Software Failure Modes Effects Analysis” as well as in IEEE 1633 Recommended Practices for Software Reliability.

In summary, each software trouble report from prior systems or releases can be analyzed for the root causes shown below.  The likelihood of each root cause can then be calculated as the number of reports with that root cause divided by the total number of reports.  This method assumes that future failure modes and their likelihood will be similar to what was experienced in the past.  Another option is to brainstorm each of the root causes below with personnel who have experience testing or supporting the software once deployed.  Depending on the type and maturity of the software, several of the root causes may not be relevant.  NOTE: the root causes and their relative frequencies can and do vary from one software Line Replaceable Unit (LRU) to another.  For example, the user interface component of a system may have different failure modes than a firmware component or a database component.
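The trouble report calculation described above can be sketched as follows.  This is only an illustration: the function name and the sample reports are hypothetical, and each report is assumed to have already been classified with exactly one root cause.

```python
from collections import Counter


def root_cause_likelihoods(report_root_causes):
    """Return {root cause: share of reports} for a list of classified reports.

    Each element of report_root_causes is the root cause assigned to one
    trouble report from a prior system or release.
    """
    if not report_root_causes:
        raise ValueError("at least one trouble report is required")
    counts = Counter(report_root_causes)
    total = len(report_root_causes)
    return {cause: count / total for cause, count in counts.items()}


# Hypothetical classification of five past trouble reports.
example_reports = [
    "executed too late",
    "can't handle corrupt data",
    "executed too late",
    "updated improperly",
    "can't handle corrupt data",
]
example_likelihoods = root_cause_likelihoods(example_reports)
```

Root causes that never appear in the reports simply get no entry, which corresponds to a relative weighting of 0 for that LRU.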

Below is a listing of the 28 failure mode/root cause pairs.  The fault tree displays these events and assigns each event a failure rate that is the product of its relative weighting, which you enter on this tab, and the failure rate prediction for that software LRU.  For example, if the software LRU is predicted to have a failure rate of .001 and you assign equal relative portions to all of the failure mode/root cause pairs, then each event will have a resulting failure rate of .001/28 (approximately .0000357).
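The allocation in the example above can be written out in a few lines.  This is a sketch of the arithmetic only; the function name and the placeholder pair names are hypothetical, and the weights are assumed to already sum to 1.

```python
def event_failure_rates(lru_failure_rate, weights):
    """Allocate an LRU failure rate across failure mode/root cause pairs.

    weights: {pair name: relative portion}, assumed to sum to 1.
    Each event's failure rate is its relative portion times the LRU rate.
    """
    return {name: lru_failure_rate * portion for name, portion in weights.items()}


# Equal weighting across 28 placeholder pairs, as in the example in the text.
equal_weights = {f"pair_{i}": 1 / 28 for i in range(1, 29)}
example_rates = event_failure_rates(0.001, equal_weights)
```

With equal weights every event receives .001/28, and the event rates add back up to the LRU failure rate.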

a) Edit the cells under each failure mode/root cause heading to assign more or less weighting to each of the failure mode/root cause pairs listed below.  For more information about these pairs, see “Effective Application of Software Failure Modes Effects Analysis” at http://softrel.com/SoftwareReliabilityPublications.html.

b) The “compute” button ensures that the relative portions for the 28 failure mode/root cause pairs sum to 1.  You can assign a relative portion of 0 to a failure mode/root cause column if you have no past or current evidence that the failure mode is likely.  For example, if the software is installed exclusively in a factory or by a qualified service technician, the likelihood of a serviceability failure mode is relatively small.
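The exact behavior of the “compute” button is not described here.  One plausible way to make relative portions sum to 1 is to divide each entry by the total, as in the sketch below.  The function name, the proportional rescaling, and the sample entries are assumptions for illustration.

```python
def normalize_weights(raw_weights):
    """Rescale non-negative weights so that they sum to 1.

    raw_weights: {pair name: weighting entered by the user}.
    A weighting of 0 stays 0 after rescaling.
    """
    if any(weight < 0 for weight in raw_weights.values()):
        raise ValueError("weights must not be negative")
    total = sum(raw_weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be greater than 0")
    return {name: weight / total for name, weight in raw_weights.items()}


# Hypothetical entries: serviceability is weighted 0 for factory-installed software.
example_raw = {"faulty timing": 3, "faulty data": 1, "faulty serviceability": 0}
example_normalized = normalize_weights(example_raw)
```

Rescaling this way preserves the ratios between the weights that were entered, so only their relative sizes matter.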

Note that all of the failure modes and root causes below can and do occur as a single point failure.  Three of the faulty error handling root causes happen when there is a failure in the system (i.e. hardware or other software) that the software fails to detect or handle.  You can supply a failure rate for the system event, which is not necessarily related to the software itself.  For example, if the software LRU is a transmission software system, and the transmission hardware encounters a failure that the software fails to detect or fails to recover from, that is both a hardware failure and a software failure.

Generic failure mode
    Specific root cause

Faulty functionality
    This LRU performed an extraneous function
    This LRU failed to execute when required
    This LRU is missing a function
    This LRU performed a function but not as required

Faulty sequencing
    This LRU executed while in the wrong state
    This LRU executed out of order
    This LRU failed to terminate when required
    This LRU terminated prematurely

Faulty timing
    This LRU executed too early
    This LRU executed too late

Faulty data
    This LRU manipulated data in the wrong unit of measure or scale
    This LRU can't handle blank or missing data
    This LRU can't handle corrupt data
    This LRU's data or results are too big
    This LRU's data or results are too small

Faulty error handling
    This LRU generated a false alarm
    A failure in the hardware, system or software has occurred
    This LRU detected a system failure but provided an incorrect recovery
    This LRU failed to detect errors in the incoming data, hardware, software, user or system

Faulty processing
    This LRU consumed too many resources while executing
    This LRU was unable to communicate/interface with the rest of the system

Faulty usability
    This LRU caused the user to make a mistake
    The user made a mistake because of this LRU's user manual
    This LRU failed to prevent common human mistakes
    This LRU allowed the user to perform functions that they should not perform
    This LRU prevented the user from performing functions that they should be allowed to perform

Faulty serviceability
    This LRU installed improperly
    This LRU updated improperly
    This LRU is the wrong version or is outdated

 

7.       Select the Export Tab.  On this tab, press the Export button to export the failure modes, reliability blocks and fault tree information so that they can be imported by Isograph Reliability Workbench.  Refer to this link for instructions on how to import the exported file into the Isograph software.