Software Safety in a Nutshell
Clifton A. Ericson II
The Boeing Company; Seattle, Washington
The purpose of this article is to present a quick overview of software safety: software safety in a nutshell. The intent here is not to solve your software problems directly, but to raise your level of awareness and understanding of software safety. This article expounds on the key points of software safety and might also be called the ABCs of software safety.
The technology race (or rage), combined with the small, inexpensive microprocessor, has led our society to give computer control to everything possible: everything from toasters to medical equipment, from commercial aircraft lavatories to spacecraft, and from kids' toys to nuclear weapon systems. And as more control is given to computers, the software driving those computers becomes more prevalent and more controlling as well.
The increased level of risk in software is due to many interrelated and complex factors. Chief among them is the unique nature of software itself, which makes software difficult to understand completely, and even more difficult to visualize all the possible ways it can perform or fail to perform.
Software possesses a number of unique characteristics that distinguish it from hardware.
The weak link in software is the final designed product, not the design process. Regardless of the design process, hazards will always be unintentionally built into the design; experience bears this truth out. When implemented, software almost always works as intended, or close to it. The problem is that it also has unforeseen functions built in that can perform in unintended, undesired, and hazardous ways.
As a system is designed with intended functions, it is also inadvertently designed with built-in unintended functions, many of which may be hazardous. You are probably familiar with the famous drawing in which an old woman is very easy to see. Within the same drawing, however, is the picture of a beautiful young lady, which takes more time and effort to find. So within one drawing exist both an intentional picture and an unintentional picture. This is perhaps an oversimplified view of software design.
Finding and eliminating these built-in unintended and undesired hazardous functions is the ultimate goal of software safety. This means attacking the designed product. Many built-in unintended hazardous functions (BUHFs) can be avoided through the design process, but no matter what the process, a few BUHFs will always be created and survive into the final product. History has shown this to be true with hardware designs.
Since safety problems are caused by the BUHFs in systems, it only makes sense to focus on hazards to make a system safe. This is not a new concept, of course; it was discovered 40 years ago with hardware-controlled systems.
Focusing on hazards, however, is easier said than done. It is not easy to visualize or foresee hazards within a software design, particularly when the hazards involve subtle interactions among the hardware, the software, the man-machine interface, and the environment.
It should be noted that hazard analysis is not an exact science, and still needs considerable improvement.
There are many ways to identify hazards, some long established and some quite current.
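One long-established hazard-identification technique, fault tree analysis, deduces the combinations of basic events that can produce a hazardous top event. As a concrete illustration, here is a minimal Python sketch that evaluates a small, entirely hypothetical fault tree; the tree, event names, and probabilities are invented, and the math assumes independent basic events:

```python
# Minimal fault-tree evaluation sketch (the tree and numbers are
# hypothetical). An AND gate's output event occurs only if all input
# events occur; an OR gate's occurs if any input event occurs.
# Probabilities combine as shown, assuming independent basic events.

def p_and(*probs: float) -> float:
    """Probability that all independent input events occur."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def p_or(*probs: float) -> float:
    """Probability that at least one independent input event occurs."""
    none_occur = 1.0
    for p in probs:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur

# Hypothetical top event: "valve opens inadvertently" =
#   (software command error) OR (relay shorts AND switch sticks closed)
p_top = p_or(1e-4, p_and(1e-3, 1e-2))  # roughly 1.1e-4
```

Real fault trees are far larger and are built with dedicated tools; the point here is only the deductive gate logic that lets an analyst work backward from a hazard to its causes.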
Software has a subtle nature that can make the safety analysis task more difficult than usual. For example, software can contain errors and still function reasonably well, even without causing a safety problem. Software errors may not always be readily apparent; they may be lurking in the woodwork, or slowly driving a hardware element beyond an acceptable tolerance level. Software errors are usually application- and input-dependent: when software is used for applications and inputs for which it has not been tested, errors begin to occur more frequently. Not all software errors cause safety problems, and not all software that functions according to specification is safe.
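The lurking, input-dependent character of software errors can be sketched in a few lines of Python. The example below is hypothetical: repeated floating-point accumulation of a 0.1-second tick looks fine in the short, tested regime, while the same unchanged code drifts measurably in a longer, untested regime.

```python
# Hypothetical sketch of a latent, input-dependent error: 0.1 has no
# exact binary floating-point representation, so summing it by repeated
# addition accumulates rounding error. Short (tested) runs look fine;
# long (untested) runs expose the lurking defect.

def elapsed_time(ticks: int) -> float:
    """Sum `ticks` intervals of 0.1 s by repeated addition."""
    total = 0.0
    for _ in range(ticks):
        total += 0.1          # rounding error accumulates here
    return total

error_short = abs(elapsed_time(10) - 1.0)           # tested regime: negligible
error_long = abs(elapsed_time(100_000) - 10_000.0)  # untested regime: far larger
```

The error is invisible until the application or input profile changes; nothing about short-run testing would reveal it.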
A taxonomy of generalized hazard categories, briefly describing typical software hazard types, is a very useful aid in performing a software safety analysis.
Software design and software safety analysis share a number of common goals.
As first identified in reference [1], software safety is a system issue. Software cannot be entirely removed from the system and analyzed in a sterile environment. In and of itself, software is not hazardous [1]; software is hazardous only when it is operating or controlling hardware. This provides another key to software safety: focus on the safety-critical hardware and on hazardous system operations and modes.
Software always works as designed, but not necessarily as intended. Software is designed entirely to a specification, and it is eventually made to achieve what that specification requires. However, the system context yields a software design that also performs some unforeseen, unintended, and undesirable functions (i.e., BUHFs). These are the problems of interest to system safety.
Software has a greater capability than intended or specified. Software usually does not stop functioning when an error occurs; it merely continues to operate in an unanticipated manner. Therefore, the known design of the software is a subset of the total design, which includes all of the possible outcomes the software could achieve as a result of an error or a hardware-induced failure. The real exercise in software safety analysis is determining the extent of the total software capability.
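The point that the known design is only a subset of the total capability can be illustrated with a deliberately tiny, hypothetical Python sketch: a routine that meets its specification over the tested input range, yet contains a built-in unintended function for inputs outside it.

```python
# Hypothetical sketch: set_valve() is specified (and tested) only for
# commanded positions 0-100. The modulo "normalization" is an unintended
# function -- out-of-range commands are silently wrapped to a valid-looking
# but unrequested position instead of being rejected.

def set_valve(percent: int) -> int:
    """Return the valve position actually commanded (specified for 0-100)."""
    return percent % 101      # BUHF: wraps instead of failing safe

in_spec = set_valve(75)       # behaves exactly as specified
out_of_spec = set_valve(175)  # unintended function: wraps, no error raised
```

Nothing in normal testing over the 0-100 range would ever reveal the wraparound; it is part of the total design but not of the known design.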
The various tools, techniques, and methods for software design can also contribute to designing a safe system. The use of design standards, design guidelines, and historical data helps eliminate known and already-experienced problems.
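One concrete form such a design guideline can take is input validation with fallback to a defined safe state. The sketch below is a hypothetical Python illustration; the routine, its command range, and its safe state are invented for the example:

```python
# Hypothetical defensive-design sketch: validate safety-critical commands
# at the interface and substitute a known safe state rather than
# propagating an unforeseen value into the hardware.

SAFE_POSITION = 0  # assumed safe state for this example: valve closed

def set_valve_defensive(percent: int) -> int:
    """Command a valve position; anything outside the 0-100 spec fails safe."""
    if not 0 <= percent <= 100:
        return SAFE_POSITION  # defined safe state, not an unintended one
    return percent
```

The guideline does not prevent the erroneous command from being generated elsewhere in the system, but it bounds the command's effect to states the designers have deliberately analyzed.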
Safe design is ultimately guided by a set of common high-level design goals.
REFERENCES
[1] C. A. Ericson II, Software and System Safety, 5th International System Safety Conference, 1981.
[2] C. A. Ericson II, Software Safety Precepts, 14th International System Safety Conference, 1996.
BIOGRAPHY
Clifton A. Ericson II
The Boeing Company
18247 150th Ave SE
Renton, WA 98058 USA
phone 253-657-5245
fax 253-657-2585
email clifton.a.ericson@boeing.com
Mr. Ericson works in system safety on the Boeing 767 AWACS program. He has 33 years of experience in system safety and software design with the Boeing Company. He has been involved in all aspects of fault tree development since 1965, including analysis, computation, multi-phase simulation, plotting, documentation, training, and programming. He has performed Fault Tree Analysis on Minuteman, SRAM, ALCM, Apollo, Morgantown Personal Rapid Transit, B-1, and 737/757/767 systems. He is the developer of the MPTREE, SAF, and FTAB fault tree computer programs. In 1975 he helped start the software safety discipline, and he has written papers on software safety and taught software safety at the University of Washington. Mr. Ericson holds a BSEE from the University of Washington and an MBA from Seattle University.