Chris Johnson's
Failure in Safety-Critical Systems:
A Handbook of Accident and Incident Reporting
The full reference is:
C.W. Johnson, Failure in Safety-Critical Systems:
A Handbook of Accident and Incident Reporting, University of Glasgow Press, Glasgow, Scotland, October 2003.
ISBN 0-85261-784-4.
Copyright notices and some tips for downloading the documents appear at the end of this page.
You can download a single file containing all of the chapters, which makes printing easier.
However, that PDF is about 10 MB, so it may be less frustrating to download the book chapter by chapter.
If you spot any typos or other errors, please let me know; I am trying to improve the manuscript as time allows.
Thanks, Chris Johnson, Glasgow, 2003.
- Table of contents, preface, etc.
- Chapter 1 Abnormal Incidents
- 1.1 The Hazards
- 1.1.1 The Likelihood of Injury and Disease
- 1.1.2 The Costs of Failure
- 1.2 Social and Organisational Influences
- 1.2.1 Normal Accidents?
- 1.2.2 The Culture of Incident Reporting
- 1.3 Summary
- Chapter 2 Motivations for Incident Reporting
- 2.1 Why Bother With Incident Reporting?
- 2.1.1 The Strengths of Incident Reporting
- 2.1.2 The Weaknesses of Incident Reporting
- 2.2 Different Forms of Reporting Systems
- 2.2.1 Open, Confidential or Anonymous?
- 2.2.2 Scope and Level
- 2.3 Summary
- Chapter 3 Sources of Failure
- 3.1 Regulatory Failures
- 3.1.1 Incident Reporting to Inform Regulatory Intervention
- 3.1.2 The Impact of Incidents on Regulatory Organisations
- 3.2 Managerial Failures
- 3.2.1 Latent and Catalytic Failures
- 3.2.2 Incident Reporting and Safety Management Systems
- 3.3 Hardware Failures
- 3.3.1 Acquisition and Maintenance Effects on Incident Reporting
- 3.3.2 Source, Duration and Extent
- 3.4 Software Failures
- 3.4.1 Failure Throughout the Lifecycle
- 3.4.2 Problems in Forensic Software Engineering
- 3.5 Human Failures
- 3.5.1 Individual Characteristics and Performance Shaping Factors
- 3.5.2 Slips, Lapses and Mistakes
- 3.6 Team Factors
- 3.6.1 Common Ground and Group Communication
- 3.6.2 Situation Awareness and Crew Resource Management
- 3.7 Summary
- Chapter 4 The Anatomy of Incident Reporting
- 4.1 Different Roles
- 4.1.1 Reporters
- 4.1.2 Initial Receivers
- 4.1.3 Incident Investigators
- 4.1.4 Safety Managers
- 4.1.5 Regulators
- 4.2 Different Anatomies
- 4.2.1 Simple Monitoring Architectures
- 4.2.2 Regulated Monitoring Architectures
- 4.2.3 Local Oversight Architectures
- 4.2.4 Gatekeeper Architecture
- 4.2.5 Devolved Architecture
- 4.3 Summary
- Chapter 5 Detection and Notification
- 5.1 'Incident Starvation' and the Problems of Under-Reporting
- 5.1.1 Reporting Bias
- 5.1.2 Mandatory Reporting
- 5.1.3 Special Initiatives
- 5.2 Encouraging the Detection of Incidents
- 5.2.1 Automated Detection
- 5.2.2 Manual Detection
- 5.3 Form Contents
- 5.3.1 Sample Incident Reporting Forms
- 5.3.2 Providing Information to the Respondents
- 5.4 Summary
- Chapter 6 Primary Response
- 6.1 Safeguarding the System
- 6.1.1 First, Do No Harm
- 6.1.2 Incident and Emergency Management
- 6.2 Acquiring Evidence
- 6.2.1 Automated Logs and Physical Evidence
- 6.2.2 Eye-Witness Statements
- 6.3 Drafting A Preliminary Report
- 6.3.1 Organisational and Managerial Barriers
- 6.3.2 Technological Support
- 6.3.3 Links to Subsequent Analysis
- 6.4 Summary
- Chapter 7 Secondary Investigation
- 7.1 Gathering Evidence about Causation
- 7.1.1 Framing an Investigation
- 7.1.2 Commissioning Expert Witnesses
- 7.1.3 Replaying Automated Logs
- 7.2 Gathering Evidence about Consequences
- 7.2.1 Tracing Immediate and Long-Term Effects
- 7.2.2 Detecting Mitigating Factors
- 7.2.3 Identifying Related Incidents
- 7.3 Summary
- Chapter 8 Computer-Based Simulation
- 8.1 Why Bother with Reconstruction?
- 8.1.1 Coordination
- 8.1.2 Generalisation
- 8.1.3 Resolving Ambiguity
- 8.2 Types of Simulation
- 8.2.1 Declarative Simulations
- 8.2.2 Animated Simulations
- 8.2.3 Subjunctive Simulations
- 8.2.4 Hybrid Simulations
- 8.3 Summary
- Chapter 9 Modelling Notations
- 9.1 Reconstruction Techniques
- 9.1.1 Graphical Time Lines
- 9.1.2 Fault Trees
- 9.1.3 Petri Nets
- 9.1.4 Logic
- 9.1.5 Conclusion, Analysis and Evidence (CAE) Diagrams
- 9.2 Requirements for Reconstructive Modelling
- 9.2.1 Usability
- 9.2.2 Expressiveness
- 9.3 Summary
- Chapter 10 Causal Analysis (NASA NPG 8621.1)
- 10.1 Introduction
- 10.1.1 Why Bother With Causal Analysis?
- 10.1.2 Potential Pitfalls
- 10.1.3 Loss of the Mars Climate Orbiter & Polar Lander
- 10.2 Stage 1: Incident Modelling (Revisited)
- 10.2.1 Events and Causal Factor (ECF) Charting
- 10.2.2 Barrier Analysis
- 10.2.3 Change Analysis
- 10.3 Stage 2: Causal Analysis
- 10.3.1 Causal Factors Analysis
- 10.3.2 Cause and Contextual Summaries
- 10.3.3 Tier Analysis
- 10.3.4 Non-Compliance Analysis
- 10.4 Summary
- Chapter 11 Alternative Causal Analysis Techniques
- 11.1 Event-Based Approaches
- 11.1.1 Multilinear Events Sequencing (MES)
- 11.1.2 Sequentially Timed Events Plotting (STEP)
- 11.2 Check-List Approaches
- 11.2.1 Management Oversight and Risk Tree (MORT)
- 11.2.2 Prevention and Recovery Information System for Monitoring and Analysis (PRISMA)
- 11.2.3 Tripod
- 11.3 Mathematical Models of Causation
- 11.3.1 Why-Because Analysis (WBA)
- 11.3.2 Partition Models for Probabilistic Causation
- 11.3.3 Bayesian Approaches to Probabilistic Causation
- 11.4 Comparisons
- 11.4.1 Bottom-Up Case Studies
- 11.4.2 Top-Down Criteria
- 11.4.3 Experiments in Domain Experts' Subjective Responses
- 11.4.4 Experimental Applications of Causal Analysis Techniques
- 11.5 Summary
- Chapter 12 Recommendations
- 12.1 From Causal Findings to Recommendations
- 12.1.1 Requirements for Causal Findings
- 12.1.2 Scoping Recommendations
- 12.1.3 Conflicting Recommendations
- 12.2 Recommendation Techniques
- 12.2.1 The 'Perfectability' Approach
- 12.2.2 Heuristics
- 12.2.3 Enumerations and Recommendation Matrices
- 12.2.4 Generic Accident Prevention Models
- 12.2.5 Risk Assessment Techniques
- 12.3 Process Issues
- 12.3.1 Documentation
- 12.3.2 Validation
- 12.3.3 Implementation
- 12.3.4 Tracking
- 12.4 Summary
- Chapter 13 Feedback and the Presentation of Incident Reports
- 13.1 The Challenges of Reporting Adverse Occurrences
- 13.1.1 Different Reports for Different Incidents
- 13.1.2 Different Reports for Different Audiences
- 13.1.3 Confidentiality, Trust and the Media
- 13.2 Guidelines for the Presentation of Incident Reports
- 13.2.1 Reconstruction
- 13.2.2 Analysis
- 13.3 Quality Assurance
- 13.3.1 Verification
- 13.3.2 Validation
- 13.4 Electronic Presentation Techniques
- 13.4.1 Limitations of Existing Approaches to Web-Based Reports
- 13.4.2 Using Computer Simulations as an Interface to On-Line Accident Reports
- 13.4.3 Using 2D and 3D Time-lines as an Interface to Accident Reports
- 13.5 Summary
- Chapter 14 Dissemination
- 14.1 Problems of Dissemination
- 14.1.1 Number and Range of Reports Published
- 14.1.2 Tight Deadlines and Limited Resources
- 14.1.3 Reaching the Intended Readership
- 14.2 From Manual to Electronic Dissemination
- 14.2.1 Anecdotes, Internet Rumours and Broadcast Media
- 14.2.2 Paper Documents
- 14.2.3 Fax and Telephone Notification
- 14.3 Computer-Based Dissemination
- 14.3.1 Infrastructure Issues
- 14.3.2 Access Control
- 14.3.3 Security and Encryption
- 14.3.4 Accessibility
- 14.4 Computer-Based Search and Retrieval
- 14.4.1 Relational Data Bases
- 14.4.2 Lexical Information Retrieval
- 14.4.3 Case-Based Reasoning
- 14.5 Summary
- Chapter 15 Monitoring
- 15.1 Outcome Measures
- 15.1.1 Direct Feedback: Incident and Reporting Rates
- 15.1.2 Indirect Feedback: Training and Operations
- 15.1.3 Feed-forward: Risk Assessment and Systems Development
- 15.2 Process Measures
- 15.2.1 Submission Rates and Reporting Costs
- 15.2.2 Investigator Performance
- 15.2.3 Intervention Measures
- 15.3 Acceptance Measures
- 15.3.1 Safety Culture and Safety Climate?
- 15.3.2 Probity and Equity
- 15.3.3 Financial Support
- 15.4 Monitoring Techniques
- 15.4.1 Public Hearings, Focus Groups, Working Parties and Standing Committees
- 15.4.2 Incident Sampling
- 15.4.3 Sentinel Systems
- 15.4.4 Observational Studies
- 15.4.5 Statistical Analysis
- 15.4.6 Electronic Visualisation
- 15.4.7 Experimental Studies
- 15.5 Summary
- Chapter 16 Conclusions
- 16.1 Human Problems
- 16.1.1 Reporting Biases
- 16.1.2 Blame
- 16.1.3 Analytical Bias
- 16.2 Technical Problems
- 16.2.1 Poor Investigatory and Analytical Procedures
- 16.2.2 Inadequate Risk Assessments
- 16.2.3 Causation and the Problems of Counter-Factual Reasoning
- 16.2.4 Classification Problems
- 16.3 Managerial Problems
- 16.3.1 Unrealistic Expectations
- 16.3.2 Reliance on Reminders and Quick Fixes
- 16.3.3 Flaws in the Systemic View of Failure
- 16.4 Summary
- Bibliography
- Index
- Other publications
Copyright issues: This version is placed on the web for personal use only.
Commercial use of this material requires the explicit prior permission
of the author.
If you find this book useful, please make a donation to your local children's charity.
Download tips: The document preparation tool that I used has some problems generating good PDF screen images (a brief technical explanation is available).
Although the text may appear slightly 'fuzzy' on the screen, it
will print correctly.
The file size and network congestion may make it difficult to access
the book from the links on this page.
In most browsers, right-clicking on a link offers a 'Save Target As...' option;
this will download a copy onto your hard disk
and let you monitor the progress of the download.
Prof. Chris Johnson, DPhil, MSc, MA, CEng, FBCS,
Dept. of Computing Science,
Univ. of Glasgow,
Glasgow,
G12 8QQ,
Scotland.
Tel: +44 141 330 6053,
Fax: +44 141 330 4913,
johnson@dcs.gla.ac.uk