Barking Up The Wrong Tree? TRIR vs. HRO's

The oil and gas industry must begin to look beyond the worksite and its traditional focus on personal injury metrics and “Goal Zero” if it wants to build High Reliability Organizations (HRO’s) for effective Major Hazard Event (MHE) management.

Fact 1:

On the day that control of the Macondo well was lost in 2010, executives from both BP and Transocean were on the Deepwater Horizon drilling rig congratulating the crews on having worked 7 years without a Lost Time Injury (LTI).

Fact 2:

The US Chemical Safety Board has stated repeatedly that while companies can work many years without a personal safety injury, this provides no guarantee whatsoever that a Major Hazard Event (MHE) is any less likely.

Fact 3:

Many large E&P Operators have predominantly measured the safety performance of their high-profile, highly paid contractors (i.e. those contractors carrying the greatest business risk) using metrics that focus primarily on activities related to personal safety.

Fact 4:

The incident triangle, which has been referenced countless times over many years as an effective strategy for safety management, was developed in the 1930s by an employee of the Travelers Insurance Company as a way to determine premiums and costs related to workplace safety injuries.

Fact 5:

Focusing on and reducing the number of more minor personal injuries, such as First Aid Cases (FAC’s), does little to nothing to demonstrate effective Major Hazard Event (MHE) management and the availability and integrity of principal barriers.

Fact 6:

Prior to the Macondo disaster, BP and Transocean already fulfilled the majority of the new SEMS (Safety and Environment Management System) requirements put in place by the Bureau of Safety and Environmental Enforcement (BSEE) to prevent reoccurrence.

Fact 7:

The US Chemical Safety Board has stated in response to the events of 2010 that “… in the aftermath of a catastrophe, the individuals immediately involved in the event seem to receive much of the focus and subsequent blame even though the underlying problems inevitably rests with the overall safety culture and organizational practices of the company”.

 

Background:

You would be forgiven for thinking, with such established facts on hand, that the oil and gas industry - in conjunction with other “high risk” industries - has finally recognized and conceded that in many ways it has effectively been barking up the wrong tree with regard to Major Hazard Event (MHE) management. You would also think that in the years since the BP Macondo / Deepwater Horizon disaster in 2010, the oil and gas industry must have been galloping toward developing new metrics and new ways of work in order to provide the necessary assurance that the likelihood of such events ever happening again has been managed to ALARP (as low as reasonably practicable).

But fundamentally redirecting and refocusing an entire industry that has historically been quite resistant to such changes often requires a catalyst in the form of a compelling reference point to initiate and sustain decisive action. In 1988, the loss of 167 offshore workers on the Piper Alpha platform in the North Sea precipitated a concerted effort by the oil and gas industry to improve its global safety performance. Widespread efforts were undertaken to develop more structured and more robust HSE Management Systems to achieve a step change in industry safety performance. Likewise, the events of 2010 should have illustrated the need for the oil and gas industry to accept that a very different approach would likely be needed if such catastrophic incidents were to be prevented in the future.

However, in talking to many industry leaders - both inside and outside the oil and gas industry - it seems that adopting fundamentally different metrics to establish a sustainable way forward has not only created much consternation but, for many, has also resulted in hesitancy and procrastination. While the reasons for this can be numerous, what is clear is that personal injuries predominantly occur at the worksite. Therefore, it should come as no surprise that many organizations ultimately direct their improvement efforts toward the worksite as well. Organizations spend significant time and money - often with the very best of intent - investigating workplace incidents wherever and whenever they occur. They are determined to understand what happened, why it happened - and importantly who was involved - so as to put in place multiple corrective actions to prevent future reoccurrence. While the effectiveness of such corrective actions is often seriously undermined by a failure to sufficiently recognize the true scope and context of the underlying causes (please refer to “Fixing” Human Error Parts 1 & 2), this article is more concerned with the reasons why some organizations are having a tough time steering away from traditional thinking and metrics related to workplace injuries and moving toward ways of work for more effective Major Hazard Event (MHE) management.

Oil and gas companies are very well versed in measuring and reacting to workplace injuries.

It’s simple logic…

Problem: Someone gets hurt because they didn’t follow the company Lock Out / Tag Out (LO/TO) procedures.

Answer: Focus on the workplace and importantly, the work party.

Got a bigger problem? i.e. many more people are doing the exact same thing.

Answer: Implement a new Lock Out / Tag Out system for all employees using LO/TO procedures.

 

The case for a HRO:

However, effective management of Major Hazard Events (MHE) cannot simply be achieved by redirecting the action of a single work party, nor even introducing behavioral modification programs to target all worksite employees. No, effective Major Hazard Event (MHE) management is a company-wide action. Starting at the highest level, the organization must galvanize itself around a single “line of sight” engaging and motivating all employees, at all levels, to become knowledgeable of the work they’re performing as it pertains to the prevention of catastrophic events. In essence, to become competent in defining and managing safety critical activities (i.e. those things necessary to support the availability and integrity of key barriers), all work must be progressed against the backdrop of a High Reliability Organization (HRO).

While there are many excellent reference sources available for outlining the origins, characteristics and benefits of a High Reliability Organization or HRO, such as Managing the Unexpected (Weick, K. E., & Sutcliffe, K. M., 2007), the definition is actually quite straightforward. A High Reliability Organization (HRO) is an organization that has succeeded in minimizing catastrophes in an environment where incidents can be expected due to certain risk factors and complexity.

But unlike personal safety, where organizations often turn to the worksite for answers, embedding the characteristics of an HRO within a company’s operating culture requires direction, engagement and, importantly, accountability from the C-Suite. It’s no good expecting employees on the front lines and at the worksite to be able to retool organizational objectives and the way the organization goes about conducting its day-to-day operating and business practices. No, this must and can only come from the highest level of the organization because it impacts all levels of the organization. Therefore, it must also hold true that if an organization has significant challenges in demonstrating the integrity and availability of key barriers as part of Major Hazard Event (MHE) management, then the same old overly simplified tactics of focusing on the worksite and implementing isolated and limited corrective actions simply won’t stand up.

Incidents that involve catastrophic loss such as Piper Alpha and Macondo / Deepwater Horizon are perfect illustrations that such events cannot be directly attributed to individual(s) who reside at the worksite such as a welder, a mechanic or even a crane operator. For incidents involving catastrophic loss it’s often the case that the whole organization needs to be held to account. As the US Chemical Safety Board put it, “One or two individuals did not cause the Macondo event. A multitude of poor decisions, poor actions and poor leadership up and down the entire organizational chains of both BP and Transocean led to the disaster.”

In other words, once an organization moves towards becoming an HRO (for better MHE management), the metrics used to measure performance have to move proportionately from simply focusing on the worksite to a much wider field of view that encompasses the decisions and actions of the entire organization. Ultimately (though often reluctantly), this necessitates a visit to the top half of the organization. The metrics established to measure progress toward becoming an HRO will require executives and senior managers to be on the hook for the competency and effectiveness of their leadership behaviors. Perhaps reason enough for organizations to become more than a little hesitant…

 

Barrier Management is an organizational rather than a worksite challenge:

A good example of this is the Herald of Free Enterprise disaster in the English Channel in 1987, where 193 people lost their lives. The Herald of Free Enterprise was a roll-on roll-off (RORO) car ferry owned by Townsend Thoresen and operated between the UK port of Dover and the Belgian port of Zeebrugge. To remain competitive with other ferry operators on the same route, Townsend Thoresen’s operating culture permitted certain work practices that were inherently unsafe and overly relied on the actions of single individuals to maintain barrier integrity. On the night of 6 March 1987, the Herald of Free Enterprise left the port of Zeebrugge with its bow-door open. Once the ship was past the harbor, water began to enter through the bow-door and quickly flooded the ship. The vessel’s stability was compromised within a matter of seconds and the ship soon capsized. The entire event took place within a matter of minutes. Most of those who died were individuals who remained trapped in their cabins after the ship had capsized and were exposed to frigid sea water temperatures of 3°C (37°F). They likely died of hypothermia.

As is often the case in many incident investigations, the immediate cause was directed toward the worksite, where negligence was identified because the assistant boatswain had been asleep in his cabin when he should have been closing the bow-door. However, a formal public inquiry held into the disaster subsequently determined that the underlying causes for this behavior were actually buried much deeper within the organization (top half) and its leadership behaviors. This was concluded given that the company had been downsizing the workforce and introducing aggressive cost cutting practices to remain competitive. Furthermore, it was concluded that a poor operating culture and poor workplace practices existed primarily because of poor morale and a “disease of sloppiness, negligence and poor safety leadership at every level of the corporation's hierarchy”. As a result, the public inquiry ultimately anchored its findings not simply in the actions of one or two individuals, but pointed the finger in the direction of the entire organization and its leadership.

While there should be no surprises with this type of example, what it does illustrate is that there is often an inherent and even overzealous desire by organizations (perhaps driven by a need for swift and decisive action to prevent reoccurrence) to frame the causes of such incidents simply in terms of worksite behaviors and the actions of those individuals directly involved.

The carry over for the oil and gas industry is that it simply cannot afford to think and act any longer with such limited fields of view if it is to become truly successful in developing HRO’s for effective Major Hazard Event (MHE) management. Indeed, one of the hallmarks of an HRO is the way it responds to system variations and any early signs of trouble. In non-HRO cultures, weak signals of trouble are met with weak (read: worksite) responses. In other words, reports concerning barrier integrity are handled (at best) in the same way as the organization responds to, say, a First Aid Case (FAC) or Medical Treatment Case (MTC), i.e. by focusing on the worksite and those employees directly involved in the work. But unlike such responses to personal safety, Major Hazard Event (MHE) management should never simply ring-fence the worksite and concentrate on the lower half of the organization while allowing the top half to remain insulated from any real form of accountability. Effective Major Hazard Event (MHE) management requires that the whole organization be made more accountable and work collectively to address challenges that are often systemic, far reaching and well-established. In an HRO culture, reports concerning barrier integrity are given top priority and are seen as an organizational failure rather than a limited and isolated misstep at the worksite. In other words, in an HRO culture, weak signals of trouble are met with very strong responses.

The stakes for getting this right couldn’t be higher. At a time when margins are continuing to be squeezed, companies cannot afford the luxury of remaining hesitant or complacent around potential losses so great that they could bring into question the viability of the organization to continue ongoing, sustainable operations. And in this regard, simply doing more of the same and beating the same old tired drum of “Goal Zero” and a focus on personal safety statistics to demonstrate Major Hazard Event (MHE) management seems foolhardy, naive and, in the case of the Herald of Free Enterprise, an argument for criminal negligence.

 

The Way Forward:

So what to do?

How can the oil and gas industry take some pragmatic steps to move away from its traditional focus on workplace injuries and TRIR (Total Recordable Injury Rate) and become effective at Major Hazard Event (MHE) management? Well, clearly, minor workplace injuries happen far more regularly than incidents of catastrophic loss. But while low(er) consequence, high frequency incidents involving workplace injuries can provide numerous data points, by definition, the same cannot be said of high consequence, low frequency events. Therefore, organizations must change the way they respond to these two very different categories of incidents. Incidents involving catastrophic loss are rarely (if ever) due to the actions of one or two individuals alone. They involve complex situations with multiple lines of defense, managed at all levels of the organization. So simply saying we’ve been so many years without a catastrophic incident (in the way achievements for personal safety are framed) is not helpful, given that if such an incident does occur, it may have the potential to put an organization out of business.

So clearly then, the first thing that is needed in redirecting an organization’s focus away from workplace injuries is to begin generating meaningful data points. For instance, every time a barrier is unavailable, compromised or fails to function as it ordinarily should, this should generate an important data point. In fact, depending on the seriousness of the failure and the criticality of the barrier itself, this single data point should probably be seen with the same level of seriousness as, say, a Lost Time Injury (LTI) - and likely regardless of the fact that other barriers may have performed as intended to provide the necessary checks to prevent further escalation. After all, it’s reasonable to presume that a workplace safety incident would likely still get reported regardless of the fact that any personal protective equipment (PPE) deployed may have performed as intended and prevented actual injury. That way, effective barrier management would start to get discussed at the highest level of the organization and on a much more regular basis (presuming that an LTI incident routinely gets to the C-Suite).
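To make the idea of a barrier data point a little more concrete, here is a minimal sketch in Python of what such a record might look like. Everything in it is an illustrative assumption rather than an industry standard or any company's actual system: the names `BarrierState`, `BarrierEvent` and `treat_like_lti`, the example barriers, and the simple rule that any impairment of a safety critical barrier is escalated.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class BarrierState(Enum):
    """Ways a barrier can fall short (hypothetical categories)."""
    UNAVAILABLE = "unavailable"
    COMPROMISED = "compromised"
    FAILED_TO_FUNCTION = "failed to function"


@dataclass(frozen=True)
class BarrierEvent:
    """One data point: a barrier that was not fully intact."""
    barrier: str
    state: BarrierState
    safety_critical: bool       # is this a key MHE barrier?
    observed_on: date
    other_barriers_held: bool   # recorded, but deliberately not used below


def treat_like_lti(event: BarrierEvent) -> bool:
    """Assumed rule: any impairment of a safety critical barrier is
    escalated like an LTI, even if other barriers stopped escalation."""
    return event.safety_critical


events = [
    BarrierEvent("gas detection", BarrierState.UNAVAILABLE, True,
                 date(2016, 3, 1), other_barriers_held=True),
    BarrierEvent("handrail", BarrierState.COMPROMISED, False,
                 date(2016, 3, 2), other_barriers_held=True),
]

# Events that would go to senior leadership under the assumed rule.
escalated = [e for e in events if treat_like_lti(e)]
print(f"{len(escalated)} of {len(events)} events escalated")
```

The design choice worth noting is that `other_barriers_held` is captured but ignored by the escalation rule, mirroring the point that a near miss still gets reported even when PPE did its job.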

This should especially be the case where barrier integrity and availability are related to human performance. For instance, suppose an organization knows it has a human performance challenge, such as a data point that says “From the last 6 months of Management By Walking Around (MBWA), findings have revealed that 50% of the time, employees are not strictly adhering to the documented procedures for the work”. In other words, for any number of jobs performed, 50% of the time the organization is out of control given that it simply doesn’t know how the work is actually being executed. Overlay this onto a barrier that demands the same type of human performance compliance and you begin to measure how barriers intended for managing Major Hazard Events (MHE) start to become compromised. Add another layer where the actions of people in high stress environments or unusual and unfamiliar situations start deviating from what was otherwise expected in such instances (perhaps because of a lack of realistic scenario training), and further barrier degradation is likely to result.
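The layering described above can be put into rough numbers. The sketch below is only an illustration under stated assumptions: the 50% adherence figure comes from the example data point, the 80% figure for performance under stress is entirely hypothetical, and the two factors are assumed to be independent so that they can simply be multiplied. Real human performance factors are rarely independent, so treat the result as a conversation starter and not as a reliability assessment. The function name `barrier_reliability` is likewise made up for this example.

```python
import math


def barrier_reliability(*factors: float) -> float:
    """Chance that a human-dependent barrier performs as intended,
    assuming each factor is an independent probability between 0 and 1."""
    for factor in factors:
        if not 0.0 <= factor <= 1.0:
            raise ValueError(f"factor must be between 0 and 1, got {factor}")
    return math.prod(factors)


procedure_adherence = 0.50  # from the walk-around data point above
stress_performance = 0.80   # hypothetical: share of people who still act
                            # as expected in an unfamiliar, high stress event

baseline = barrier_reliability(procedure_adherence)
under_stress = barrier_reliability(procedure_adherence, stress_performance)

print(f"Barrier relying on procedure adherence alone: {baseline:.0%}")
print(f"Same barrier in a high stress situation: {under_stress:.0%}")
```

Each additional human performance factor can only hold the figure steady or lower it, which is the sense in which barrier degradation compounds.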

So to conclude, if organizations are serious about moving toward more effective Major Hazard Event (MHE) management, they must begin to routinely measure and monitor the integrity and availability of key barriers. And they must do this in such a way that every barrier failure or degradation receives the same intense scrutiny as a serious personal workplace injury - but with one major caveat. Accept and recognize that barrier failure in an HRO culture looks toward all levels of the organization - not just one or two individuals at the worksite.

If nothing more, a good first step is one that at least gets the conversation going about the right issues at the right level. And at this juncture, where new data points generate new understanding, the oil and gas industry can begin to take a meaningful step in the right direction.

Ross Fisher

Maintenance Superintendent

6y

I was at a safety conference today with a major North Sea operator. I am happy to say that for the first time reliability was raised in the same discussion as safety, and not only because poor reliability may lead to or be the cause of an unsafe situation. The steps taken to correct reliability issues at the job site are in their own way hazardous. Improve reliability and safety improves as a result.

Jane Anglin

Director, JM Principle

6y

Clear & compelling


Interesting read, Leadership on a daily basis and removal of peer pressure et al.

Charles Harper

Managing Counsel, Refining Operations at Citgo Petroleum Corporation

6y

Excellent article.


More articles by Peter V. Bridle
