Operating robust Operational Technology in a connected world

This article was written for PreventFocus, a monthly magazine for health, safety and environment professionals. 
More information can be found here: https://prevent.be/product/preventfocus.

We live in a connected world where IT plays a role in nearly everything we do. At the same time, the sharp growth in cybersecurity incidents, including accidental ones, shows how dependent and vulnerable we have become when something goes wrong with IT. Some even say that the next pandemic won’t be a biological one but a cyber one.

So far, most of these incidents have affected IT systems and disrupted an organization’s business without harming our health, our safety, or our environment, although there have already been exceptions. Given the increasing number of attacks, the current geopolitical tensions, and the growing number of connected systems, cyber incidents could soon more frequently hit operational environments that, when disrupted, pose a threat to humans and the environment.

Most operational environments are controlled via Operational Technology, or OT: the hardware and software that controls and monitors industrial equipment and processes. OT systems control physical equipment such as valves, motors, pumps, and other machines; they regulate process values such as pressure, temperature, and flow; and they monitor this equipment and these values to prevent hazardous situations.

OT is typically found in manufacturing, electricity generation and distribution, water and gas supply, public transport, airports and seaports, and so on. There is also an entire array of other applications that sometimes get forgotten, such as connected medical equipment in hospitals, or building controls found in elevators and in heating, ventilation, and air-conditioning systems.

While IT operates in a world of bits and bytes, OT operates in the physical world and controls it. A disruption, whether accidental or malicious, could have a severe impact on us. For instance, when a valve does not operate properly because the OT system that controls it is unstable or hacked, this could lead to an explosion in a chemical reactor or a release of dangerous chemicals.

Since OT systems manage and supervise real-time physical processes, these processes and control systems must be highly available and reliable at all times. In the past, OT used to operate in isolation, connected to neither the Internet nor the internal IT network, and sometimes did not even use standard IT communication protocols, which made it less vulnerable to cyber incidents.

Given the growing importance of IT in our connected world, the line between OT and IT has blurred, and the two are even converging. There are several reasons for this:

  • Organizations like to have live operational metrics on how well an operational environment is performing and want to optimize it in an agile fashion, e.g., supply chain management and just-in-time delivery of raw materials in a chemical manufacturing process. For this data exchange to happen, links had to be created between the OT network and the business IT network. In addition, we see a growing demand for interconnecting various OT systems and for consolidating the control rooms of several plants into one centralized off-site control room.
  • OT operators and engineers like to have remote access to OT for remote management and troubleshooting.
  • Using standard IT solutions within the OT environment has become the norm, and many of these systems require network and Internet access (e.g., for updates and licenses).
  • Suppliers of OT solutions have started to offer their solutions as a managed service that they monitor and manage continuously. The Internet and standard IT tools are typically used to achieve this as well.
  • Larger OT environments are usually distributed across a larger area, e.g., seaports or large plants. This poses communication challenges that can be addressed by IT and standard communication solutions such as 4G or 5G.
  • In some cases, OT is also adopting emerging IT and communication technologies more rapidly than standard IT because these address needs that were previously impossible to meet. With 5G, cloud, and AI, it is now possible, for instance, to perform robotic surgery remotely and to control drone container ships.

OT is now connected to IT and the Internet, so what?

To understand the impact of this, it is good to first look at the main differences between OT and IT:

  • In IT, the focus is on confidentiality, integrity, and availability of your systems. In OT, the focus is almost entirely on availability, predictability, and integrity of the system with a clear focus on safety. Confidentiality has been seen as less important, but that view is changing because of potential intellectual property theft.
  • While IT equipment is usually replaced after 3 to 5 years, the lifespan of OT equipment can sometimes be up to 20 years. This poses challenges because older IT equipment can no longer be updated and patched properly for known issues. The use of end-of-life Windows XP systems is still quite common in many OT environments.
  • OT is supposed to be always on with usually only one or two maintenance cycles per year. This poses challenges with respect to design, change management, and maintenance such as quickly patching critical vulnerabilities.
  • In OT, awareness of physical security, health, and safety is very good, but awareness of cybersecurity is generally quite poor, e.g., operators in a control room sharing one computer account without a password, or no anti-virus software being used. This may have been acceptable when OT was isolated from the rest of the world, but it no longer is today.

IT and OT environments are usually operated by different teams, which poses cultural and organizational challenges. From an IT person’s point of view, OT environments connected to the IT network are seen as a risk to IT, e.g., because these OT systems are not regularly updated and are vulnerable to IT attacks. From an OT person’s point of view, IT is seen as a risk to the reliability of the OT system, e.g., improperly patching a vulnerability on a real-time control system might break the entire process and potentially shut down the entire environment.

How to do it right?

Given that the line between IT and OT is fading and the two are even converging, collaboration between IT and OT is the foundation for success. People in OT must learn to understand that fences, CCTV cameras, and 24/7 security guards no longer provide sufficient protection against intruders when an Internet cable is running into their environment. People in IT must learn to understand that you cannot always apply commonly accepted IT processes such as monthly patching or frequent password changes in an OT environment.

Hence, we advise setting up an IT/OT role or team, depending on the size of the organization, that facilitates this joint approach by acting as a single point of contact for all cybersecurity and IT-related questions and decisions in OT.

Rather than reinventing the wheel, we recommend using internationally recognized standards. The ISA/IEC 62443 series of standards, developed by the ISA99 committee and adopted by the International Electrotechnical Commission (IEC), provides a flexible framework to address and mitigate current and future security vulnerabilities in OT systems. These standards have also been adopted in much local and regional legislation developed to protect critical infrastructure. Adopting standards like IEC 62443 also makes it easier to achieve and demonstrate compliance at a later stage.

Given that many OT environments are already relatively old from an IT viewpoint and won’t be renewed anytime soon, retrofitting security controls in an existing environment isn’t always possible or desirable. In that case, we advise putting what we call a virtual dome around your OT environment, with only a few clearly defined access points in and out, so that all traffic flows to and from the Internet and IT networks are strictly controlled and monitored. IT systems within OT that are known to be vulnerable and can no longer be patched because they are end-of-life should be segregated and not made directly accessible from any other network.
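For readers who want to see what "a few clearly defined access points" can mean in practice, here is a minimal sketch in Python of a default-deny allowlist for traffic crossing the OT boundary. It is an illustration only, not a prescribed implementation: in a real environment these rules would typically be enforced by firewalls or similar network equipment. All addresses, ports, and names (ALLOWED_FLOWS, ISOLATED_NETWORKS, is_flow_allowed) are hypothetical and chosen purely for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AllowedFlow:
    source: str       # network allowed to start the connection (CIDR)
    destination: str  # network allowed to receive it (CIDR)
    port: int         # destination port
    purpose: str      # documented business reason for the flow


# Hypothetical example rules: every flow across the OT boundary is listed
# explicitly, together with the reason it exists.
ALLOWED_FLOWS = [
    AllowedFlow("10.10.0.0/24", "10.50.1.10/32", 443,
                "IT reporting tools read metrics from the OT data gateway"),
    AllowedFlow("10.50.1.20/32", "10.10.5.5/32", 514,
                "OT jump host sends logs to IT monitoring"),
]

# Hypothetical segment holding end-of-life systems that can no longer be patched.
ISOLATED_NETWORKS = [ip_network("10.50.9.0/28")]


def is_flow_allowed(source: str, destination: str, port: int) -> bool:
    """Return True only if a boundary-crossing flow is explicitly allowed."""
    src, dst = ip_address(source), ip_address(destination)

    # Unpatchable systems are never reachable across the boundary.
    if any(src in net or dst in net for net in ISOLATED_NETWORKS):
        return False

    # Default deny: anything not on the allowlist is blocked.
    return any(
        src in ip_network(flow.source)
        and dst in ip_network(flow.destination)
        and port == flow.port
        for flow in ALLOWED_FLOWS
    )


if __name__ == "__main__":
    print(is_flow_allowed("10.10.0.15", "10.50.1.10", 443))   # listed flow
    print(is_flow_allowed("10.10.0.15", "10.50.1.10", 3389))  # unlisted port
    print(is_flow_allowed("10.10.0.15", "10.50.9.3", 443))    # isolated segment
```

The design choice worth noting is that the list describes what is permitted rather than what is forbidden. Any new connection into or out of the OT environment then requires a deliberate, documented decision, which fits the joint IT/OT decision-making described earlier.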

When additional solutions such as analysis and reporting tools are added to the OT environment, pay sufficient attention to securing the connection and the integration, and to having the right level of user authentication and authorization.

Focus on security awareness, on limiting what users can do on their OT workstations (for example, by blocking the use of USB devices), and on implementing standard, non-intrusive IT best practices such as taking regular backups and testing them.

While prevention is ideal, timely detection of and rapid response to incidents are a must. It is impossible to anticipate and prevent all cyber incidents, but many can be detected when the required technology and processes are in place and people are trained on what to do. Incident detection and incident management procedures and processes should be created, maintained, and practiced regularly.

As a health and safety expert and advisor, you can also help by making organizations and people aware of these cybersecurity risks within operational environments and the potential threat they pose to human safety, health, and the environment. You can help the organization drive IT/OT collaboration and the cultural changes it requires.

Do you want to know more about how to secure your OT environment? Get in touch with us!