Offshore Solutions: Weathering a Network Storm

June 1, 2020
News

[Editor’s note: A version of this story appears in the June 2020 edition of E&P. Subscribe to the magazine here.]

Many operations at sea rely on a vessel with a dynamic positioning (DP) system whereby the vessel maintains a specific position or track to perform position-critical operations. The DP system comprises power generation, power distribution, a thruster and propulsion system, heading, environmental and position reference systems and the control system. The system components are typically arranged into two or more redundant groups, with each group able to generate and control vessel thrust in three axes: surge, sway and yaw.

Control system

The DP control system is typically arranged as two separate ethernet networks to ensure redundancy. The network nodes (e.g., operator stations, field stations and processors) contain network switches, hubs and other hardware that can create common points between redundant groups. While a single fault or loose connection in one network can result in operations continuing without a loss of position on the other redundant network, there are potential network faults that can affect both networks and defeat the redundancy concept.

A network storm, associated with ethernet networks, has the potential to affect both networks and result in a loss of position.

The external forces and degrees of motion experienced by a vessel and controlled by vessel thrust using a DP system are depicted in the diagram. *(Source: DNV GL)*

Network storms

It is not possible to be certain how often network storms occur on DP vessels because some are not reported and sometimes it is unclear if it was actually established that a system failure was a result of a network storm. DNV GL recently investigated several events where a network storm has contributed to a loss of position, including onboard diving support vessels and drillships.

In normal operation, the network has plenty of capacity to handle the packets of data that are exchanged between the various system nodes. The processors and control equipment process these packets of data and take action accordingly. In a network storm, the network is flooded with too much data and the network becomes overloaded. This can impact critical responsibilities, such as controlling thrusters, monitoring shutdown conditions or providing switchboard protection. Other possible causes of a netstorm can include software and hardware failure, human error, inherited design issue and network configuration errors.

The performance of a network also can deteriorate over time. Likewise, controller or network hardware can be replaced with lesser capacity components; or with different firmware, software files can become corrupt and controller hardware can fail. There is also a regular flurry of firmware/software upgrades/downgrades involved in service work, backups and restores. All of these activities could potentially introduce issues into the DP control system and network infrastructure.

Protection

Many network switches will feature some form of traffic control and rate-limiting functionality. This can essentially drop packets of various types when a specified traffic level is exceeded, potentially averting the degradation of performance for essential traffic during a critical period. This first line of defense can stop the overload of a controller, and the system can maintain proper function.

If the DP control system software detects a netstorm, then it may be possible that any switch-based protection has failed or is not present. The increased traffic level now poses a real threat to the control system and to maintaining position.

The system can warn the operator and also attempt to limit the effects by disabling the affected network while continuing operation on the redundant network. Netstorm and the associated protection should be addressed within the vessel’s failure mode and effects analysis (FMEA) and suitably tested as part of the DP FMEA proving trials and annual DP trials as necessary.

It is important to distinguish between throughput and bandwidth to ascertain sufficient performance. It also is vital in a control system that traffic reaches its destination in a time that is expected; in other words, good network latency.

Testing

Interestingly, the DP community has traditionally shied away from the concept of condition, confidence or riskbased testing, preferring to adhere to calendar-based test programs. For instance, the most recent update of the International Maritime Organization guidelines can be paraphrased as test everything every year. Opinions on the frequency of testing differ, but many agree that testing every five years or after a software or hardware change/ repair is prudent. The vessel and its systems should at the very least be tested when it is commissioned.

It is more common to find that network problems exist when the system is tested at FMEA, proving trials, which are not universal. The first network test is potentially the most important as a vessel with no protection, or defective protection stands no chance of surviving a netstorm or performance-related failure.

The duration of any testing also should be factored into these considerations and, most importantly, the duration for which the netstorm is maintained during the test.

Fortunately, the ease with which netstorm testing can be carried out is improving, and tests are more regular and commonplace. It is now possible to test networks annually with little, if any, additional burden on vessel availability. Options include facilities to allow the crew to test the networks themselves or to have the test carried as part of the annual DP trials—a service that DNV GL’s Noble Denton marine services perform.

It is imperative that preventative measures and throughput be tested to demonstrate that the system can detect and protect the controllers from a network storm on one of the two process networks and that alarms are efficiently and effectively delivered to the operator. Testing should also prove the network can maintain communications at the expected data rates as well as being independent of any joystick system. The result should be no loss of position or an unexpected failure of equipment.

Testing can be performed using a software utility run from a laptop or specialized hardware test equipment. Thought and consideration also should be given to how results are demonstrated and displayed for an assurance audit or other activity.

Network redundancy in DP control systems enable safe operations by ensuring the vessel stays in position. *(Source: DNV GL/Polar Media)*

Addressing dilemmas

The DP fleet currently in operation has a wide variety and age of vessels. Only in recent years has the danger of network storms been appreciated, and the requirements for its consideration and protection were included in classification society rules for such vessels. As such, it is quite common that only in newer vessels has this phenomenon been considered, designed for and tested to ensure the vessel is fault-tolerant. Older fleets are potentially not protected against and have never been tested for their resilience to network storms.

Any ethernet network on the vessel is subject to such a storm and should be considered in the DP FMEA and when testing. Other such networks and situations to consider are the thruster control system, emergency shutdown system or an integrated automation system. Netstorms can have a detrimental impact on all networks, not just those that are actively exchanging commands and instructions. It has been observed that a storm on a network only utilized for the exchange of data and not processing any executive actions can actually interfere with the vessel’s ability to control its thrusters resulting in a loss of position.

Risk-free methods to fully simulate scenarios mean there is no longer an unreasonable time or cost burden to identify and eliminate potential failure and its costly consequences.

This post appeared first on Hart Energy.