The ability to make smart, efficient, data-driven reliability decisions separates top-performing refineries from their peers. While most refineries have access to more data than ever before, many still struggle to make confident reliability decisions because they are overloaded with data and do not have the correct systems in place to analyse it. Additionally, many refineries store their data across a multitude of disparate systems, making it more difficult to leverage this data when making critical asset management and capital investment decisions.
In an effort to help refineries better understand how they can leverage data to more accurately predict asset health and reliability threats, Pinnacle’s Research and Development team recently completed a study in which a data-driven approach was applied to modeling degradation and compared to current industry standard practices. Improving both the accuracy and precision of asset degradation modeling can have a profound impact on the results and recommendations output for reliability management and performance optimisation of Risk-Based Inspection (RBI) programs.
In this study, Pinnacle utilised a data-driven model that leveraged machine learning (ML) techniques to predict degradation rates and associated variability for a select set of assets from a dataset of reformer units. The model was fed large amounts of pertinent data including asset attributes, operating conditions, process stream data, inspection history, and other commonly available data to predict degradation. The results of the study showed that Pinnacle’s ML model was able to predict degradation rates and associated variability with a higher level of accuracy compared to existing industry standard practice and subject-matter expert (SME) estimation.
Before diving into the results of the ML model, it is useful to show some high-level analytics regarding Pinnacle’s dataset on reformer degradation. First, the analytics validate the company’s approach by showing that the dataset is sensible and the general trends that are observed match expectation. Second, the analytics illustrate the diversity observed in the collected data, which was ultimately leveraged by the ML model in order to make more accurate degradation predictions.
Degradation rate by operator/site:
Degradation rates naturally vary between different operators and their individual sites. In this study, the average degradation rate for the reformer dataset is calculated for each CML in the dataset. The variance of degradation rates across different companies and sites is illustrated in Figure 1, colour coded by operator.
Figure 1: Boxplot of degradation rate by operator and site.
As seen in Figure 1, most sites experience degradation rates less than 5 mils/yr for a majority of assets but also tend to have heavy tails that extend above 10 mils/yr, and in some cases 20 mils/yr. Note that Operator 1, overall, has much higher degradation rates than the other operators, such as Operator 3 or Operator 8. There is also significant variability across different sites of the same operator.
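As a rough illustration of the kind of analytics behind Figure 1, the sketch below derives an average degradation rate for a single CML from its thickness readings and then summarises rates per operator and site. The article does not describe how the rates were actually calculated, so the first-to-last-reading method, the function names and all numbers here are assumptions made for the example, not Pinnacle's data or procedure.

```python
from collections import defaultdict
from statistics import median


def cml_rate_mils_per_year(readings):
    """Average wall loss rate for one CML in mils/yr.

    readings: list of (year, wall_thickness_in_mils) tuples.
    Assumption: only the first and last readings are used.
    """
    ordered = sorted(readings)
    (t0, w0), (t1, w1) = ordered[0], ordered[-1]
    if t1 == t0:
        raise ValueError("need readings taken at two different times")
    return (w0 - w1) / (t1 - t0)


def summarise_by_site(records):
    """Summarise rates for each (operator, site) pair.

    records: list of (operator, site, rate_mils_per_year) tuples.
    """
    groups = defaultdict(list)
    for operator, site, rate in records:
        groups[(operator, site)].append(rate)

    summary = {}
    for key, rates in groups.items():
        summary[key] = {
            "n": len(rates),
            "median": median(rates),
            "max": max(rates),
            # Share of CMLs in the heavy tail above 10 mils/yr.
            "share_above_10": sum(r > 10 for r in rates) / len(rates),
        }
    return summary


# Hypothetical readings: 500 mils in 2010, 470 mils in 2020.
example_rate = cml_rate_mils_per_year(
    [(2010.0, 500.0), (2020.0, 470.0), (2015.0, 490.0)]
)  # 3.0 mils/yr

# Hypothetical per-CML rates for two anonymised sites.
records = [
    ("Operator A", "Site 1", 2.0),
    ("Operator A", "Site 1", 4.0),
    ("Operator A", "Site 1", 12.0),
    ("Operator B", "Site 1", 1.0),
    ("Operator B", "Site 1", 3.0),
]
summary = summarise_by_site(records)
```

Reporting the share of CMLs above a threshold alongside the median is one simple way to capture the heavy tails described above, which a single average would hide.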
Additional insights regarding the variability in degradation rate can be gained by casting the analysis as a density plot as seen in Figure 2.
Figure 2: Density plot of degradation rate by operator and site.
Figure 2 illustrates that different sites may have heavier distribution tails, indicating higher variability in degradation. In this specific case, there appear to be areas of significantly higher degradation that are not widespread throughout the system. Also, Operator 1 Site 2 has a characteristically different shape: it peaks at around 7.5 mils/yr and is relatively flat, indicating high variability in degradation rates throughout the system.
Degradation rate by reformer type and system function:
Degradation rates can also be examined as a function of reformer type (e.g., fixed bed) as well as the system type (e.g., reaction). It is important to note that higher degradation rates are expected in systems such as the combined feed or reaction system due to the expected conditions of those areas. Figure 3 is a boxplot illustrating the average degradation rates by system type in each type of reformer for the nine generic systems seen in both continuous catalytic reformers (CCR) and fixed bed reformers.
Figure 3: Boxplot of degradation rate by system type.
Figure 4 and Figure 5 provide another view, overlaying each system’s average degradation rate and average maximum degradation rate on Process Flow Diagrams (PFDs) for each reformer type.
Figure 4: API 571 CCR process unit flow diagrams (From API 571 Damage Mechanisms Affecting Fixed Equipment in the Refining Industry 2nd ed. 2011, Section 5.2, Figure 5-69) overlaid with calculated average degradation rates per system type.
Figure 5: API 571 Catalytic Reforming – Fixed Bed process unit flow diagrams (From API 571 Damage Mechanisms Affecting Fixed Equipment in the Refining Industry 2nd ed. 2011, Section 5.2, Figure 5-70) overlaid with calculated average degradation rates per system type.
The systems shown in each PFD include an average degradation rate and an average maximum degradation rate.
The data-driven model
The goal of Pinnacle’s study was to utilise its ML model to accurately predict degradation rates for reformer units and associated assets and compare the model’s results to the current industry practice. While the exact implementation details of this model are beyond the scope of this article, it is important to differentiate the work of Pinnacle’s ML model from the current industry practice.
For example, consider degradation due to hydrochloric acid (HCl). This type of degradation can be modeled using the methodology found in API 581 or expertise from a materials and corrosion engineer. For HCl corrosion, API 581 specifies the expected degradation rate of an asset as a function of its metallurgy, temperature, pH, etc. Materials and corrosion engineers would typically use this information, in addition to a wealth of experience, to estimate degradation rates for a given asset. In this sense, the method prescribed by API 581 is a set of ‘rules’ that describe how degradation is expected to proceed given certain variables.
While such information is incredibly valuable, it can also be somewhat misleading when carrying out analysis because, for example, multiple damage mechanisms may be active at once and may interact at the molecular level. Additionally, a fixed set of rules can limit which information is considered useful in predicting degradation. The goal of Pinnacle’s ML model is to learn how to deal with common situations using all available and pertinent data.
Pinnacle’s ML model is not explicitly coded with rules similar to API 581. Instead, the model is fed data examples that describe a given asset or component (operating conditions, process stream data, etc.) along with the measured degradation rates. The model then uses this data to learn how different variables impact degradation rates and will make inferences, such as that higher temperatures generally coincide with higher degradation rates, without being explicitly told that this is the case. By using data-driven, inference-based learning, the model is able to make better, more informed predictions.
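The idea of learning a relationship from examples rather than coding it as a rule can be shown with a deliberately small sketch. The article does not disclose the model's implementation, so the code below swaps in ordinary least squares on a single variable, with made-up temperatures and rates, purely to show that the positive temperature trend comes out of the data and is never written into the code.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training examples (not from the study):
# operating temperature in deg F and measured degradation rate in mils/yr.
temperatures_f = [200.0, 300.0, 400.0, 500.0, 600.0]
measured_rates = [1.0, 1.8, 3.1, 3.9, 5.2]

slope, intercept = fit_line(temperatures_f, measured_rates)

# The fitted slope is positive: the "hotter means faster" relationship
# was inferred from the examples, not supplied as a rule.
predicted_rate_at_450f = intercept + slope * 450.0
```

A production model would use many variables at once and a far more flexible learner, but the principle is the same: the relationships are estimated from measured outcomes.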
In addition to predicting degradation rates, the model can assess the relative importance of each variable in the dataset as illustrated in Figure 6.
Figure 6: ML model variable relative importance.
Certain variables are more important to the model than others. For example, operating temperature was found to be the most relevant variable in predicting degradation rates, followed by operating pressure and location (operator site) of the reformer. Stream information, such as hydrogen mole% and H2S ppm, is also considered highly informative. In contrast, water mole%, while providing some boost to the model’s performance, is far less important than some of the other variables because these are typically dry systems.
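The article does not say how the relative importance in Figure 6 was computed. One common, model-agnostic way to obtain such a ranking is permutation importance: shuffle one variable at a time and measure how much the prediction error grows. The sketch below assumes that technique; `toy_model`, its coefficients, the column names and the data are all hypothetical stand-ins and are unrelated to the study's results.

```python
import random


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def permutation_importance(predict, rows, targets, feature_names,
                           repeats=20, seed=0):
    """Average increase in MAE when each feature is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(targets, [predict(r) for r in rows])
    importance = {}
    for name in feature_names:
        increases = []
        for _ in range(repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            # Copy each row, replacing only the feature under test.
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
            error = mean_absolute_error(targets, [predict(r) for r in permuted])
            increases.append(error - baseline)
        importance[name] = sum(increases) / repeats
    return importance


def toy_model(row):
    """Hypothetical stand-in for a trained model (coefficients are made up)."""
    return (0.01 * row["temperature_f"]
            + 0.002 * row["pressure_psig"]
            + 0.1 * row["water_mol_pct"])


# Hypothetical rows; targets are generated by the stand-in model itself,
# so the baseline error is zero and any increase is due to shuffling.
rows = [
    {"temperature_f": 200.0 + 80.0 * i,
     "pressure_psig": 100.0 + 50.0 * i,
     "water_mol_pct": 0.1 * i}
    for i in range(6)
]
targets = [toy_model(r) for r in rows]

importance = permutation_importance(
    toy_model, rows, targets,
    ["temperature_f", "pressure_psig", "water_mol_pct"],
)
ranking = sorted(importance, key=importance.get, reverse=True)
```

A variable the model barely relies on, such as the water term in this toy example, produces only a small increase in error when shuffled, which is the behaviour the text describes for water mole%.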
Figure 7 is an example output from the data-driven model, illustrating the predicted degradation rates for a drum from one operator in the dataset. The actual observed degradation rates from inspection history are illustrated by the teal line and the machine learning, data-driven predicted rates are shown by the grey line. The data-driven distribution indicates that rates around 4 mils/yr are likely, whereas rates greater than 7 mils/yr and less than 1 mil/yr are unlikely. The vertical green line represents the average degradation rate (around 2 mils/yr) and the black vertical line shows the modeled degradation rate (18 mils/yr), which was calculated leveraging API 581 3rd Edition and a materials and corrosion SME’s judgement.
Figure 7: Example degradation rates for a drum.
The study’s results show that the machine learning based model was much closer to the measured reality of the component than the industry standard approach (API 581) and quantified the uncertainty associated with the variability of degradation rates. This quantification of uncertainty is key to better understanding actual risk, which will improve the ability of facilities to predict potential failures, and more importantly, when to take action to effectively mitigate the risk.
Although Figure 7 illustrates only a single example, this analysis was conducted on a population containing over 10 000 assets. Table 1 contains the results comparing the mean absolute error of the industry standard approach and ML model as compared to actual degradation for this larger population.
Table 1: Mean absolute error for both the industry standard approach and ML model compared to actual degradation.
Not only was the ML model able to predict degradation with approximately 38% less error, but the model can also be run in near-real time on the entire population of assets. This enables timely evaluation of changes in degradation as process conditions change, for example during an integrity operating window (IOW) excursion or when an owner/operator needs to quickly simulate the potential impacts of a change in feedstock.
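For readers less familiar with the metric, mean absolute error is the average of the absolute differences between predicted and measured rates, and the relative reduction compares the two approaches on the same assets. The sketch below shows the arithmetic only; the rates are invented for illustration and do not reproduce Table 1 or the 38% figure.

```python
from statistics import fmean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between measured and predicted rates."""
    return fmean(abs(a - p) for a, p in zip(actual, predicted))


def relative_error_reduction(mae_reference, mae_candidate):
    """Fraction by which the candidate's MAE is below the reference's MAE."""
    return 1.0 - mae_candidate / mae_reference


# Hypothetical degradation rates in mils/yr for four assets.
measured = [2.0, 4.0, 1.0, 6.0]
industry_standard = [6.0, 9.0, 5.0, 10.0]
ml_model = [3.0, 3.0, 2.0, 8.0]

mae_standard = mean_absolute_error(measured, industry_standard)  # 4.25
mae_ml = mean_absolute_error(measured, ml_model)                 # 1.25
reduction = relative_error_reduction(mae_standard, mae_ml)

print(f"Industry standard MAE: {mae_standard:.2f} mils/yr")
print(f"ML model MAE: {mae_ml:.2f} mils/yr")
print(f"Relative reduction: {reduction:.0%}")
```

Because MAE is expressed in the same units as the degradation rate, the two figures can be read directly as the typical size of the miss in mils/yr.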
While current industry practice and subject matter expertise are not always as overconservative as depicted in Figure 7, this study shows that there is an opportunity for the industry to leverage data analytics to identify potential threats sooner, reduce uncertainty to improve predictability, and focus limited resources to drive the largest performance impact.
Constraints and limitations
While there was a large volume of data for the ML model to use, significant gaps in the data limited the performance of the model. For example, operating temperature and pressure were present in only about 70% of the dataset and stream information (H2S concentration, etc.) was even more scarce. Additionally, the dataset only contained single values for all variables even though these quantities typically fluctuate over time. Despite these limitations, the model still performed well and gave more accurate predictions than current industry standards. As both the volume and quality of the data improve, so will the ability of the model to accurately predict degradation rates.
The future of data in reliability
While Pinnacle’s study focused exclusively on predicting degradation rates, the possibilities of using data analytics within the industry extend well beyond this. As the amount of data facilities collect, organise, and analyse increases daily, the industry must make better use of advanced technologies such as machine learning to continue to evolve current capabilities and potential applications. When these technologies are properly integrated into decision-making processes, facilities will be able to make smarter reliability decisions and realise a step change in performance.
Written by Andrew Waters, Rachel Salaiz and Ryan Myers, Pinnacle.
Read the article online at: https://www.hydrocarbonengineering.com/special-reports/11032021/data-driven-reliability/