Master's Thesis Recap: Industrial Time Series Analysis using Deep Learning

At the end of 2024, I had completed all the coursework for Aalto University’s Master’s Programme in Automation and Electrical Engineering, and it was time to choose a topic for my thesis with Prosys OPC. After a few initial discussions, the company’s CEO, Pyry Grönholm, proposed an intriguing idea: what if we could analyze the time series data that flows through many of our software products? The field of time series analysis was an uncharted territory both for the company and myself, differing greatly from the usual OPC UA related topics.

Time series is a common data format consisting of measurements recorded at regular intervals and arranged in time order. The field of time series analysis refers to analyzing certain aspects of this data. My thesis covers two of the most popular areas in this field, which are forecasting and anomaly detection. In forecasting, we try to predict future data based on past measurements, while in anomaly detection the aim is to find abnormalities from the data. Time series analysis has taken massive steps in recent years thanks to the development of deep learning, where neural networks are used to solve complex tasks with a large amount of data. My thesis focuses on industrial time series data that are often sensor measurements from machines and automation processes. The aim was to study and benchmark modern deep learning -based models on this type of data.

There are many relevant use-cases and benefits in analyzing the data coming from your industrial automation system. Anomaly detection allows condition monitoring and predictive maintenance which are two of the most impactful: by detecting early signs of equipment wear or unusual behavior, companies can prevent unplanned downtime and keep machines from breaking. Forecasting, on the other hand, helps with planning the process and can even predict when something unusual might happen. While the analysis can be done with simple algorithms, modern time series analysis models work better on complex type of data and do not require much manual configuration.

The full thesis is available in Aalto’s archive.

Test Setup

The practical part of my thesis centered around testing different time series analysis models on diverse industrial datasets. These datasets consisted of various measurements, such as temperature, pressure or water level, each with thousands of data points. Some of the datasets also had known anomalies. These anomalies can appear as unusually high or low values, sudden spikes, or changes in the overall data pattern.

Using these datasets, I benchmarked a couple dozen of the most popular time series analysis models. In forecasting, I ranked them based on the absolute error between the forecast and the actual value. For anomaly detection, I used the optimal F1-score that describes a balance between finding all anomalies with as few false positives as possible.

Results

In time series forecasting, the models can roughly be divided into four architectures, which are Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), Transformers and Multi-Layer Perceptrons (MLP). The major finding of my thesis was that the MLP-based models consistently outperformed the other architectures. This is good news since these models are also the lightest in terms of computation time, which meant that they are fast enough to compute without a GPU within a reasonable amount of time. A model named NHITS was the most accurate in the benchmarks, while NLinear and DLinear best combined both accuracy and computation time.

From the forecasts in the first image below, we can see that best models are able to accurately forecast periodic data. The second image showcases how the models can capture the pattern of the data quite well, but as the forecasting horizons extends, the trend of the data often loses accuracy.

Forecast on industrial cutting machine data
Forecast on periodic data from industrial cutting machine using NHITS model.
Forecast on data from three sensors.
Forecast on data from three sensors using DLinear model.

In anomaly detection, there are two main approaches, which are based on forecasting and reconstruction. My thesis focuses on the latter, where the model is trained with normal data without anomalies. It learns to accurately reconstruct normal data after it first compresses it, but once abnormal data is fed to the model, the reconstruction is poor and this would be flagged anomalous. Models named MAD-GAN, VAE and LSTM-AD were found to be the most accurate anomaly detection models in the benchmarks. Same major challenges in anomaly detection are setting the sensitivity and selecting the window size, which determines how large a chunk of data is analyzed at once.

The image below shows machine temperature data where the dataset provider informed of three incidents of machine failures, marked with green. The red dots were the anomalies detected by the MAD-GAN anomaly detection model. The lower plot shows the anomaly score where a higher value indicates a higher probability of abnormal behavior. As we can see, the model was able to detect all anomalies, although it was set slightly too sensitive and therefore detected some smaller spikes as well.

Anomaly detection on machine failure data.
Anomaly detection on machine failure data using MAD-GAN.

Conclusions and Future Work

The main conclusion of my thesis was that modern deep learning -based time series analysis models can be a viable option to be used for industrial data but their actual applicability heavily depends on the nature of the data. Excessive randomness in the data can make it impossible to analyze. Also, intended changes it automation process can easily be flagged by the anomaly detection models, and also be difficult to forecast. For these reasons, the results produced by such models should never be taken at face value. Instead, they should be treated as decision-support tools that complement domain expertise rather than replace it.

As a continuation to my thesis, I have started to develop a prototype plugin for OPC UA Forge that can be used to analyze the time series data through its history API. It has a web UI similar to Forge where the user can easily the run the models with the settings of their choosing. The results are plotted where you can see the predicted data or the spotted anomalies. In the future, these results could perhaps be sent back to Forge, for example as OPC UA alarm events. The continuation of this project depends heavily on whether our customers would find this type of functionality useful in their actual industrial environment. If you are interested in such feature or would even like to test the prototype plugin, please contact our sales team.

Headshot of Artturi Korhonen

Artturi Korhonen

Software Engineer

Email: artturi.korhonen@prosysopc.com

Related Posts

Displaying Sensor Data with Forge’s Grafana Plugin

In our latest blog post we demonstrate how OPC UA Forge can receive air-quality sensor measurements from Ruuvi devices and stream them through an MQTT broker into a structured information model in Forge. The data is then visualised in real-time or historically via the Forge Grafana Plugin — providing a seamless, no-code path from sensor to dashboard.

Read More »

Interested in this topic?

Get updated about new posts through our newsletter!