Prosys OPC Blog

Why Industrial DataOps Should Be Part of Every Modern Factory

It’s been almost twenty years since I started working on OT-IT integration projects with factories. During that time, I have seen many analytics and AI projects succeed, and quite a few fail. The successful ones almost always shared the same recipe: someone had done the unglamorous work of getting plant data into a clean, well-modeled format, which made the analytics easy.

That work has now gotten a proper name. It is called Industrial DataOps, and although the term itself is fairly new, the practice behind it is one I have quietly recommended to customers for years.

In this blog post, I’ll explain what Industrial DataOps actually means, why I think every factory should have it, why security and open standards belong at the center of the discussion, and why AI does not make any of this less important.

What Industrial DataOps means

There are quite a few definitions floating around. The one I find most useful is simple. Industrial DataOps is the practice of moving plant data from the OT side of the factory to the IT and cloud applications that consume it, in a way that ensures the following:

  1. The data is reliable and validated
  2. The data is modeled with a meaningful information model
  3. The interfaces are standard
  4. The OT data sources are properly secured
  5. There is governance over what changes and who can access what

Most factories already do these things, but in fragmented ways. Every MES project does its own tag mapping. Every analytics initiative builds its own pipeline. Every cloud rollout argues about units, naming, and access for months. Industrial DataOps is what you get when you stop doing this work over and over again, and start doing it once, properly, in a dedicated layer between OT and IT.

This is, of course, exactly the role I have always given to a good Edge application. The difference is that the wider DataOps movement has brought some discipline from the IT side — versioning, data catalogs, governance — and that discipline is now showing up in industrial deployments as well.

Why every factory should have it

Industry surveys indicate that data scientists spend 60 to 80 percent of their time on data preparation: finding, cleaning, and harmonizing data before any analysis can begin. In factories, the share is often higher than that, simply because each site stores its tags slightly differently, each machine vendor uses its own jargon, and each PLC programmer has their own habits.

Without an Industrial DataOps layer, this work has to be redone in every new project. Budget disappears before the actual business problem is even touched, and the result is usually a one-off solution that nobody else can reuse. With Industrial DataOps in place, the modeling and harmonization are done once, near the source, by the people who actually understand the process. After that, every new MES extension, analytics tool, AI agent, cloud system, or other IT system inherits clean and contextualized data.
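To make "done once, near the source" a little more concrete, here is a minimal Python sketch of harmonization. Everything in it is an assumption made for illustration: the site names, the raw tag names, the common name `line3.feed_flow`, and the choice of m3/h as the shared unit are hypothetical, and a real deployment would take them from an information model rather than from hard-coded dictionaries.

```python
from dataclasses import dataclass

# Hypothetical per-site tag maps: raw PLC tag -> (common name, unit of the raw value).
SITE_TAG_MAPS = {
    "site_a": {"FT101_PV": ("line3.feed_flow", "l/min")},
    "site_b": {"Flow_Feed_L3": ("line3.feed_flow", "m3/h")},
}

# Conversion factors into the unit chosen for the common model (m3/h in this sketch).
TO_M3_PER_H = {"m3/h": 1.0, "l/min": 0.06}


@dataclass(frozen=True)
class HarmonizedValue:
    name: str
    value: float
    unit: str


def harmonize(site: str, raw_tag: str, raw_value: float) -> HarmonizedValue:
    """Translate one raw reading into the shared name and unit."""
    name, raw_unit = SITE_TAG_MAPS[site][raw_tag]
    return HarmonizedValue(name, raw_value * TO_M3_PER_H[raw_unit], "m3/h")


# Two sites report the same physical quantity under different tags and units.
a = harmonize("site_a", "FT101_PV", 100.0)
b = harmonize("site_b", "Flow_Feed_L3", 6.0)
print(a)
print(b)
```

Both readings come out under the same name and unit, so a consumer downstream never needs to know how each site names or scales its tags.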

This is not a particularly new argument. It is the same argument we have been making for years about the need for a real Edge application in every factory. What has changed is that the rest of the industry now agrees, and there is finally a shared vocabulary.

Industrial DataOps architecture showing OPC UA based connectivity between IT and OT systems, including ERP, MES, AI, cloud, PLCs, historians, and databases.

Security is not a side topic

In most Industrial DataOps articles I read, security is mentioned briefly and then dropped. I find that strange, because in the current world with increasing regulation and overall awareness, security is no longer a side topic at all. NIS2 in Europe, IEC 62443 globally, and OT-targeted attacks and incidents have moved OT cybersecurity into the broader security discussion.

NIS2 in particular spells out what is now expected of a factory: documented access control, network segmentation between OT and IT, encrypted communication, centralized logging, and prompt incident reporting. Looking at the list, it is easy to recognize that these are not isolated requirements, but properties that a well-designed Industrial DataOps layer already has, because the same boundary that filters and harmonizes the data is also the natural place to enforce these rules. In practice, much of the NIS2 baseline comes almost for free.

A proper Industrial DataOps layer performs some of the most important security work in the factory. It isolates the automation network from the IT and cloud networks, so no application or user on the IT network can communicate directly with a PLC. It enforces access control at a single point for all connections requesting plant data. It terminates non-secure legacy automation protocols within the OT zone, while offering outside access via modern, secure protocols. And it logs data flows, which is more or less what NIS2 and similar regulations now require.
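The "single point" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the client names, the data paths, the `READ_PERMISSIONS` table, and the `read_value` function are all hypothetical, and a real boundary layer would rely on authenticated sessions and a secure protocol rather than a plain function call.

```python
import logging

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("dataops.audit")

# Hypothetical access rules: client identity -> data paths it may read.
READ_PERMISSIONS = {
    "mes": {"line3.feed_flow", "line3.batch_id"},
    "analytics": {"line3.feed_flow"},
}

# Stand-in for values the boundary layer has already collected from the OT side.
CURRENT_VALUES = {"line3.feed_flow": 6.0, "line3.batch_id": "B-0042"}


class AccessDenied(Exception):
    """Raised when a client asks for data it is not allowed to read."""


def read_value(client: str, path: str):
    """The one entry point for IT-side reads: check, log, then serve."""
    allowed = path in READ_PERMISSIONS.get(client, set())
    # Every request is logged, whether it is granted or refused.
    audit_log.info("client=%s path=%s allowed=%s", client, path, allowed)
    if not allowed:
        raise AccessDenied(f"{client} may not read {path}")
    return CURRENT_VALUES[path]


print(read_value("mes", "line3.batch_id"))
try:
    read_value("analytics", "line3.batch_id")
except AccessDenied as exc:
    print("denied:", exc)
```

Because every read passes through one function, the access rules and the audit trail live in one place, and no consumer ever holds a direct connection to a PLC.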

So the same architectural decision that solves your data problem also solves a big part of your security compliance problem. That alone is a reason to take Industrial DataOps seriously.

Open information models to prevent vendor lock-in

Information modeling is the most time-consuming part of an Industrial DataOps implementation. Deciding what a pump, a batch, a tank, or a quality sample looks like as data takes real effort, and once it has been done, redoing it is expensive. Most of the labor in rolling out an Industrial DataOps layer goes into this kind of work.

This is also where vendor lock-in tends to sneak in unnoticed. If your DataOps platform uses its own proprietary metamodel, every hour spent modeling assets becomes an investment in that vendor’s ecosystem. The data is still yours on paper, but the meaning of the data lives inside someone else’s tool. Replacing the platform later means redoing the most expensive part of the project, which usually means it does not get replaced at all.

The way to avoid this is to use open, standardized information models. Two families are especially important. They rely on the same markup format and overlap somewhat:

  • OPC UA Companion Specifications. Built by industry associations in collaboration with the OPC Foundation, these define standard models for machine tools, robotics, pumps, packaging machinery, process automation devices, weighing instruments, and many other domains. The semantics, browse names, and structures are public.
  • CESMII Smart Manufacturing Profiles. Gathered and developed by the U.S. Smart Manufacturing Institute, these define reusable data models for manufacturing assets that align closely with OPC UA and ISA-95.

When the DataOps layer uses these open models, the modeling work travels with you if you ever change vendors. New devices and IT systems that already speak the same standards connect with a fraction of the integration effort. The most expensive part of the implementation becomes the most portable, which I think is the most important architectural decision in the whole project. So when choosing an Industrial DataOps product, the main question should be whether its semantics are open or proprietary, nothing else.

AI and Industrial DataOps

Factory leaders these days are getting the same question from their management: What is our AI strategy? Many people assume that modern AI is so powerful that it can be pointed at raw plant data and somehow figure things out on its own. As discussed already, that is a recipe for failure.

Large language models, machine learning, and autonomous agents all need context. A model that does not know whether a temperature is in Celsius or Fahrenheit, or whether a tag called FT-101 is a flow meter on line 3 or line 7, will happily produce confident nonsense. An agent that cannot tell a maintenance event from a setpoint change will make decisions that you really do not want it to make.
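The unit problem is easy to show in code. The following Python sketch is illustrative only: the `Reading` type, the tag `TT-205`, the asset path, and the unit strings `degC` and `degF` are hypothetical names chosen for this example, not part of any standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reading:
    tag: str
    value: float
    unit: Optional[str] = None   # "degC" or "degF" in this sketch
    asset: Optional[str] = None  # where the sensor sits, e.g. "line3/reactor-1"


def to_celsius(reading: Reading) -> float:
    """Convert to Celsius, but only when the unit is actually known."""
    if reading.unit == "degC":
        return reading.value
    if reading.unit == "degF":
        return (reading.value - 32.0) * 5.0 / 9.0
    # Without context the only safe answer is to refuse, not to guess.
    raise ValueError(f"{reading.tag}: unit unknown, refusing to guess")


raw = Reading(tag="TT-205", value=212.0)
contextualized = Reading(tag="TT-205", value=212.0, unit="degF", asset="line3/reactor-1")

print(to_celsius(contextualized))  # 100.0
try:
    to_celsius(raw)
except ValueError as exc:
    print(exc)
```

The same number, 212.0, is either boiling water or something far hotter depending on the unit. A consumer that receives only the raw reading has no way to tell which, whereas the contextualized reading carries the answer with it.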

Contextualized data with proper semantics and reasonable quality is exactly what Industrial DataOps provides. The OPC Foundation, CESMII, and others are already working on protocols to enable AI agents to consume industrial data in a structured, safe way. Those protocols only help if there is something coherent on the other end.

AI does not make Industrial DataOps less relevant; it makes it essential. The factories that derive the best value from industrial AI today and in the future are those with the cleanest, best-modeled, and most accessible data. The size of the AI model will, in the end, matter less than the quality of the data going into it.

Conclusion

Industrial DataOps is a fairly new term, but the practice behind it has been part of every good OT-IT project I have worked on for years. It is the discipline of treating plant data as a strategic asset: collected, modeled, secured, and reused in a controlled way through a layer that sits between OT and IT.

For factory leadership, the takeaway is reasonably simple. Industrial DataOps is worth treating as a strategic investment rather than as a side project to be pursued during the next analytics pilot. Security and standards-based modeling should be set as ground rules from day one. The data modeling work, which is the most expensive part, should be done on open standards such as OPC UA Companion Specifications and CESMII Smart Manufacturing Profiles, so you do not end up locked into a single vendor for the part that is hardest to redo. And when you plan for AI, the best thing you can do is give it data it can actually trust.

More Information and Testing

At Prosys OPC, we have been building products and services around exactly these principles for many years. Our Prosys OPC UA Forge is designed to be the Industrial DataOps backbone of a modern factory. It connects OT systems, harmonizes data using open OPC UA information models, manages the OT/IT security boundary, and delivers trusted data to MES, analytics platforms, and AI consumers downstream.

If you would like to discuss your Industrial DataOps strategy or evaluate Forge for your own plant, feel free to contact me directly at pyry.gronholm@prosysopc.com or reach out through our contact form.


Pyry Grönholm
