Data science in Industry 4.0

Big data, machine learning, product digital twins and predictive maintenance: are these just marketing buzzwords, or is there something real behind them? Industry 4.0 is becoming more and more tangible, but far from everyone can picture what it will look like in the near future. At the same time, it is clear that it will become a reality in all industries, most likely through an evolutionary process.

Virtually every industrial magazine has published articles about Industry 4.0 and the problems it is supposed to address, so interested readers can easily look them up. In this short article, we would like to share our experience and our vision of the technologies that might help solve those problems.

Data science. How is it connected to Industry 4.0?

Why is the big data concept, along with its analytical methods, so important for implementing Industry 4.0? To make a factory smart and self-sufficient, it needs to be trained, and for that we need to apply the best machine learning practices to manufacturing processes. The main purpose of machine learning is to develop algorithms that make predictions based on data. These algorithms adaptively improve their performance as the number of learning samples increases, which is why accurate data becomes crucial.

What is data, then? It is a set of grouped metrics describing the properties of an object. Everything is data if you look closely enough.

And what are the methods? Surprisingly, some of them have been around for a couple of centuries. The fundamental mathematical methods behind machine learning problems such as data clustering, classification and regression were already published by scientists in the 19th and 20th centuries. Linear regression based on least squares, published by Gauss in 1809, is a good example. For a long time, what held back any practical learning was the absence of accurate and sufficient data. Now, thanks to developments in the Internet of Things and the availability of modern, powerful IT infrastructure, we can collect more and more data about manufacturing processes.
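To make the least squares idea concrete, here is a minimal sketch in Python using only the standard library. The function name fit_line and the load and temperature values are invented for illustration; they are not taken from a real machine.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean, y_mean = fmean(xs), fmean(ys)
    # Sum of squared deviations of x, and of the x/y co-deviations
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical data: machine load (%) and motor temperature (deg C)
load = [10, 20, 30, 40, 50]
temperature = [35.0, 40.0, 45.0, 50.0, 55.0]

slope, intercept = fit_line(load, temperature)
print(f"temperature = {slope:.2f} * load + {intercept:.2f}")
```

With more and better samples the fitted line becomes a more reliable predictor, which is exactly why data collection matters so much.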

What does it all mean, and where is it going?

As mentioned at the beginning of this article, the new Industry 4.0 paradigm is arriving in small steps. For example, before we reach the Autonomous Factory level, the transparency of technological processes must be ensured. This again means that qualified data has to be monitored and collected while manufacturing processes are running. As one of the next steps, we can expect the collected data to grow in volume and to become more accurate and representative. This is what makes machine learning possible and efficient. It will in turn help solve important business problems such as time-referenced predictive maintenance of factory equipment, raising overall equipment performance to the next level, and optimizing supply chain management and other related processes.
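As a very simplified stand-in for the kind of check a predictive maintenance pipeline might start from, the sketch below flags sensor readings that drift away from a healthy baseline. The function flag_anomalies, its parameters and the vibration values are all assumptions made for illustration; a real system would more likely rely on trained models and far more data.

```python
from statistics import fmean, pstdev


def flag_anomalies(readings, baseline_size=5, z_limit=3.0):
    """Return indices of readings that deviate strongly from a baseline.

    The first `baseline_size` readings are assumed to represent healthy
    operation; later readings are compared against their mean and spread.
    """
    if len(readings) <= baseline_size:
        raise ValueError("need more readings than the baseline size")
    baseline = readings[:baseline_size]
    mean = fmean(baseline)
    spread = pstdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation; z-score is undefined")
    return [
        i
        for i, value in enumerate(readings[baseline_size:], start=baseline_size)
        if abs(value - mean) / spread > z_limit
    ]


# Hypothetical vibration amplitudes (mm/s), sampled once per hour
vibration = [2.0, 2.1, 1.9, 2.0, 2.0, 2.1, 2.0, 2.6, 3.1]

suspicious = flag_anomalies(vibration)
print("readings to inspect:", suspicious)
```

Because every reading has a position in time, the first flagged index also tells us roughly when the equipment started to behave differently.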

And what about the technologies?

Industrial standards, of course, differ from those widespread in consumer markets, but the basic principles are the same. We therefore see it as a great advantage that working systems already exist on the Internet of Things market: smart homes and predictive systems in, for example, the health, insurance and e-commerce domains. These technologies can be (and will be) reused in industrial automation as well. The similarity is visible in all layers, starting with the same IoT devices connected to sensors for equipment temperature, sound and vibration, which send runtime telemetry to an edge computer in the factory or directly to the cloud via MQTT or another suitable protocol. For data analysis, modern and rapidly developing tools such as Python and R can be used in the same way as in today's consumer applications. Quick access to big data and its visualization can be achieved with tools such as Elasticsearch and Kibana. The amount of data is growing fast everywhere, and the corresponding technologies are evolving even faster. We just have to choose the most suitable ones and bring them, together with proven experience, to the Industry 4.0 field.
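To illustrate the telemetry layer, here is a small sketch of how one sensor reading could be packaged as a topic and a JSON payload before being handed to an MQTT client. The Telemetry fields, the topic layout, the device name and the site name are assumptions for illustration only. The actual publishing step, which would require an MQTT client library and a broker, is deliberately left out.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Telemetry:
    """One reading from a device: a set of grouped metrics (hypothetical schema)."""
    device_id: str
    timestamp: str          # ISO 8601, UTC
    temperature_c: float
    vibration_mm_s: float
    sound_db: float


def to_mqtt_message(reading, site="factory-1"):
    """Return a (topic, payload) pair for one telemetry reading."""
    # Assumed topic layout: <site>/telemetry/<device>
    topic = f"{site}/telemetry/{reading.device_id}"
    payload = json.dumps(asdict(reading), sort_keys=True)
    return topic, payload


# Hypothetical reading from a press equipped with three sensors
reading = Telemetry("press-07", "2024-01-01T12:00:00Z", 61.5, 2.3, 78.0)
topic, payload = to_mqtt_message(reading)
print(topic)
print(payload)
```

The same JSON documents could then be indexed for search and visualization or loaded into Python or R for analysis.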