After getting a feeling for the Aquafin pump station data, we took a step back. Of course it is always fun to play with data and see what is in there, but knowing what to look for tends to make things a tad more efficient. The challenge is to detect pump failure, so the question we need to ask is: what defines a failed pump?
The possible causes we found in this dataset:
With the basic tools in place and a reasonably clear idea of what we need, it is time to rally the troops. We created a small overview of what we’ve done so far and posted it in Slack (we’re still in quarantine). With a little help from our in-house scientists, we managed to do some nice feature engineering that gives us a better view of what is going on in the data. Finding anomalies is essentially finding fingerprints of not-normal behaviour.
Since the data is (fortunately) dominated by normal behaviour, we did some digging into what normal actually looks like. A lot of things are going on in the stations: pumps starting and stopping based on water levels, rain… but we believe we managed to find some decent definitions of “normal”.
As a next step, we decided to focus on wet periods, with large inflows of water, to target specific malfunctions.
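To make the idea of focusing on wet periods a bit more tangible, here is a minimal sketch in Python. It is not our actual pipeline: the `Sample` fields, the thresholds and the rule "high water level while no pump is running" are all assumptions chosen for illustration. The sketch first marks wet periods using a trailing mean of the inflow, and then only looks for suspicious samples inside those periods.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    # All fields are hypothetical; real station data will look different.
    inflow: float   # estimated inflow, e.g. in m3/h
    level: float    # water level in the station, e.g. in m
    pump_on: bool   # True if at least one pump is running


def wet_mask(samples: List[Sample], window: int = 3,
             inflow_threshold: float = 50.0) -> List[bool]:
    """Mark samples whose trailing mean inflow exceeds the threshold."""
    mask = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        chunk = samples[start:i + 1]
        mean_inflow = sum(s.inflow for s in chunk) / len(chunk)
        mask.append(mean_inflow > inflow_threshold)
    return mask


def suspicious_indices(samples: List[Sample], high_level: float = 2.0,
                       window: int = 3,
                       inflow_threshold: float = 50.0) -> List[int]:
    """Within wet periods, flag samples with a high level but no pump running.

    This rule is an assumed example of a malfunction fingerprint,
    not a definition taken from the dataset.
    """
    mask = wet_mask(samples, window, inflow_threshold)
    return [
        i for i, (s, wet) in enumerate(zip(samples, mask))
        if wet and s.level > high_level and not s.pump_on
    ]
```

Calling `suspicious_indices(samples)` on a list of `Sample` objects returns the positions worth a closer look. Restricting the check to wet periods keeps dry-weather behaviour, where pumps are legitimately idle most of the time, from flooding the results.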
While part of the team is working on the pump stations, the Fluvius reporting tool story is taking shape as well. At Kapernikov, we have a strong background in inventory campaigns through our work at Infrabel. Organising campaigns like these takes more than just the tech skills to set up IT infrastructure. Organisation, training and collaboration between the field and the offices are all key to making a campaign successful.
Seemingly trivial things can turn on you when sending out people all over the country to collect data. Missing crucial photos, for example, can make it impossible to verify the correctness of survey results. This becomes costly as well, when it means sending field agents back for a revisit.
It is interesting to think of the contrasts in working in the world of data. On the one hand, we can geek around analysing large sets of time series data to find anomalies. On the other hand, we’re looking for the most logical flow and best questions to ask to get accurate information that can be verified off-site.
It looks like we are getting somewhere, and that is a good thing, because the deadline is at the end of this week. In the coming days, we will wrap up our submission, and when that’s done, we’ll conclude with a final post here as well.