Automated nut quality control with AI

Published on: September 14, 2022

Food production is one of the world’s major industries. The high production load requires the automation of the entire process, including the collection, processing and sorting of products. One of the main challenges when it comes to nuts is fast and accurate automatic sorting by quality and type. For example, it is necessary to detect embedded shell fragments or to sort different batches of nut products that can vary in shape or color. To tackle these quality control problems, we need to develop a segmentation algorithm that is efficient and robust to the variations in the products we load into the machine. In this article, we will show how a deep learning algorithm can be used to detect hazelnut defects.


We created a multi-class dataset of nut images (cashews and peanuts) using our client’s sorting machine. Because these images could reveal confidential details of our client’s optical setup, we’re not allowed to publish them, but for demonstrative purposes we show in fig. 1 a few ground truth masks labeled manually. We aim to separate pixels associated with nuts (foreground) from the rest of the image (background).

Figure 1. Ground truth image masks with cashews (left image) and peanuts (right image).
White pixels correspond to nuts, and black pixels correspond to the background.

For study purposes, we also provide a link to our code and instructions to train this model on the MVTec anomaly detection dataset, which is designed to benchmark anomaly detection methods with a focus on industrial inspection. Keep in mind that this dataset can only be used for non-commercial purposes. We limit ourselves to one particular type of object, the hazelnut, which is ideally suited to the task of food quality control discussed here.


We implemented the Industrial U-Net architecture developed by NVIDIA using the Keras deep learning API. The network architecture is shown in fig. 2 and, briefly, consists of compression and decompression paths with skip connections between the same levels (shown as gray arrows). The purpose of the compression part is to extract features specific to the objects you are trying to segment (nuts or defects). The decompression part localizes the extracted features, restoring the mask with predictions. Skip connections help recover spatial information lost during compression.

Figure 2. Schematic representation of the Industrial U-Net architecture, adapted from the NVIDIA catalog.
Graphical notations for different mathematical operations are explained in the legend.
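To make the roles of the compression path, decompression path, and skip connections concrete, here is a toy NumPy sketch (not our actual Keras implementation) that mimics one level of the U shape. Max pooling stands in for the compression step, nearest-neighbour upsampling for the decompression step, and stacking the two feature maps plays the role of a skip connection:

```python
import numpy as np

def downsample(x):
    """2x2 max pooling: halves spatial resolution (compression path)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour 2x upsampling (decompression path)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

# One level of the U: compress, decompress, then concatenate the
# original full-resolution features via a skip connection.
x = np.arange(64, dtype=float).reshape(8, 8)  # toy feature map
encoded = downsample(x)        # (4, 4): detail is lost here
decoded = upsample(encoded)    # (8, 8): resolution restored, detail still lost
skip = np.stack([x, decoded])  # (2, 8, 8): skip connection reinjects detail
```

The decoded map alone is a coarse version of the input; concatenating the encoder's full-resolution features is what lets the following convolutions recover sharp mask boundaries.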

Our implementation differs from the original NVIDIA architecture in the following ways:

  • We made the U-Net architecture dynamic so that the user can specify the number of blocks as well as the number of convolutional filters in each block.
  • We use the focal Tversky loss function with γ = 4/3 as a focal parameter to optimize our network. This loss function yields better results on small objects.
  • We use a mean Intersection over Union (IoU) averaged over all classes as a target metric.
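Our actual implementation expresses the loss and metric as Keras tensors, but the underlying formulas are simple. The sketch below shows them in plain NumPy: the focal Tversky loss raises one minus the Tversky index to the power γ = 4/3, which focuses training on hard, poorly segmented (typically small) objects, and mean IoU averages per-class Intersection over Union. The α and β weights shown are common defaults, not necessarily the values we used:

```python
import numpy as np

def focal_tversky_loss(y_true, y_pred, alpha=0.7, beta=0.3, gamma=4 / 3, eps=1e-7):
    """Focal Tversky loss for one class.

    y_true, y_pred: arrays of the same shape with values in [0, 1].
    alpha weights false negatives, beta weights false positives;
    gamma > 1 emphasizes hard examples with a low Tversky index.
    """
    tp = np.sum(y_true * y_pred)
    fn = np.sum(y_true * (1.0 - y_pred))
    fp = np.sum((1.0 - y_true) * y_pred)
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return (1.0 - tversky) ** gamma

def mean_iou(y_true, y_pred, num_classes):
    """Intersection over Union averaged over all classes.

    y_true, y_pred: integer label maps of the same shape.
    """
    ious = []
    for c in range(num_classes):
        t, p = (y_true == c), (y_pred == c)
        union = np.logical_or(t, p).sum()
        if union == 0:
            continue  # class absent in both masks: skip it
        ious.append(np.logical_and(t, p).sum() / union)
    return float(np.mean(ious))
```

A perfect prediction drives the loss to zero and mean IoU to one; as a mask misses more of a small object, the focal exponent makes the loss grow faster than the plain Tversky loss would.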


We prepared a simple Jupyter notebook explaining all the steps required for training a segmentation model here. All configuration parameters are specified in `configs/env.yaml`, including model, data, and training settings. We recommend using pipenv with the Pipfile in the repository root to easily create a working virtual environment. The main steps consist of loading the configuration file, the images and their corresponding masks, dividing the dataset into several independent partitions, training a model, and saving it along with the evolution curves of the metric and loss function.
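The partitioning step can be sketched as follows. This is an illustrative NumPy snippet, not the notebook code: the function name and the split fractions are hypothetical (the real values live in `configs/env.yaml`), but it shows the key idea of shuffling once and carving out disjoint train/validation/test index sets:

```python
import numpy as np

def split_dataset(n_samples, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle sample indices and split them into disjoint partitions.

    Fractions and seed are illustrative; in practice they come from
    the configuration file so that splits are reproducible.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    return {
        "test": idx[:n_test],
        "val": idx[n_test:n_test + n_val],
        "train": idx[n_test + n_val:],
    }
```

Fixing the seed matters: the inference notebook must reconstruct exactly the same test partition that the training notebook held out.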


After training the model, we can test it on an independent part of the dataset that was not used during training. For convenience, if you want to skip the training step, we provide an already trained model in our GitHub repository. The inference notebook follows the same steps as the training one for loading and processing images and masks, but now we select the test subset after splitting the data and feed it directly to the model to make predictions. The resulting confusion matrix, accuracy metric, and a few visualizations are shown in Figures 3 and 4.
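The evaluation in fig. 3 boils down to counting the four outcomes per pixel. A minimal NumPy sketch (function names are ours, for illustration) that builds a binary confusion matrix in the same layout as the figure, plus the accuracy derived from it:

```python
import numpy as np

def binary_confusion(y_true, y_pred):
    """Pixel-wise confusion matrix for binary masks.

    Returns a 2x2 array laid out as in fig. 3:
        [[TP, FN],
         [FP, TN]]
    """
    t = y_true.astype(bool).ravel()
    p = y_pred.astype(bool).ravel()
    tp = np.sum(t & p)    # defect predicted correctly
    fn = np.sum(t & ~p)   # defect not recognized
    fp = np.sum(~t & p)   # erroneously predicted defect
    tn = np.sum(~t & ~p)  # no defect, predicted correctly
    return np.array([[tp, fn], [fp, tn]])

def accuracy(cm):
    """Fraction of correctly classified pixels: (TP + TN) / total."""
    return (cm[0, 0] + cm[1, 1]) / cm.sum()
```

For a sorting machine, the off-diagonal cells have asymmetric costs: false negatives let defective nuts through, while false positives waste good product, which is one reason we weight them differently in the Tversky loss.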

Figure 3. Confusion matrices of cashews (top) and peanuts (bottom) in a test subset. Upper left cell – true positives (i.e. defect predicted correctly), lower right cell – true negatives (i.e. no defect predicted correctly), upper right cell – false negatives (i.e. defect not recognized), lower left cell – false positives (i.e. erroneously predicted defect). Accuracy metric is shown at the bottom.
Figure 4. A few images demonstrating segmentation of cashews (red) and peanuts (green) from the background. Ground truth masks and predicted masks are in the left and right panels, respectively.

In this article, we showed how to train and use a deep learning algorithm to efficiently segment and distinguish different types of nuts. Integrated into a sorting machine, this algorithm enables real-time food sorting and automated quality inspection.
