CN114627081A

CN114627081A - Wheat imperfect grain identification method based on Mask R-CNN

Info

Publication number: CN114627081A
Application number: CN202210258861.4A
Authority: CN
Inventors: 申冉; 甄彤; 李智慧; 高辉; 祝玉华; 吴建军; 马海华
Original assignee: Henan University of Technology
Current assignee: Henan University of Technology
Priority date: 2022-03-16
Filing date: 2022-03-16
Publication date: 2022-06-14

Abstract

The invention belongs to the field of imperfect wheat grain detection, and in particular relates to a method for identifying imperfect wheat grains based on Mask R-CNN. It aims to solve the problems that manual detection steps are complicated, imaging equipment is excessively expensive, accuracy is low, and practicability is lacking. The scheme comprises the following steps: S1: collecting imperfect wheat grain image samples and labeling them at the pixel level to obtain labeled image data; S2: constructing a deep learning network model suitable for imperfect wheat grain images; S3: training and fine-tuning the deep learning network model built in step S2, using the labeled images obtained in step S1 as training samples, to obtain an optimal model; S4: using the trained model to detect the test set. The method described by the invention has the advantages of low requirements on image acquisition equipment, high accuracy in detecting imperfect wheat grains, and high speed.

Description

Wheat imperfect grain identification method based on Mask R-CNN

Technical Field

The invention relates to the technical field of detection of imperfect wheat grains, in particular to a method for identifying imperfect wheat grains based on Mask R-CNN.

Background

China is the world's largest producer and consumer of wheat. Wheat must be graded by quality during circulation, and its grade depends on the proportion of imperfect grains. According to the national standard, imperfect wheat grains mainly comprise broken grains, insect-damaged grains, scab grains, sprouted grains, and mildewed grains. At present, the main techniques for detecting imperfect wheat grains are as follows. (1) Manual detection. However, the manual detection steps are complicated, the labor intensity is high, and classification errors are easily caused by the subjectivity of the inspectors and other factors, which leads to economic disputes. (2) Detection using the difference in spectral information between perfect and imperfect wheat grains. However, the imaging equipment is too expensive. (3) Detection by sound waves. However, this method depends too heavily on the sound propagation medium and is strongly affected by noise. (4) Detection by traditional machine learning methods. However, these methods segment adhered wheat grains poorly, require complex feature extraction, and have low accuracy. (5) Detection by deep learning methods. However, the experimental data in current research are subject to manual intervention, so practicability is lacking.

Therefore, designing a method for identifying imperfect wheat grains that is efficient, accurate, and practical, and that places low requirements on both the imaging equipment and the wheat grain images, can promote the informatization of grain storage in grain depots and benefit society.

Disclosure of Invention

The invention provides a method for identifying imperfect wheat grains based on Mask R-CNN, which solves the problems that the existing manual detection steps are complicated, imaging equipment is too expensive, accuracy is low, and practicability is poor.

In order to achieve the purpose, the invention adopts the following technical scheme:

A wheat imperfect grain identification method based on Mask R-CNN comprises the following steps:

S1: collecting imperfect wheat grain image samples and labeling them at the pixel level to obtain labeled image data;

S2: constructing a deep learning network model suitable for imperfect wheat grain images;

S3: training and fine-tuning the deep learning network model built in S2, using the labeled images obtained in S1 as training samples, to obtain an optimal model;

S4: using the trained model to detect the test set and distinguish six grain types: perfect grains, broken grains, insect-damaged grains, scab grains, sprouted grains, and mildewed grains.

Preferably, S1 specifically is: collecting images of perfect and imperfect wheat grains and storing them as picture files; labeling the collected images at the pixel level to obtain mask images representing the wheat regions; splitting the obtained mask images into a training set and a validation set at a 9:1 ratio; and adding gray bars to the training set images to prevent distortion and resizing them so that their dimensions are divisible by 2⁶.

Preferably, in S1, the collected imperfect wheat grain samples come from the Zhengzhou Xinglong National Grain Reserve depot. The data set contains 2000 images and is divided into a training set, a validation set, and a test set at a ratio of 7:2:1, giving 1400 training images, 400 validation images, and 200 test images.
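The 7:2:1 division described above can be illustrated with a minimal sketch. The function name `split_dataset`, the fixed random seed, and the file names are assumptions made for illustration; the source does not say how images are assigned to each subset.

```python
import random

def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into train/validation/test by integer ratios."""
    items = list(items)
    random.Random(seed).shuffle(items)  # assumed: random assignment with a fixed seed
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

# Hypothetical file names standing in for the 2000 collected images.
image_names = [f"wheat_{i:04d}.png" for i in range(2000)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))  # 1400 400 200
```

Fixing the seed keeps the split reproducible, so that the validation and test images stay separate from the training images across runs.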

Preferably, in S2, the network model adopted is a Mask R-CNN model, and the specific process is as follows:

A1: obtaining proposal boxes (suggestion boxes) from the prior boxes;

A2: obtaining prediction boxes from the proposal boxes;

A3: obtaining a semantic segmentation result from the prediction boxes;

the backbone feature extraction network is a ResNet101 model.

Preferably, in S3, the training method of the network model specifically is: during training, the number of categories of the initial deep learning network model is set to 6, the Adam optimizer is used for gradient descent as the learning rate adjustment mode, the number of images per batch in each iteration is set to 2, the number of iterations is set to 20, and the iteration precision is $1 \times 10^{-5}$.

Preferably, in S3, the network model computes the classification loss, the bounding box regression loss, and the segmentation loss during training and sums them as the total loss. The total loss is $L = L_{cls} + L_{box} + L_{mask}$, where $L_{cls}$ is the classification loss, $L_{box}$ is the bounding box regression loss, and $L_{mask}$ is the segmentation loss. The classification loss comprises the RPN classification loss and the Fast R-CNN classification loss; the RPN classification loss uses the multi-class cross entropy loss, and the Fast R-CNN classification loss uses the two-class cross entropy loss. The bounding box regression loss comprises the RPN bounding box regression loss and the Fast R-CNN bounding box regression loss, both of which use the $\mathrm{smooth}_{L1}$ loss function. The segmentation loss uses the binary cross entropy loss.

Preferably, collecting images of perfect and imperfect wheat grains specifically comprises: collecting, in batches, images of the upper and lower surfaces of perfect and imperfect wheat grains with the acquisition equipment, then preprocessing the images, cropping the boundaries, and removing the image backgrounds.

Compared with the prior art, the invention has the beneficial effects that:

(1) Compared with manual detection, the method is fast and its steps are simple;

(2) Compared with traditional wheat imperfect grain segmentation methods, the method reduces over-segmentation and under-segmentation, and has high accuracy and good robustness;

(3) Compared with experimental data in which the grains are arranged in an orderly manner, the experimental data of this method are more practical;

(4) Compared with detection of imperfect wheat grains based on sound waves or spectral images, the method reduces the cost of experimental equipment.

The method described by the invention has the advantages of low requirements on image acquisition equipment, high accuracy in detecting imperfect wheat grains, and high speed.

Drawings

FIG. 1 is a flow chart of a deep convolutional neural network training method for imperfect grains of wheat in an embodiment of the present invention;

FIG. 2 is a sample used to train the wheat instance segmentation network in an embodiment of the present invention;

FIG. 3 is a schematic diagram of a mask generated according to a wheat instance segmentation label in an embodiment of the present invention;

FIG. 4 is a flow chart of wheat identification based on image instance segmentation in an embodiment of the present invention;

FIG. 5 is a block diagram of ResNet101 in an embodiment of the present invention;

FIG. 6 is a flow chart of an RPN module in an embodiment of the present invention;

FIG. 7 is the detection result for imperfect wheat grains in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments.

Examples

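Referring to fig. 1, a method for identifying imperfect wheat grains based on Mask R-CNN comprises the following steps:

S1: collecting imperfect wheat grain image samples and labeling them at the pixel level to obtain labeled image data;

S2: constructing a deep learning network model suitable for imperfect wheat grain images;

S3: training and fine-tuning the deep learning network model built in step S2, using the labeled images obtained in step S1 as training samples, to obtain an optimal model;

S4: using the trained model to detect the test set and distinguish six grain types: perfect grains, broken grains, insect-damaged grains, scab grains, sprouted grains, and mildewed grains.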

Referring to fig. 2 to 3, in this embodiment, the S1 specifically is:

collecting images of perfect grains and imperfect grains of wheat, and storing the images in a picture form;

carrying out pixel level labeling on the collected image to obtain a mask picture representing the wheat region;

splitting the obtained mask images into a training set and a validation set at a 9:1 ratio, adding gray bars to the training set images to prevent distortion, and resizing them so that their dimensions are divisible by 2⁶.
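The gray-bar padding can be sketched as follows. This is a minimal illustration that only pads, without rescaling, so the grain shapes are left undistorted; the function name `letterbox`, the gray value 128, and the centered placement are assumptions, since the source gives only the divisibility by 2⁶ = 64.

```python
import numpy as np

def letterbox(image, multiple=64, fill=128):
    """Pad an H x W (x C) image with gray bars so both sides are divisible by `multiple`."""
    h, w = image.shape[:2]
    new_h = -(-h // multiple) * multiple  # round up to the next multiple
    new_w = -(-w // multiple) * multiple
    top = (new_h - h) // 2
    left = (new_w - w) // 2
    canvas = np.full((new_h, new_w) + image.shape[2:], fill, dtype=image.dtype)
    canvas[top:top + h, left:left + w] = image  # original pixels stay unscaled
    return canvas, (top, left)

image = np.zeros((300, 500, 3), dtype=np.uint8)
padded, offset = letterbox(image)
print(padded.shape, offset)  # (320, 512, 3) (10, 6)
```

The returned offset lets the same shift be applied to the mask image (padded with 0 rather than gray) so that labels stay aligned with the pixels.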

Referring to fig. 4, in this embodiment, step S2 specifically includes:

Obtaining proposal boxes from the prior boxes:

Gray bars are added to the original image to prevent distortion, and the padded image is passed into the backbone feature extraction network (ResNet101) to obtain several shared feature layers. This is equivalent to dividing the image into grids of different sizes, with several prior boxes of different sizes in each grid cell. The region proposal network (RPN) is then used to obtain the adjustment parameters of the prior boxes and whether each prior box contains an object, which yields the proposal boxes.

Obtaining prediction boxes from the proposal boxes:

The proposal boxes are used to crop the feature layers, and the cropped parts of the feature layers correspond to different regions of the original image. The cropped regions are passed to the ROIAlign layer to be brought to a uniform size, and are then fed into the classification and regression network, which judges whether the cropped content contains a target and adjusts the proposal boxes to obtain the prediction boxes.

Obtaining the semantic segmentation result from the prediction boxes:

The prediction boxes are used to crop the feature layers, the cropped content is passed to the ROIAlign layer to be brought to a uniform size, and the result is passed into the semantic segmentation network to obtain the semantic segmentation result.
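The size unification performed by the ROIAlign layer can be sketched in simplified form. This version takes one bilinear sample at the center of each output bin on a single-channel feature map; the function names, the single sample per bin, and the coordinate convention (integer coordinates at pixel centers) are assumptions made for illustration, not the configuration used in the embodiment.

```python
import numpy as np

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D feature map at the continuous location (y, x)."""
    h, w = feature.shape
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0, x0] * (1 - dx) + feature[y0, x1] * dx
    bottom = feature[y1, x0] * (1 - dx) + feature[y1, x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feature, box, out_size=7):
    """Pool the region box = (x1, y1, x2, y2) into an out_size x out_size grid."""
    x1, y1, x2, y2 = box
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    out = np.empty((out_size, out_size), dtype=float)
    for i in range(out_size):
        for j in range(out_size):
            # One sample at the center of each bin, with no coordinate rounding.
            out[i, j] = bilinear_sample(feature, y1 + (i + 0.5) * bin_h,
                                        x1 + (j + 0.5) * bin_w)
    return out

feature = np.tile(np.arange(8.0), (8, 1))  # feature[y, x] == x
pooled = roi_align(feature, (1.0, 1.0, 5.0, 5.0), out_size=2)
print(pooled)
```

Because box coordinates are never rounded to the feature grid, regions of any size map to the same fixed output size without the misalignment that rounding would introduce.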

Referring to fig. 5, in this embodiment, the wheat imperfect grain instance segmentation model is a Mask R-CNN model, and the backbone feature extraction network is a ResNet101 model. ResNet101 has two basic blocks, ConvBlock and IdentityBlock: ConvBlock changes the dimensions of the network, and IdentityBlock deepens the network.
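How the two block types make up ResNet101 can be sketched as a simple layout plan. The stage sizes 3, 4, 23, 3 are those of the standard ResNet101; the names `STAGE_BLOCKS` and `block_plan` are illustrative, and the depth count includes the final classifier layer, which a detection backbone normally omits.

```python
# Number of bottleneck blocks in the four residual stages of the standard ResNet101.
STAGE_BLOCKS = [3, 4, 23, 3]

def block_plan(stage_blocks):
    """Return the block sequence of each stage: one ConvBlock, then IdentityBlocks."""
    return [["ConvBlock"] + ["IdentityBlock"] * (n - 1) for n in stage_blocks]

plan = block_plan(STAGE_BLOCKS)
# Each bottleneck block holds 3 convolutions; add the stem convolution
# and the final classifier layer to reach the nominal depth.
depth = 3 * sum(STAGE_BLOCKS) + 2
print(depth)  # 101
```

Each stage opens with a ConvBlock because the channel count changes there, and the remaining IdentityBlocks add depth without changing the dimensions.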

Referring to fig. 6, in this embodiment, the region proposal network (RPN) is similar to that in Faster R-CNN and is used to obtain the prior box adjustment parameters and to determine whether an object exists inside each prior box. First, a 3 × 3 convolution with 512 channels is applied once to the feature map output by the backbone feature extraction network; then a convolution with 4k channels and a convolution with 2k channels are each applied once, where k is the number of anchor boxes, taken as 3 in this example. The result of the 4k convolution is used to adjust the prior boxes to obtain new boxes, and the 2k convolution judges whether an object exists in each new box. Non-maximum suppression is then used to screen the new boxes that contain an object, yielding the proposal boxes.
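The two operations that follow the RPN convolutions, adjusting the prior boxes with four parameters each and screening the result with non-maximum suppression, can be sketched as follows. The (dx, dy, dw, dh) parameterization is the one commonly used in Faster R-CNN and is assumed here; the function names and the IoU threshold are illustrative.

```python
import numpy as np

def decode_boxes(priors, deltas):
    """Apply (dx, dy, dw, dh) adjustments to prior boxes given as (x1, y1, x2, y2)."""
    w = priors[:, 2] - priors[:, 0]
    h = priors[:, 3] - priors[:, 1]
    cx = priors[:, 0] + 0.5 * w + deltas[:, 0] * w
    cy = priors[:, 1] + 0.5 * h + deltas[:, 1] * h
    w = w * np.exp(deltas[:, 2])
    h = h * np.exp(deltas[:, 3])
    return np.stack([cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h], axis=1)

def nms(boxes, scores, iou_threshold=0.7):
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    order = np.argsort(scores)[::-1]  # highest objectness score first
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]  # drop boxes overlapping the kept one
    return keep

priors = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
boxes = decode_boxes(priors, np.zeros((3, 4)))  # zero adjustments leave priors unchanged
print(nms(boxes, np.array([0.9, 0.8, 0.7]), iou_threshold=0.5))  # [0, 2]
```

In the example the first two boxes overlap with an IoU of about 0.68, so the lower-scoring one is suppressed, while the third box does not overlap and is kept.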

In this embodiment, the S3 specifically includes:

In this embodiment, the wheat imperfect grain instance segmentation model is pre-trained on ImageNet, COCO, or any equivalent open-source data set, and is then trained on the training set from step S1, starting from the pre-trained model. During training, different hyperparameter settings and optimizers are tried to find the best hyperparameter combination, so that the model has the lowest loss on the validation set.

In this embodiment, the number of categories of the initial deep learning network model is set to 6, the Adam optimizer is used for gradient descent as the learning rate adjustment mode, the number of images per batch in each iteration is set to 2, the number of iterations is set to 20, and the iteration precision is $1 \times 10^{-5}$.
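The training settings listed above can be gathered in a small configuration object. This is only a sketch: the class and field names are hypothetical, the values are those stated in the embodiment, and the 1400-image training set size is taken from the 7:2:1 division described in S1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    num_classes: int = 6                 # perfect grains plus five imperfect types
    optimizer: str = "adam"
    batch_size: int = 2                  # images per batch in each iteration
    num_iterations: int = 20
    iteration_precision: float = 1e-5

    def batches_per_pass(self, num_train_images: int) -> int:
        """Number of batches needed to see every training image once."""
        return -(-num_train_images // self.batch_size)  # ceiling division

config = TrainConfig()
print(config.batches_per_pass(1400))  # 700
```

Keeping the settings in one immutable object makes it easier to compare the hyperparameter combinations that are tried during training.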

In this embodiment, the network model calculates classification loss, bounding box regression loss, and segmentation loss as total loss during training, and then uses Adam method for optimization training.

The total loss is calculated as $L = L_{cls} + L_{box} + L_{mask}$. The classification loss comprises the RPN classification loss and the Fast R-CNN classification loss; the RPN classification loss uses the multi-class cross entropy loss, and the Fast R-CNN classification loss uses the two-class cross entropy loss. The multi-class cross entropy loss is $L_{cls}(p, u) = -\log p_u$, and the two-class cross entropy loss is $L_{cls} = -\log(p_i)$,

where $p$ is the softmax probability distribution predicted by the classifier, $u$ is the true class label, and $p_i$ is the probability that the $i$-th anchor is predicted to be a target.
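The two cross entropy expressions can be checked numerically with a minimal sketch. The function names and the small clamping constant are assumptions; the formulas are the ones given above for a single sample.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1D array of class scores."""
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

def multiclass_cross_entropy(logits, u):
    """-log p_u, where p is the softmax distribution and u is the true class index."""
    return float(-np.log(softmax(logits)[u]))

def two_class_cross_entropy(p_i, eps=1e-12):
    """-log(p_i), where p_i is the predicted probability that anchor i is a target."""
    return float(-np.log(max(p_i, eps)))  # clamp to avoid log(0)

uniform_loss = multiclass_cross_entropy(np.zeros(6), u=2)  # six equally likely classes
anchor_loss = two_class_cross_entropy(0.5)
print(round(uniform_loss, 4), round(anchor_loss, 4))  # 1.7918 0.6931
```

With six equally likely classes the multi-class loss equals log 6, a useful reference value for judging whether a classifier has learned anything beyond chance.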

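The bounding box regression loss comprises the RPN bounding box regression loss and the Mask R-CNN bounding box regression loss, and uses the $\mathrm{smooth}_{L1}$ loss function:

$$L_{box}(t^u, v) = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L1}(t_i^u - v_i)$$

where $t^u = (t_x^u, t_y^u, t_w^u, t_h^u)$ is the regression parameter predicted by the bounding box regressor for the corresponding class $u$, and $v = (v_x, v_y, v_w, v_h)$ is the bounding box regression parameter of the real target.

The $\mathrm{smooth}_{L1}$ loss function is:

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$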
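A minimal numerical sketch of the smooth L1 bounding box loss follows, using its standard form (0.5x² when |x| < 1, and |x| − 0.5 otherwise) summed over the four box parameters. The function names are illustrative, and the example values are made up.

```python
import numpy as np

def smooth_l1(x):
    """Elementwise smooth L1: quadratic near zero, linear for large errors."""
    x = np.abs(x)
    return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5)

def box_regression_loss(t_u, v):
    """Sum of smooth L1 over the (x, y, w, h) differences between prediction and target."""
    diff = np.asarray(t_u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.sum(smooth_l1(diff)))

# Made-up predicted and target regression parameters for one box.
loss = box_regression_loss([0.5, 0.0, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0])
print(loss)  # 1.625
```

The linear branch keeps the gradient bounded for badly placed boxes, which makes training less sensitive to outliers than a plain squared error.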

The segmentation loss uses the binary cross entropy loss:

$$L_{mask}(p, u) = -\left(p \log(u) + (1 - p) \log(1 - u)\right)$$

where $p$ is the mask label and $u$ is the sigmoid output predicted by the segmentation model.
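The mask loss can be sketched per pixel as follows. Averaging over the pixels of the mask and clipping the sigmoid output away from 0 and 1 are assumptions added for a usable example; the source gives only the per-pixel expression.

```python
import numpy as np

def mask_loss(p, u, eps=1e-7):
    """Mean binary cross entropy between mask labels p (0 or 1) and sigmoid outputs u."""
    p = np.asarray(p, dtype=float)
    u = np.clip(np.asarray(u, dtype=float), eps, 1 - eps)  # avoid log(0)
    per_pixel = -(p * np.log(u) + (1 - p) * np.log(1 - u))
    return float(np.mean(per_pixel))

labels = [[1, 0], [1, 0]]
uncertain = [[0.5, 0.5], [0.5, 0.5]]
confident = [[0.99, 0.01], [0.99, 0.01]]
print(round(mask_loss(labels, uncertain), 4))  # 0.6931
print(mask_loss(labels, confident) < mask_loss(labels, uncertain))  # True
```

An output of 0.5 everywhere gives a loss of log 2, and the loss falls as the predicted mask agrees more confidently with the label.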

Referring to fig. 7, in this embodiment, step S4 specifically is:

The 200 images of the test set are tested with the trained model, and the final AP of the network model on the test set is 0.73.
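For readers unfamiliar with the AP metric, the sketch below computes average precision as the area under an interpolated precision-recall curve for ranked detections. The source does not state which AP definition or IoU threshold was used, so this is one common variant, and the scores and match flags are made-up values.

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for ranked detections."""
    order = np.argsort(scores)[::-1]  # rank detections by descending confidence
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / np.arange(1, len(tp) + 1)
    recall = cum_tp / num_ground_truth
    # Interpolate: precision at each rank is the best precision at any later rank.
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    prev_recall = np.concatenate([[0.0], recall[:-1]])
    return float(np.sum((recall - prev_recall) * precision))

# Three detections, two ground-truth grains; the second detection is a false positive.
ap = average_precision([0.9, 0.8, 0.7], [1, 0, 1], num_ground_truth=2)
print(round(ap, 4))  # 0.8333
```

A detection is normally counted as a true positive when its mask or box overlaps an unmatched ground-truth grain of the same class above a chosen IoU threshold.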

In this embodiment, collecting images of perfect and imperfect wheat grains specifically includes: collecting, in batches, images of the upper and lower surfaces of perfect and imperfect wheat grains with the acquisition equipment, then preprocessing the images, cropping the boundaries, and removing the image backgrounds.

The above description covers only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art within the technical scope disclosed by the present invention, according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims

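1. A wheat imperfect grain identification method based on Mask R-CNN, characterized by comprising the following steps:

S1: collecting imperfect wheat grain image samples and labeling them at the pixel level to obtain labeled image data;

S2: constructing a deep learning network model suitable for imperfect wheat grain images;

S3: training and fine-tuning the deep learning network model built in S2, using the labeled images obtained in S1 as training samples, to obtain an optimal model;

S4: using the trained model to detect the test set and distinguish six grain types: perfect grains, broken grains, insect-damaged grains, scab grains, sprouted grains, and mildewed grains.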

2. The Mask R-CNN-based wheat imperfect grain identification method according to claim 1, wherein S1 specifically is: collecting images of perfect and imperfect wheat grains and storing them as picture files; labeling the collected images at the pixel level to obtain mask images representing the wheat regions; splitting the obtained mask images into a training set and a validation set at a 9:1 ratio; and adding gray bars to the training set images to prevent distortion and resizing them so that their dimensions are divisible by 2⁶.

3. The method for identifying imperfect wheat grains based on Mask R-CNN of claim 2, wherein in S1, the collected imperfect wheat grain sample data set contains 2000 images and is divided into a training set, a validation set, and a test set at a ratio of 7:2:1, giving 1400 training images, 400 validation images, and 200 test images.

4. The method for identifying imperfect wheat grains based on Mask R-CNN according to claim 1, wherein in S2, a Mask R-CNN model is adopted as the imperfect wheat grain instance segmentation model, and a ResNet101 model is adopted as the backbone feature extraction network.

5. The method for identifying imperfect wheat grains based on Mask R-CNN as claimed in claim 1, wherein in S3, during training, the number of categories of the initial deep learning network model is set to 6, the Adam optimizer is used for gradient descent as the learning rate adjustment mode, the number of images per batch in each iteration is set to 2, the number of iterations is set to 20, and the iteration precision is $1 \times 10^{-5}$.

6. The Mask R-CNN-based wheat imperfect grain identification method of claim 1, wherein in S3, the network model computes the classification loss, the bounding box regression loss, and the segmentation loss during training and sums them as the total loss; the classification loss comprises the RPN classification loss and the Fast R-CNN classification loss, the RPN classification loss uses the multi-class cross entropy loss, and the Fast R-CNN classification loss uses the two-class cross entropy loss; the bounding box regression loss comprises the RPN bounding box regression loss and the Fast R-CNN bounding box regression loss, both of which use the $\mathrm{smooth}_{L1}$ loss function; the segmentation loss uses the binary cross entropy loss.

7. The method for identifying imperfect wheat grains based on Mask R-CNN as claimed in claim 2, wherein collecting images of perfect and imperfect wheat grains comprises: collecting, in batches, images of the upper and lower surfaces of perfect and imperfect wheat grains with the acquisition equipment, then preprocessing the images, cropping the boundaries, and removing the image backgrounds.