CN113158956B

CN113158956B - Garbage detection and identification method based on improved YOLOv5 network

Info

Publication number: CN113158956B
Application number: CN202110481897.4A
Authority: CN
Inventors: 程自帅; 林志赟; 范钰捷; 王博; 韩志敏; 钟深友
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2024-06-14
Anticipated expiration: 2041-04-30
Also published as: CN113158956A

Abstract

The invention discloses a garbage detection and identification method based on an improved YOLOv5 network. First, a garbage data set covering 36 categories is built and labeled; after statistical analysis, categories with too few samples are supplemented, and the whole data set is expanded by data enhancement. Next, a target detection network is built on YOLOv5: an attention mechanism is introduced to improve the backbone network, and a new branch for small-size targets is added to improve the PANet detection head. A ghost structure is introduced into the BottleneckCSP modules of the network, and depthwise separable convolution is used in the high-dimensional layers, which reduces the number of network parameters and yields the improved YOLOv5 network. Finally, a garbage picture is preprocessed and fed to the improved YOLOv5 network, which outputs the detection and identification result. The method can detect and identify multiple pieces of garbage in the same frame and covers up to 36 categories; compared with the original YOLOv5 network, it improves detection precision, reduces the number of network parameters, and has practical application value.

Description

Garbage detection and identification method based on improved YOLOv5 network

Technical Field

The invention belongs to the technical field of deep learning target detection, and particularly relates to a garbage detection and identification method based on an improved YOLOv5 network.

Background

With rising living standards and accelerating urbanization, the amount of household garbage produced in China keeps growing. In 2019 the average output reached 0.36 tons per resident, and this figure is expected to rise further, posing great challenges for environmental protection and garbage recycling. The state and society currently call on residents to sort their garbage so that waste is turned into value and misplaced resources are reused. At present, however, garbage disposal is still supervised mainly by volunteers, or different types of garbage cans are simply set out for collection; sorting after collection mostly relies on manual assembly-line work, which wastes a great deal of manpower and material resources. Meanwhile, mainstream garbage classification algorithms can output only a single garbage category for an input image and cannot perform multi-category detection and recognition when several kinds of garbage appear together, which greatly hinders intelligent supervision at garbage disposal terminals and intelligent sorting by mechanical devices. The limitations of current algorithms, together with the demand for higher precision, more categories and faster processing in intelligent monitoring systems at disposal terminals and in intelligent sorting equipment during recycling, therefore call for a more complete detection and identification method that can identify multiple categories of garbage simultaneously. To this end, a garbage detection data set covering multiple categories is established, the YOLOv5 network is improved, and a detection and identification algorithm for multiple kinds of garbage is provided, which better solves the garbage detection and identification problem and has broad application scenarios and market value.

Disclosure of Invention

In view of the above, the invention provides a garbage detection and identification method based on an improved YOLOv5 network. The method detects and identifies multiple kinds of garbage in a picture or video stream and gives their positions at the same time, improves precision to a certain extent, reduces the computational complexity of the network structure, is convenient to deploy on terminal equipment, and can be used in intelligent garbage treatment.

To achieve the above purpose, the technical scheme of the invention is as follows: a garbage detection and identification method based on an improved YOLOv5 network, comprising the following steps:

S1: garbage images covering multiple categories are collected and labeled; after statistical analysis and expansion of the data set, data enhancement is performed to establish a garbage detection data set covering multiple categories.

S2: the YOLOv5 network is selected as the reference network. The structure of the detection head is improved by adding a detection head branch for small-size targets, and an attention mechanism is introduced into the backbone network so that the network focuses more on regions containing garbage, reducing the miss rate. In addition, the standard bottleneck structures in the network are replaced with ghost-based bottleneck structures, and depthwise separable convolution is used in network layers whose channel dimension is greater than 512, reducing the number of network parameters. The improved YOLOv5 network constructed in this way serves as the multi-category garbage detection network; the data set of step S1 is preprocessed to the standard size and then used to train this network, yielding a multi-category garbage detection inference model.

S3: the garbage picture to be detected and identified is preprocessed to the standard size and input into the trained multi-category garbage detection inference model, which detects the categories and positions of all garbage in the image.

Further, in the step S1:

Step S11: all required garbage categories are determined, and a crawler script is written to collect public garbage picture data of the corresponding categories from the network; the collected data set is labeled with the labelImg tool, and the labeling information is stored in xml format, mainly comprising the position coordinates, the length and width, and the category of each piece of garbage in a picture;

Step S12: after labeling is finished, a garbage count statistics script, a labeling information modification script and a labeling position statistics script are written in Python for data analysis; for garbage categories with fewer than 600 pictures, garbage pictures are shot manually under different illumination, environments and shooting angles, labeled, and added to the data set;

Step S13: the expanded data set is further supplemented and enlarged as a whole using general-purpose data enhancement for target detection, namely pixel content transformation and spatial geometric transformation.
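The count statistics of step S12 can be sketched as follows. This is a minimal illustration, assuming the xml labels follow the Pascal VOC layout that labelImg writes (an `object` element with a `name` child); the class names and the sample annotation are hypothetical, not taken from the patent's data set.

```python
import xml.etree.ElementTree as ET
from collections import Counter

MIN_IMAGES_PER_CLASS = 600  # threshold stated in step S12


def classes_in_annotation(xml_text):
    """Return the set of class names labeled in one Pascal VOC style annotation."""
    root = ET.fromstring(xml_text)
    return {obj.findtext("name") for obj in root.iter("object")}


def images_per_class(annotations):
    """Count, for each class, how many annotated pictures contain it."""
    counts = Counter()
    for xml_text in annotations:
        counts.update(classes_in_annotation(xml_text))
    return counts


def underrepresented(counts, all_classes, threshold=MIN_IMAGES_PER_CLASS):
    """List classes whose picture count is below the threshold, sorted by name."""
    return sorted(c for c in all_classes if counts.get(c, 0) < threshold)


# Hand-made sample annotation (hypothetical, for illustration only).
SAMPLE = """<annotation>
  <size><width>640</width><height>480</height></size>
  <object><name>toothpick</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>30</ymax></bndbox>
  </object>
  <object><name>plastic_bottle</name>
    <bndbox><xmin>100</xmin><ymin>100</ymin><xmax>300</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""

counts = images_per_class([SAMPLE, SAMPLE])
# A threshold of 2 is used here only so the tiny sample produces a result.
flagged = underrepresented(counts, ["toothpick", "plastic_bottle", "dry_cell"], threshold=2)
print(flagged)  # ['dry_cell']
```

Categories returned by `underrepresented` would be the ones to supplement by manual shooting.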

Further, in the step S2: the small-size target detection head branch is used for detecting small target objects, including toothpicks, eggshells and the like.
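Why an extra high-resolution head helps with small objects can be illustrated with simple grid arithmetic. The sketch below assumes the 640 x 640 input size used in the preprocessing step and assumes head strides of 8, 16 and 32 for the three original heads plus stride 4 for the added branch; the 12-pixel object size is a hypothetical example.

```python
INPUT_SIZE = 640  # input resolution used in the preprocessing step

# Assumption: strides 8, 16, 32 for the original heads, stride 4 for the added one.
STRIDES = (4, 8, 16, 32)


def head_grid_sizes(input_size=INPUT_SIZE, strides=STRIDES):
    """Side length of each detection head's feature map, in grid cells."""
    return [input_size // s for s in strides]


def cells_spanned(object_pixels, stride):
    """Approximate number of grid cells covered by an object's side length."""
    return object_pixels / stride


print(head_grid_sizes())  # [160, 80, 40, 20]

# A hypothetical 12-pixel-wide toothpick spans 3 cells on the stride-4 grid
# but only 1.5 cells on the stride-8 grid.
print(cells_spanned(12, 4), cells_spanned(12, 8))  # 3.0 1.5
```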

Further, in the step S2:

Step S21: the base network YOLOv5 is selected, which mainly comprises a backbone network and a PANet detection head network. A detection head of 160 x 160 pixels is added to the original PANet detection heads, giving a four-head structure: one branch is led out from the 160 x 160 pixel feature layer of the backbone network; a second 160 x 160 pixel branch is obtained by upsampling the 80 x 80 pixel feature map, which is the highest-resolution feature map in the PANet detection head network; the two branches are merged, fed into a bottleneck structure that integrates the feature maps, and then passed through a 1 x 1 convolution to produce the output. An attention mechanism is introduced in the last layer of the backbone network while keeping the network dimensions unchanged. The standard bottleneck structures in the network are replaced with ghost-based bottleneck structures. For an original standard bottleneck with stride 1, the input feature map is split into two branches: the first passes in sequence through a ghost module, a BN layer, a ReLU activation function, a ghost module and a BN layer; the second passes through a depthwise separable convolution layer; the two branch outputs are merged to form the overall output. For an original standard bottleneck with stride 2, the input feature map is likewise split into two branches: the first passes in sequence through a ghost module, a BN layer, a ReLU activation function, a depthwise separable convolution module, a ghost module and a BN layer; the second passes in sequence through a depthwise separable convolution layer and a standard convolution layer; the two branch outputs are merged to form the overall output. Ordinary convolution modules whose channel dimension is greater than 512 are replaced with depthwise separable convolution modules. The modified structures and modules are stacked in the order of the original YOLOv5 network to obtain the improved YOLOv5 network;

Step S22: the garbage detection data set obtained in step S1 is preprocessed: all labeling files are converted from xml format to txt format with normalized coordinates, after which the pictures are uniformly scaled to 640 x 640 pixels; 80% of the data is taken as the training set and 20% as the verification set, which completes all preprocessing before picture input;

Step S23: a virtual environment for model training is built on a GPU server; once it is ready, the training set is input into the improved YOLOv5 network to train the target detection model, and after training an inference model that detects and identifies multiple kinds of garbage simultaneously is obtained.
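The label conversion and data split of step S22 might look like the sketch below. It assumes Pascal VOC style xml input and the common YOLO txt convention of one line per object (`class_index x_center y_center width height`, all normalized to the image size); the function names, the class list and the sample annotation are hypothetical.

```python
import random
import xml.etree.ElementTree as ET


def voc_to_yolo_lines(xml_text, class_names):
    """Convert one Pascal VOC style annotation into normalized YOLO txt lines."""
    root = ET.fromstring(xml_text)
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        cls = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Box center and size, each divided by the image width or height.
        xc = (xmin + xmax) / 2 / img_w
        yc = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines


def split_train_val(items, train_fraction=0.8, seed=0):
    """Shuffle reproducibly, then split into training and verification sets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


# Hand-made sample annotation (hypothetical, for illustration only).
SAMPLE = """<annotation>
  <size><width>640</width><height>480</height></size>
  <object><name>pop_can</name>
    <bndbox><xmin>160</xmin><ymin>120</ymin><xmax>480</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""

yolo_lines = voc_to_yolo_lines(SAMPLE, ["pop_can", "toothpick"])
print(yolo_lines)  # ['0 0.500000 0.500000 0.500000 0.500000']

train, val = split_train_val(range(10))
print(len(train), len(val))  # 8 2
```

Because the coordinates are normalized, the txt labels stay valid when the pictures are later scaled to 640 x 640 pixels.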

Further, in the step S3: the picture or video stream to be detected is input into the inference model obtained after training in step S2; the output file or output video stream then contains the categories of all garbage, with their coordinate positions framed, thereby realizing simultaneous identification, detection and positioning of multiple kinds of garbage.

The beneficial effects of the invention are as follows: the number of detectable and identifiable garbage categories is increased to 36; unlike a classification network, the method can detect and identify multiple garbage categories in the same picture or video stream and give their coordinates in the image; detection accuracy is improved and the computational complexity of the model is reduced, so the method can be used in intelligent garbage treatment.

Drawings

FIG. 1 is a flow chart of a garbage detection and identification method based on an improved YOLOv5 network according to one embodiment of the present invention;

FIG. 2 is a schematic diagram of the feature map size changes in the improved four-head PANet of a garbage detection and identification method based on an improved YOLOv5 network according to one embodiment of the present invention;

FIG. 3 is a schematic diagram of the attention mechanism of a garbage detection and identification method based on an improved YOLOv5 network according to one embodiment of the present invention;

FIG. 4 is a schematic diagram of the ghost-based bottleneck structure of a garbage detection and identification method based on an improved YOLOv5 network according to one embodiment of the present invention;

FIG. 5 is a diagram of the improved YOLOv5 network structure of a garbage detection and identification method based on an improved YOLOv5 network according to one embodiment of the present invention.

Detailed Description

As shown in FIG. 1, the garbage detection and identification method based on the improved YOLOv5 network provided by the invention comprises the following steps:

S1: garbage images covering multiple categories are collected and labeled. The data set is obtained by consulting public databases on the network and by writing a crawler script; the acquired images are labeled with the labelImg tool, the main labeling information being the category, size and position of each target in the image, which yields a complete initial data set. After statistical analysis and expansion of the data set, data enhancement is performed to establish a garbage detection data set containing 36 categories. The specific steps are as follows:

Step S11: all required garbage categories are determined, and a crawler script is written to collect public garbage picture data of the corresponding categories from the network; the collected data set is labeled with the labelImg tool, and the labeling information is stored in xml format, mainly comprising the position coordinates, the length and width, and the category of each piece of garbage in a picture. The 36 garbage categories are: charger, bag, toiletries, plastic toy, plastic container, plastic clothes hanger, glassware, metal container, express paper bag, plug wire, old clothes, pop can, pillow, plush toy, shoes, chopping board, paper box, seasoning bottle, wine bottle, metal food can, metal kitchenware, pot, plastic bottle, book paper, big bone, dry cell, ointment, expired medicine, disposable snack box, stained plastic, cigarette butt, toothpick, flowerpot, ceramic container, chopsticks and stained paper.

Step S12: data analysis is carried out on the initial data set. A garbage count statistics script, a labeling information modification script and a labeling position statistics script are written in Python; the analysis results are checked, and pictures that are blurred, mislabeled or oversized are screened out. For garbage categories with fewer than 600 pictures, additional pictures are shot manually under different angles, illumination, environments and object forms, and the newly added data is labeled. All data sets are then merged into a preliminary expanded data set containing 36 categories, 13591 pictures and 21657 labeled targets.

Step S13: data enhancement is applied to the preliminary expanded data set to increase the number of samples available to the model, since a larger amount of data helps improve the model's generalization ability. The enhancement mainly uses general-purpose means for target detection, namely pixel content transformation and spatial geometric transformation. This finally yields a complete data set for model training, testing and verification.
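The two families of enhancement named in step S13 differ in how they treat the labels: a pixel content transformation leaves the bounding boxes untouched, while a spatial geometric transformation must move them with the image. The sketch below illustrates this with a brightness change and a horizontal flip on a tiny grayscale image stored as nested lists; the specific transformations and values are assumptions chosen for illustration, since the text does not list which ones are used.

```python
def adjust_brightness(image, factor):
    """Pixel content transformation: scale 8-bit values, clamped to 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def hflip_image(image):
    """Spatial geometric transformation: mirror each row left to right."""
    return [row[::-1] for row in image]


def hflip_box(box, img_w):
    """Mirror a pixel box (xmin, ymin, xmax, ymax) to match a flipped image."""
    xmin, ymin, xmax, ymax = box
    return (img_w - xmax, ymin, img_w - xmin, ymax)


image = [[0, 100, 200],
         [50, 150, 250]]

brighter = adjust_brightness(image, 1.2)
print(brighter)  # [[0, 120, 240], [60, 180, 255]]

flipped = hflip_image(image)
print(flipped)  # [[200, 100, 0], [250, 150, 50]]

# The label must follow the flip: a box near the left edge of a
# 640-pixel-wide picture ends up near the right edge.
print(hflip_box((10, 20, 50, 30), img_w=640))  # (590, 20, 630, 30)
```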

S2: the YOLOv5 network is selected as the reference network. The structure of the detection head is improved by adding a small-size target detection head branch for detecting small objects such as toothpicks and eggshells; the modified detection head structure and the feature map size changes are shown in FIG. 2. An attention mechanism is introduced into the backbone network so that the network focuses more on regions containing garbage, reducing the miss rate; the attention mechanism structure is shown in FIG. 3. In addition, the standard bottleneck structures in the network are replaced with ghost-based bottleneck structures. For an original standard bottleneck with stride 1, the input feature map is split into two branches: the first passes in sequence through a ghost module, a BN layer, a ReLU activation function, a ghost module and a BN layer; the second passes through a depthwise separable convolution layer; the two branch outputs are merged to form the overall output. For an original standard bottleneck with stride 2, the input feature map is likewise split into two branches: the first passes in sequence through a ghost module, a BN layer, a ReLU activation function, a depthwise separable convolution module, a ghost module and a BN layer; the second passes through a depthwise separable convolution layer; the two branch outputs are merged to form the overall output. The ghost-based bottleneck structure is shown in FIG. 4. Depthwise separable convolution is used in network layers whose channel dimension is greater than 512, reducing the number of network parameters. The improved YOLOv5 network constructed in this way serves as the multi-category garbage detection network, whose overall structure is shown in FIG. 5. The data set of step S1 is preprocessed to the standard size and then used to train this network, yielding a multi-category garbage detection inference model. The specific steps are as follows:

Step S21: the base network YOLOv5 is selected, which mainly comprises a backbone network and a PANet detection head network. A detection head of 160 x 160 pixels is added to the original PANet detection heads, giving a four-head structure: one branch is led out from the 160 x 160 pixel feature layer of the backbone network; a second 160 x 160 pixel branch is obtained by upsampling the 80 x 80 pixel feature map, which is the highest-resolution feature map in the PANet detection head network; the two branches are merged, fed into a bottleneck structure that integrates the feature maps, and then passed through a 1 x 1 convolution to produce the output. An attention mechanism is introduced in the last layer of the backbone network while keeping the network dimensions unchanged. The standard bottleneck structures in the network are replaced with ghost-based bottleneck structures. Ordinary convolution modules whose channel dimension is greater than 512 are replaced with depthwise separable convolution modules. The modified structures and modules are stacked in the order of the original YOLOv5 network to obtain the improved YOLOv5 network;

Step S23: a virtual environment container for model training is built on a GPU server using Docker, the necessary dependency libraries are loaded, and the framework code is written in PyTorch; the data set is uploaded and loaded into the improved YOLOv5 network to train the target detection model, and after training an inference model that detects and identifies multiple kinds of garbage simultaneously is obtained.
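The parameter saving from using depthwise separable convolution in the wide layers can be estimated with a short calculation. The sketch below counts weights only (no bias or BN parameters) and assumes a 3 x 3 kernel with 1024 input and output channels as a hypothetical example of a layer above the 512-channel threshold; the actual layer shapes of the improved network are not given in the text.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights of a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape above the 512-channel threshold.
c_in, c_out, k = 1024, 1024, 3

standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)

print(standard)                        # 9437184
print(separable)                       # 1057792
print(f"{standard / separable:.2f}x")  # 8.92x
```

The ratio approaches k * k as the channel count grows, which is why the replacement pays off most in layers with many channels.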

S3: the garbage picture to be detected and identified is preprocessed to the standard size and input into the trained multi-category garbage detection inference model, which detects the categories and positions of all garbage in the image. Specifically, the picture or video stream to be detected is input into the inference model obtained after training in step S2; the output file or output video stream then contains the categories of all garbage, with their coordinate positions framed, rather than a single category, thereby realizing simultaneous identification, detection and positioning of multiple kinds of garbage.

It is apparent that the above examples are given by way of illustration only and do not limit the embodiments. Other variations or modifications based on the above teachings will be apparent to those of ordinary skill in the art. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom remain within the scope of protection of the invention.

Claims

1. A garbage detection and identification method based on an improved YOLOv5 network, characterized by comprising the following steps:

S1: collecting and labeling garbage images covering multiple categories, carrying out data enhancement after statistical analysis and expansion of the data set, and establishing a garbage detection data set covering multiple categories;

S2: the YOLOv5 network is selected as the reference network; the structure of the detection head is improved by adding a small-size target detection head branch, and an attention mechanism is introduced into the backbone network so that the network focuses more on regions containing garbage, reducing the miss rate; in addition, the standard bottleneck structures in the network are replaced with ghost-based bottleneck structures, and depthwise separable convolution is used in network layers whose channel dimension is greater than 512, reducing the number of network parameters; the improved YOLOv5 network constructed in this way serves as the multi-category garbage detection network; the data set is preprocessed to the standard size, and after GPU training a multi-category garbage detection inference model is obtained; the specific implementation is as follows:

Step S23: setting up a virtual environment for model training on a GPU server; once it is ready, inputting the training set into the improved YOLOv5 network to train the target detection model, and after training obtaining an inference model that detects and identifies multiple kinds of garbage simultaneously;

2. The garbage detection and identification method based on the improved YOLOv5 network according to claim 1, wherein in the step S1:

3. The garbage detection and identification method based on the improved YOLOv5 network according to claim 1, wherein in the step S2: the small-size target detection head branch is used for detecting small target objects, including toothpicks, eggshells and the like.

4. The garbage detection and identification method based on the improved YOLOv5 network according to claim 1, wherein in the step S3: the picture or video stream to be detected is input into the inference model obtained after training in step S2; the output file or output video stream then contains the categories of all garbage, with their coordinate positions framed, thereby realizing simultaneous identification, detection and positioning of multiple kinds of garbage.