CN116363138B - Lightweight integrated identification method for garbage sorting images - Google Patents


Info

Publication number
CN116363138B
Authority
CN
China
Prior art keywords
lightweight integrated
training
lightweight
training set
classifier unit
Prior art date
Legal status
Active
Application number
CN202310638350.XA
Other languages
Chinese (zh)
Other versions
CN116363138A (en)
Inventor
梁桥康
邓淞允
秦海
邹坤霖
肖海华
方乐缘
汤琳
Current Assignee
Hunan University
Original Assignee
Hunan University
Priority date
Filing date
Publication date
Application filed by Hunan University
Priority to CN202310638350.XA
Publication of CN116363138A
Application granted
Publication of CN116363138B
Active legal status
Anticipated expiration

Classifications

    • G06T 7/0002 — Image analysis; inspection of images, e.g. flaw detection
    • B07C 5/00 — Sorting according to a characteristic or feature of the articles or material being sorted
    • G06V 10/764 — Image or video recognition or understanding using pattern recognition or machine learning: classification, e.g. of video objects
    • G06V 10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82 — Image or video recognition or understanding using neural networks
    • B07C 2501/0054 — Sorting of waste or refuse
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]


Abstract

The application discloses a lightweight integrated recognition method for garbage sorting images. A lightweight integrated recognition network is constructed, comprising a basic prediction model and two lightweight integrated classifier units B1 and B2; the basic prediction model is divided into a front-end structure and an end classifier unit B0. B1 and B2 share the same backbone structure as B0, with a channel attention mechanism added between the last convolution layer and the global pooling layer. The basic prediction model is trained first; the structural parameters of B1 and B2 before the channel attention mechanism are then fixed to be consistent with B0, and the remaining structural parameters of B1 and B2 are trained on the current training set. The training sets of B1 and B2 are updated according to the category of mispredicted results so as to differentiate the two units, and the corresponding classifier units are retrained. Finally, a voting module combines B0, B1 and B2 to perform lightweight integrated recognition of garbage sorting images. The method can greatly improve the classification precision of garbage sorting images.

Description

Lightweight integrated identification method for garbage sorting images
Technical Field
The application belongs to the field of image processing, and particularly relates to a lightweight integrated identification method for garbage sorting images.
Background
With the rapid development of science and technology and the acceleration of urbanization, the quality of life of urban residents continues to improve, which has led to a rapid increase in the output of household garbage. How to treat this garbage, and how to suppress the generation of more of it, is an urgent social problem. To build a better living environment, sorting is an essential means of treating household garbage, and how to sort garbage better is a current research hotspot.
The rapid classification of garbage images offers great convenience for sorting garbage with high quality and at high speed, and with the development of artificial intelligence, more and more deep learning methods, such as ResNet, VGGNet and GarbageNet, are applied to garbage sorting. However, household garbage is generated over wide areas and classification scenes differ, and at most treatment and collection sites it is difficult to use high-power equipment in place of manual sorting. Existing conventional deep learning models are often difficult to deploy on low-power devices because of their large parameter counts and computational complexity. Meanwhile, lightweight models struggle to reach a high level of classification precision and have poor robustness and generalization capability, so high-quality garbage image classification under low computing power remains an open problem.
Ensemble learning, a traditional machine learning method, is now widely applied to deep learning models at the frontier of image recognition: different deep learning models are fused to improve overall classification accuracy. However, fusing ensemble learning with deep learning greatly increases the overall computational complexity and parameter count of the model, which makes it difficult to apply in low-computing-power scenarios.
Disclosure of Invention
The application provides a lightweight integrated recognition method for garbage sorting images, in which two lightweight integrated classifier units are differentially retrained on sample sets augmented with different classes of prediction errors, and multiple prediction results are fused together with a basic prediction model, so that the sorting precision of garbage sorting images is greatly improved at the cost of only a small increase in parameters and computational complexity.
In order to achieve the technical purpose, the application adopts the following technical scheme:
a lightweight integrated recognition method for a trash sorting image, comprising:
step 1, obtaining a large number of marked garbage images, and constructing an original training set, a verification set and a test set; the original training set is used as the current training set of each lightweight integrated classifier unit;
step 2, constructing a lightweight integrated recognition network, which comprises a basic prediction model, two lightweight integrated classifier units B1 and B2 and a voting fusion module; the basic prediction model is divided into a front-end structure and a tail-end classifier unit B0, and initial parameters of the model are obtained by pre-training an image classification data set ImageNet; each lightweight integrated classifier unit adopts the same backbone structure as B0, and a channel attention mechanism is added between the last convolution layer and the subsequent global pooling layer; b0, B1 and B2 share the front end structure of the same basic training model during training and testing;
step 3, training the basic prediction model again by using the original training set;
step 4, fixing the structural parameters before the channel attention mechanism of the lightweight integrated classifier unit to be consistent with the corresponding structural parameters in the basic prediction model after retraining, and training the rest structural parameters of the lightweight integrated classifier unit by using the current training set; then using the verification set to calculate the prediction precision of the lightweight integrated classifier unit obtained in the training;
step 5, judging whether the prediction precision of the lightweight integrated classifier unit reaches a given precision;
if the prediction precision of the two lightweight integrated classifier units reaches the given precision, executing the step 6;
if the prediction precision of a certain lightweight integrated classifier unit does not reach a given precision, discarding the training of the lightweight integrated classifier unit in the step 4 in the current cycle, modifying the random factors of the lightweight integrated classifier unit and the random factors of the image input sequence, and executing the operations of the step 4 and the step 5 on the lightweight integrated classifier unit again;
step 6, the lightweight integrated recognition network obtained by the current training is used to perform integrated recognition on the validation set, and the integrated recognition precision curve is observed; if the curve converges, the network is used to perform integrated recognition on the test set, completing the lightweight integrated recognition of the test-set garbage sorting images; if the curve does not converge, continue with step 7;
step 7, inputting the current training sets of the two lightweight integrated classifier units into the lightweight integrated recognition network and the basic prediction model to obtain prediction results; for lightweight integrated classifier unit B1, screening the training-set samples whose prediction results belong to the first type of error, enhancing them, and adding them to the current training set of B1; for lightweight integrated classifier unit B2, screening the training-set samples whose prediction results belong to the second type of error, enhancing them, and adding them to its current training set; and re-executing steps 4-6 until the lightweight integrated recognition of the test-set garbage sorting images is completed.
Further, the basic prediction model adopts MobileNet V2.
Further, the first type of error and the second type of error are defined as: the method comprises the steps of randomly dividing the label category of the garbage sorting image into two major categories A1 and A2, defining the condition that the actual label in the current training set belongs to A1 and the predicted result belongs to A2 as a first type error, and defining the condition that the actual label in the current training set belongs to A2 and the predicted result belongs to A1 as a second type error.
Further, the label categories of the garbage images include: cardboard, metal, plastic, glass, paper, other waste.
Further, in the sample enhancement processing of step 7, each screened sample is replicated m times and enhancement processing is applied to n further copies, i.e. m + n new related samples are added for each screened sample;
for training-set samples whose prediction results belong to the first type of error, the offline enhancement mode comprises: (1) random horizontal and vertical flipping, (2) random rotation by a multiple of 90°, (3) random addition of Gaussian noise with 50% probability, (4) random filtering of the image with 50% probability;
for training set samples with prediction results belonging to the second class of errors, the offline enhancement mode comprises the following steps: (1) color equalization processing with 50% probability, (2) dithering color saturation with 50% probability, (3) changing image contrast over a random range interval.
Further, when the current training set obtained after the enhancement processing of step 7 is used to return to step 4 and retrain the lightweight integrated classifier unit, the online enhancement complexity applied to the garbage sorting images is reduced within a given amplitude range; the factors determining online enhancement complexity comprise: the range of random image cropping, the range of random color jitter, and the number of online enhancements applied.
Further, the smaller the range of random image cropping, the greater the online enhancement complexity; the larger the range of random color jitter, the greater the online enhancement complexity; and the greater the number of online enhancements applied, the greater the online enhancement complexity.
The method greatly improves the recognition accuracy of the model while only slightly increasing the computational complexity of the original basic prediction model, thereby further improving the recognition accuracy of existing lightweight garbage sorting image recognition models; it also overcomes, to a certain extent, the class imbalance of the original dataset, and enables real-time recognition on low-power, low-computing-power equipment. Compared with existing lightweight recognition techniques for garbage sorting images, the method has the following advantages:
(1) The training methods for the basic prediction model and the lightweight integrated classifier units can be built on any deep-learning-based image classification and recognition model, and are therefore universal.
(2) Compared with the prior art, the training method for the basic prediction model can effectively improve the recognition precision of the original model. In the embodiment, it effectively improves the classification performance of MobileNet V2 on garbage sorting images, achieving 94.42% test precision on the TrashNet dataset, a 3% improvement over the prior art.
(3) The application uses a voting strategy to fuse the basic prediction model and the two lightweight integrated classifier units. Through this end-side ensemble learning method, the classification precision of the original recognition model is effectively improved on a lightweight basis, and the robustness of the overall model is improved. In the embodiment, the method improves the test precision of the original model by 1.16% while increasing computational complexity by only 0.207 GFLOPs.
(4) The method is highly practical: based on lightweight techniques, it can conveniently be used to recognize garbage images on low-power, low-computing-power equipment, with higher recognition precision and generalization capability than the prior art.
Drawings
FIG. 1 is a training flow chart of the lightweight integrated unit according to the present application.
Fig. 2 is a schematic diagram of an overall recognition flow of the lightweight integrated recognition method for garbage sorting images according to the present application.
Fig. 3 is a schematic diagram of a channel attention module according to an embodiment of the application.
Fig. 4 is an iteration result of integrating a model on a verification set after training a lightweight integrated classifier unit according to an embodiment of the present application.
FIG. 5 is an iteration result of the integrated model on the test set after training the lightweight integrated classifier unit according to the embodiment of the present application.
Fig. 6 is a class activation diagram of different classifiers of a final model of an embodiment of the application for different classes of pictures on a TrashNet dataset.
Detailed Description
The following describes the embodiments of the present application in detail. They are developed on the basis of the technical solution of the application, with detailed implementations and specific operation procedures shown in the drawings, and further explain the technical solution of the application.
This example provides a lightweight integrated recognition method that can be applied in experiments or engineering using programming languages such as Python or C/C#/C++. As shown in fig. 1, it includes the following steps:
step 1, obtaining a large number of marked garbage images, and constructing an original training set, a verification set and a test set; the original training set is used as the current training set of each lightweight integrated classifier unit.
This example adopts the recoverable-garbage classification dataset TrashNet proposed by Stanford University, one of the most widely used openly available datasets in recoverable-garbage image classification. It consists of 2527 pictures covering garbage images of the following 6 label categories: cardboard, metal, plastic, glass, paper, other waste — 403 cardboard images, 482 plastic images, 501 glass images, 410 metal images, 594 paper images and 137 other-garbage images, with no overlap between categories. All images have a pixel size of 513×384.
Each image is explicitly assigned a garbage category, and the images are divided at a ratio of 70:13:17 to construct the training, validation and test sets, respectively.
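As a minimal illustration of this split (an assumed implementation — the patent does not specify the splitting code or the random seed), the 70:13:17 partition of the 2527 TrashNet images can be sketched as:

```python
import random

# Sketch (assumed implementation): split 2527 labeled garbage images
# into training / validation / test index sets at the 70:13:17 ratio.
def split_indices(n_total, ratios=(0.70, 0.13, 0.17), seed=0):
    """Return (train, val, test) index lists; the seed is an assumed choice."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    n_train = int(n_total * ratios[0])
    n_val = int(n_total * ratios[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(2527)
print(len(train_idx), len(val_idx), len(test_idx))  # 1768 328 431
```

The index lists can then be used to build the three image subsets with any data-loading framework.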
Step 2, constructing a lightweight integrated recognition network, which comprises a basic prediction model, two lightweight integrated classifier units B1 and B2 and a voting fusion module; the basic prediction model is divided into a front-end structure and a tail-end classifier unit B0, and initial parameters of the model are obtained by pre-training an image classification data set ImageNet; each lightweight integrated classifier unit adopts the same backbone structure as B0, and a channel attention mechanism is added between the last convolution layer and the subsequent global pooling layer; b0, B1 and B2 all share the front end structure of the same basic training model during training and testing. As shown in fig. 2 and 3.
The basic prediction model can be derived from an existing lightweight convolutional network model or built anew. In this embodiment, MobileNet V2 is adopted as the basic prediction model, the number of output categories of the final fully connected layer is set to 6, and the model is pre-trained on the image classification dataset ImageNet. The back-end part of MobileNet V2 comprises a convolution layer, a global pooling layer and a fully connected layer.
The two lightweight integrated classifier units B1, B2 have the same backbone structure as the end classifier unit B0 and add a channel attention mechanism between the last convolutional layer and the following global pooling layer.
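A squeeze-and-excitation style block is one common realization of such a channel attention mechanism. The patent does not disclose the exact gating design, so the reduction ratio and the sigmoid gate in this sketch are assumptions:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (a sketch; the
    reduction ratio r and the sigmoid gate are assumptions, since the
    patent does not specify the internal design)."""
    def __init__(self, channels: int, r: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) feature map from the last convolution layer
        w = x.mean(dim=(2, 3))                       # squeeze: (N, C)
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)   # excitation: (N, C, 1, 1)
        return x * w                                 # reweight channels

feat = torch.randn(2, 1280, 7, 7)   # MobileNet V2's last conv output shape
out = ChannelAttention(1280)(feat)
print(out.shape)                    # torch.Size([2, 1280, 7, 7])
```

The block is inserted between the last convolution layer and the global pooling layer of each classifier unit, so it leaves the feature-map shape unchanged.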
For an input sample x, the end classifier unit B0 of the basic prediction model and the two lightweight integrated classifier units B1 and B2 each output a C-dimensional vector (here C = 6 label categories), i.e.

y_i = [y_i(1), y_i(2), ..., y_i(C)], i = 0, 1, 2,

where y_i is the output vector of lightweight integrated classifier Bi (i = 1, 2) or of the basic prediction model's end classifier B0, and y_i(c) denotes the output in the c-th dimension after the network input sample x is applied to classifier Bi.

Because the adopted voting strategy needs to convert class probabilities into class labels, and in order to avoid ordering errors, each output is first normalized to the range 0-1, i.e.

p_i(c) = exp(y_i(c)) / Σ_j exp(y_i(j)).

The maximum probability is then set to 1 and the remaining probabilities are set to 0, i.e.

v_i(c) = 1 if c = argmax_j p_i(j), and v_i(c) = 0 otherwise.

The final voting result y* can then be expressed as

y* = argmax_c Σ_i v_i(c), i = 0, 1, 2.
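The probability-to-label conversion and voting fusion just described can be sketched in plain Python (tie-breaking toward the lowest class index is an assumption, as the source does not specify it):

```python
import math

def softmax(v):
    """Normalize raw outputs to class probabilities in the range 0-1."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def one_hot_vote(logits):
    """Convert one classifier's raw outputs to a one-hot vote:
    the maximum probability becomes 1, the rest become 0."""
    p = softmax(logits)
    k = p.index(max(p))
    return [1 if i == k else 0 for i in range(len(p))]

def ensemble_vote(all_logits):
    """Sum the one-hot votes of B0, B1 and B2 and return the winning class."""
    votes = [one_hot_vote(l) for l in all_logits]
    totals = [sum(col) for col in zip(*votes)]
    return totals.index(max(totals))

# Three classifiers over 6 garbage classes; B0 and B2 agree on class 3.
b0 = [0.1, 0.2, 0.0, 2.1, 0.3, 0.1]
b1 = [0.0, 1.9, 0.2, 0.4, 0.1, 0.0]
b2 = [0.2, 0.1, 0.3, 1.7, 0.0, 0.1]
print(ensemble_vote([b0, b1, b2]))  # 3
```

Because each classifier contributes exactly one vote, the fused prediction follows the majority even when one unit disagrees, which is the source of the robustness gain described below.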
And step 3, training the basic prediction model again by using the original training set.
The basic prediction model is trained on the constructed original training set, and within a certain iteration range the model with the highest prediction precision on the validation set is adopted as the base model for the subsequent iterative training of the integrated classifiers.
In this example, ubuntu 18.04.5 LTS and beyond is required for training, python3.8.5, pythoch 1.10.1, tonchvision 0.11.2, CUDA11.1 and beyond is required for system environment. The hardware platform needs to meet the requirements that the display card is NVIDIA GeForce RTX 3090 (24G), more than 16G exists in the hardware platform, and a hard disk with the capacity not lower than 256G is adopted. When model training is carried out, 600 epochs are trained in total, an optimizer is Adam, a loss function is a cross entropy loss function, a learning rate scheduler is oneyclelr, an initial learning rate is 0.0009, a maximum learning rate is 0.005, a learning rate modification frequency is 1, a learning rate rising round is 120, a final learning rate is 0.000001, and 256 pictures are loaded each time.
This example applies online enhancement to the images during training, using random horizontal and vertical flipping together with color jitter (ColorJitter) and random-range cropping of the image. The color jitter parameter is 0.12 and the random-range cropping parameter is 0.07; after online enhancement the image is resized to 224×224 and then input into the model for training.
In this example, the training procedure is repeated 5 times, and the model with the highest precision on the validation set is selected as the basic prediction model for the subsequent steps; if several models have the same precision, the one with the lowest loss is selected. The highest validation precision of the basic prediction model in this example is 95.40%, with a corresponding validation-set loss of 0.26508 and test-set precision of 94.42%, reached at epoch 548.
Step 4, fixing the structural parameters before the channel attention mechanism of the lightweight integrated classifier unit to be consistent with the corresponding structural parameters in the basic prediction model after retraining, and training the rest structural parameters of the lightweight integrated classifier unit by using the current training set; and then using the verification set to calculate the prediction precision of the lightweight integrated classifier unit obtained in the training.
In the application, fixing the structural parameters before the channel attention mechanism to be consistent with the corresponding structural parameters of the retrained basic prediction model provides a lower training cost for the subsequent integrated iterative training to reach its iteration condition.
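Fixing the structural parameters before the channel attention mechanism amounts to freezing the shared parameters and training only the attention mechanism and the layers after it. A sketch in PyTorch, where the module names `backbone`, `attention` and `fc` are hypothetical placeholders for the actual modules:

```python
import torch.nn as nn

def freeze_shared_backbone(unit: nn.Module, trainable_names=("attention", "fc")):
    """Freeze every parameter of a classifier unit except those of the
    channel attention mechanism and the layers after it.  The names
    'attention' and 'fc' are hypothetical; substitute the real module names."""
    for name, p in unit.named_parameters():
        p.requires_grad = any(name.startswith(t) for t in trainable_names)
    return [n for n, p in unit.named_parameters() if p.requires_grad]

# Toy unit: 'backbone' is frozen; 'attention' and 'fc' remain trainable.
unit = nn.Sequential()
unit.add_module("backbone", nn.Linear(8, 8))
unit.add_module("attention", nn.Linear(8, 8))
unit.add_module("fc", nn.Linear(8, 6))
print(freeze_shared_backbone(unit))
```

Only the parameters with `requires_grad=True` are then passed to the optimizer, which is what keeps the per-iteration training cost low.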
When training the lightweight integrated classifier unit B1 in this embodiment, the initial color jitter parameter is set to 0.28 and the random-range cropping parameter to 0.069, and after each iteration these parameters are adjusted according to the ratios 0.0997 and 1.01, respectively. Each round trains the training set for 150 epochs with fixed training parameters: initial learning rate 0.00024, maximum learning rate 0.0024, learning rate modification frequency 1, learning-rate rise of 60 epochs, and final learning rate 0.0000016; the remaining parameters are consistent with those used to train the basic prediction model.
When training the lightweight integrated classifier unit B2, the initial color jitter parameter is set to 0.24 and the random-range cropping parameter to 0.065, and after each iteration these parameters are adjusted according to the ratios 0.0997 and 1.01, respectively. Each round trains the training set for 150 epochs with fixed training parameters: initial learning rate 0.00029, maximum learning rate 0.002, learning rate modification frequency 1, learning-rate rise of 60 epochs, and final learning rate 0.0000013; the remaining parameters are consistent with those used to train the basic prediction model.
Step 5, judging whether the prediction precision of the lightweight integrated classifier unit reaches a given precision;
if the prediction precision of both lightweight integrated classifier units reaches the given precision, both units have low training error, so the increased difference between the classifiers can improve the recognition precision of the overall model, and step 6 can proceed;
if a certain lightweight integrated classifier unit cannot reach the given precision on the validation set by the maximum number of iterations, it would adversely affect the overall recognition precision of the integrated model; the training of that unit in step 4 of the current cycle is therefore abandoned, the random factors used to initialize the unfixed modules of the unit and the random factors of the image input order are modified, and steps 4 and 5 are executed again for that unit.
This embodiment sets the given precision to 95.40%, the highest validation precision reached when training the basic prediction model.
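The accept-or-retry logic of steps 4 and 5 can be summarized as a loop that re-trains with fresh random factors until the 95.40% threshold is met. A minimal runnable sketch, where `train_unit` is a hypothetical stand-in for one round of step-4 training and its simulated accuracies are illustrative only:

```python
import random

def train_unit(seed):
    """Hypothetical stand-in for one round of step-4 training that
    returns a simulated validation accuracy."""
    return random.Random(seed).uniform(0.90, 0.97)

def train_until_accurate(target=0.9540, max_tries=100, seed0=0):
    """Re-run training with fresh random factors (a new seed per attempt)
    until the unit reaches the given precision, as step 5 describes."""
    for attempt in range(max_tries):
        acc = train_unit(seed0 + attempt)  # new seed = new random factors
        if acc >= target:
            return attempt, acc
    raise RuntimeError("no classifier unit reached the given precision")

attempt, acc = train_until_accurate()
```

In the real method the "fresh random factors" are the initialization of the unfixed modules and the image input order, not merely a seed.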
Step 6, the lightweight integrated recognition network obtained by the current training is used for carrying out integrated recognition on the verification set, an integrated recognition precision curve is observed, if the curve converges, the lightweight integrated recognition network obtained by the current training is used for carrying out integrated recognition on the test set, and the lightweight integrated recognition on the garbage sorting image of the test set is completed; if the curve does not converge, the step 7 is continued.
Step 7, inputting the current training sets of the two lightweight integrated classifier units into the lightweight integrated recognition network and the basic prediction model to obtain a prediction result; aiming at a lightweight integrated classifier unit B1, screening training set samples in which a predicted result belongs to a first type of error, enhancing the training set samples, and adding the training set samples to a current training set of the lightweight integrated classifier unit B1; aiming at a lightweight integrated classifier unit B2, screening training set samples in which a predicted result belongs to a second class of errors, enhancing the training set samples, and adding the training set samples to a current training set; and re-executing the step 4-6 until the light integrated identification of the test set garbage sorting image is completed.
The present embodiment defines the first type of error and the second type of error as: the method comprises the steps that the label categories of the garbage sorting images are randomly divided into two major categories A1 and A2, wherein the major category A1 comprises three label categories of cardboard, metal and plastic, and the major category A2 comprises three label categories of glass, paper and other garbage; and then defining the condition that the actual label in the current training set belongs to A1 and the predicted result belongs to A2 as a first type error, and defining the condition that the actual label in the current training set belongs to A2 and the predicted result belongs to A1 as a second type error.
For the lightweight integrated classifier unit B1, training-set samples whose prediction results belong to the first type of error are screened and, after enhancement processing, added to its current training set. For the lightweight integrated classifier unit B2, training-set samples whose prediction results belong to the second type of error are screened and, after enhancement processing, added to its current training set. The enhancement processing replicates each screened sample m times and applies offline enhancement to n further copies, i.e. adds m + n new related samples for each screened sample. This embodiment sets m = 0, i.e. the erroneous samples are augmented only by offline image enhancement rather than by copying the original image.
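The screening rule of this embodiment can be sketched directly from the A1/A2 definition above (the label strings are illustrative placeholders for the six TrashNet categories):

```python
# Sketch of the step-7 screening rule; A1 and A2 follow the embodiment's
# division of the six label categories into two major classes.
A1 = {"cardboard", "metal", "plastic"}
A2 = {"glass", "paper", "other"}

def error_type(true_label, predicted_label):
    """Return 1 for a first-type error (actual in A1, predicted in A2),
    2 for a second-type error (actual in A2, predicted in A1), else 0."""
    if true_label in A1 and predicted_label in A2:
        return 1
    if true_label in A2 and predicted_label in A1:
        return 2
    return 0

# B1's training set grows with first-type errors, B2's with second-type.
samples = [("metal", "glass"), ("paper", "plastic"), ("glass", "glass")]
extra_b1 = [s for s in samples if error_type(*s) == 1]
extra_b2 = [s for s in samples if error_type(*s) == 2]
print(extra_b1, extra_b2)
```

Each screened sample would then pass through the offline enhancement operations of its unit before being appended to that unit's current training set.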
During offline enhancement, every item in the corresponding column is applied to each sample, i.e. each row of the corresponding column in Table 1 is enhanced once, and finally the enhancement result is output. The offline enhancement modes are detailed in Table 1 below.

Table 1. Offline enhancement modes
B1 (first-type errors): (1) random horizontal and vertical flipping; (2) random rotation by a multiple of 90°; (3) random addition of Gaussian noise with 50% probability; (4) random filtering of the image with 50% probability.
B2 (second-type errors): (1) color equalization processing with 50% probability; (2) color-saturation jitter with 50% probability; (3) changing image contrast over a random range interval.
The lightweight integrated classifier units B1 and B2 use different training parameter settings in order to achieve more iteration rounds and more differentiated training; their training iterations can be performed in parallel.
The method finally completes 14 iterations within 100 trials, reaching an integration accuracy of 96.01% on the validation set and 95.58% on the test set, 1.16% higher on the test set than the basic prediction model. The total computational complexity of the final integrated model is 0.527 GFLOPs with 12.9M parameters, only 0.207 GFLOPs and 9.4M more than the basic prediction model, respectively.
No existing integrated model on the TrashNet dataset adopts a lightweight ensemble learning method. Table 2 compares the method of the present application with other existing integrated models trained on the TrashNet dataset, showing that the present method is ahead of their training results in accuracy, computational complexity, or parameter count.
Existing garbage sorting image recognition methods based on a single deep network model struggle to balance training accuracy and model computation. Table 3 compares the performance of the method of the present application with current advanced recognition methods on the TrashNet dataset; the present method achieves extremely high recognition accuracy at extremely low computational complexity. Compared with ViT-B/32, the most accurate method in the table, the present method obtains nearly the same recognition accuracy with more than 10 times lower computational complexity; meanwhile, its recognition accuracy is 4.16% higher than that of the existing method using MobileNetV2, with only 0.207 GFLOPs more computation.
In the embodiment of the present application, the iteration results on the validation set and the test set are shown in Figs. 4 and 5 respectively, and the recognition effect of the final model is shown in Fig. 6, which uses the Grad-CAM method to depict the attention regions of the three classifiers of the final integrated model on different types of images. The different models attend to different regions of the same picture, which increases the attention difference among them, gives the models better diversity, and can overcome the class imbalance of the original dataset to a certain extent. Meanwhile, the techniques proposed by the method during integrated model training, such as the training accuracy threshold, guarantee the basic prediction capability of the different lightweight integrated classifier units, so that the final recognition accuracy is greatly improved.
In the present application, error samples of different error types in the training set are iteratively enhanced by a differentiated enhancement method and then used to retrain the different lightweight integrated classifier units, yielding lightweight integrated classifier units with differentiated recognition capabilities. The different models thus have better diversity and can overcome the class imbalance of the original dataset to a certain extent; combined with the original classifier, the classification accuracy is greatly improved at the cost of only a slight increase in parameter count and computational complexity.
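The overall iterative scheme of steps 4-7 can be condensed into a control-loop skeleton; all callables, the convergence tolerance, and the iteration cap are illustrative assumptions standing in for the components described in the text:

```python
def iterative_ensemble_training(train_set, val_set, train_unit, ensemble_acc,
                                screen_errors, enhance, given_acc,
                                max_iters=100, tol=1e-3):
    """Skeleton of the iterative scheme: retrain each classifier unit until
    it clears the accuracy threshold (steps 4-5), check whether the ensemble
    accuracy curve on the validation set has converged (step 6), and if not,
    grow each unit's training set with its own enhanced error samples (step 7)."""
    train_sets = {"B1": list(train_set), "B2": list(train_set)}
    history = []
    for _ in range(max_iters):
        for name in ("B1", "B2"):
            # each call to train_unit retrains with fresh random factors (step 5)
            while train_unit(name, train_sets[name], val_set) < given_acc:
                pass
        history.append(ensemble_acc(val_set))
        if len(history) >= 2 and abs(history[-1] - history[-2]) < tol:
            break  # integrated accuracy curve has converged (step 6)
        for name, err_type in (("B1", 1), ("B2", 2)):
            errs = screen_errors(train_sets[name], err_type)
            train_sets[name].extend(enhance(s) for s in errs)  # step 7
    return history
```

In this sketch B1 and B2 are trained sequentially for simplicity; as noted above, the two units' training iterations can also run in parallel since they share only the fixed front-end structure.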
The above embodiments are preferred embodiments of the present application; those skilled in the art may make various changes or modifications thereto without departing from the general inventive concept, and such changes or modifications should be construed as falling within the scope of protection claimed by the present application.

Claims (5)

1. A lightweight integrated recognition method for a trash sorting image, comprising:
step 1, obtaining a large number of marked garbage images, and constructing an original training set, a verification set and a test set; the original training set is used as the current training set of each lightweight integrated classifier unit;
step 2, constructing a lightweight integrated recognition network, which comprises a basic prediction model, two lightweight integrated classifier units B1 and B2 and a voting fusion module; the basic prediction model is divided into a front-end structure and a tail-end classifier unit B0, and initial parameters of the model are obtained by pre-training an image classification data set ImageNet; each lightweight integrated classifier unit adopts the same backbone structure as B0, and a channel attention mechanism is added between the last convolution layer and the subsequent global pooling layer; b0, B1 and B2 share the front end structure of the same basic training model during training and testing;
step 3, training the basic prediction model again by using the original training set;
step 4, fixing the structural parameters before the channel attention mechanism of the lightweight integrated classifier unit to be consistent with the corresponding structural parameters in the basic prediction model after retraining, and training the rest structural parameters of the lightweight integrated classifier unit by using the current training set; then using the verification set to calculate the prediction precision of the lightweight integrated classifier unit obtained in the training;
step 5, judging whether the prediction precision of the lightweight integrated classifier unit reaches a given precision;
if the prediction precision of the two lightweight integrated classifier units reaches the given precision, executing the step 6;
if the prediction precision of a certain lightweight integrated classifier unit does not reach a given precision, discarding the training of the lightweight integrated classifier unit in the step 4 in the current cycle, modifying the random factors of the lightweight integrated classifier unit and the random factors of the image input sequence, and executing the operations of the step 4 and the step 5 on the lightweight integrated classifier unit again;
step 6, the lightweight integrated recognition network obtained by the current training is used for carrying out integrated recognition on the verification set, an integrated recognition precision curve is observed, if the curve converges, the lightweight integrated recognition network obtained by the current training is used for carrying out integrated recognition on the test set, and the lightweight integrated recognition on the garbage sorting image of the test set is completed; if the curve is not converged, continuing the step 7;
step 7, inputting the current training sets of the two lightweight integrated classifier units into the lightweight integrated recognition network and the basic prediction model to obtain prediction results; for the lightweight integrated classifier unit B1, screening the training set samples whose prediction results belong to the first type of error, enhancing them, and adding them to the current training set of B1; for the lightweight integrated classifier unit B2, screening the training set samples whose prediction results belong to the second type of error, enhancing them, and adding them to the current training set; re-executing steps 4-6 until the lightweight integrated identification of the test set garbage sorting images is completed;
the first type of error and the second type of error are defined as follows: the label categories of the garbage sorting images are randomly divided into two major categories A1 and A2; a case where the actual label in the current training set belongs to A1 but the prediction result belongs to A2 is defined as a first-type error, and a case where the actual label in the current training set belongs to A2 but the prediction result belongs to A1 is defined as a second-type error.
2. The lightweight integrated recognition method for garbage sorting images according to claim 1, wherein the base prediction model employs MobileNetV2.
3. The lightweight integrated recognition method for garbage sorting images according to claim 1, wherein the label categories of the garbage images comprise: cardboard, metal, plastic, glass, paper, and other garbage.
4. The lightweight integrated recognition method for garbage sorting images according to claim 1, wherein the sample enhancement processing in step 7 specifically comprises copying each screened sample n1 times and generating n2 enhanced copies, i.e., adding n1 + n2 related new samples for each screened sample;
for training set samples whose prediction results belong to the first type of error, the offline enhancement modes comprise: (1) random horizontal and vertical flipping, (2) random rotation by a multiple of 90°, (3) random addition of Gaussian noise with 50% probability, (4) random filtering of the image with 50% probability;
for training set samples whose prediction results belong to the second type of error, the offline enhancement modes comprise: (1) color equalization processing with 50% probability, (2) color saturation jittering with 50% probability, (3) changing the image contrast over a random range interval.
5. The lightweight integrated recognition method for garbage sorting images according to claim 1, wherein, when returning to step 4 to retrain the lightweight integrated classifier unit using the current training set obtained after the enhancement processing of step 7, the online enhancement complexity of the garbage sorting images is reduced within a given magnitude range; the online enhancement complexity is determined by: the range of random image cropping, the range of random color jittering, and the number of online enhancement modes applied;
the smaller the random cropping range of the image, the greater the online enhancement complexity; the larger the random color jittering range, the greater the online enhancement complexity; and the greater the number of online enhancement modes applied, the greater the online enhancement complexity.
CN202310638350.XA 2023-06-01 2023-06-01 Lightweight integrated identification method for garbage sorting images Active CN116363138B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310638350.XA CN116363138B (en) 2023-06-01 2023-06-01 Lightweight integrated identification method for garbage sorting images

Publications (2)

Publication Number Publication Date
CN116363138A CN116363138A (en) 2023-06-30
CN116363138B true CN116363138B (en) 2023-08-22

Family

ID=86923401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310638350.XA Active CN116363138B (en) 2023-06-01 2023-06-01 Lightweight integrated identification method for garbage sorting images

Country Status (1)

Country Link
CN (1) CN116363138B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111553227A (en) * 2020-04-21 2020-08-18 东南大学 Lightweight face detection method based on task guidance
CN111714118A (en) * 2020-06-08 2020-09-29 北京航天自动控制研究所 Brain cognition model fusion method based on ensemble learning
CN113065558A (en) * 2021-04-21 2021-07-02 浙江工业大学 Lightweight small target detection method combined with attention mechanism
CN114120019A (en) * 2021-11-08 2022-03-01 贵州大学 Lightweight target detection method
CN115471704A (en) * 2022-09-21 2022-12-13 云南大学 Skin disease image identification and classification method and system
CN115512399A (en) * 2021-06-04 2022-12-23 长沙理工大学 Face fusion attack detection method based on local features and lightweight network
CN115909011A (en) * 2022-12-27 2023-04-04 中国科学院国家天文台南京天文光学技术研究所 Astronomical image automatic classification method based on improved SE-inclusion-v 3 network model

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230011635A1 (en) * 2021-07-09 2023-01-12 Viettel Group Method of face expression recognition
US20230076575A1 (en) * 2021-09-03 2023-03-09 Nec Laboratories America, Inc. Model personalization system with out-of-distribution event detection in dialysis medical records

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Garbage image classification algorithm based on improved MobileNet v2"; Chen Zhichao; Journal of Zhejiang University (Engineering Science); pp. 1-10 *

Also Published As

Publication number Publication date
CN116363138A (en) 2023-06-30

Similar Documents

Publication Publication Date Title
Gu et al. Stack-captioning: Coarse-to-fine learning for image captioning
CN109271522B (en) Comment emotion classification method and system based on deep hybrid model transfer learning
CN111126386B (en) Sequence domain adaptation method based on countermeasure learning in scene text recognition
Zhang et al. Fine-grained scene graph generation with data transfer
CN101968853B (en) Improved immune algorithm based expression recognition method for optimizing support vector machine parameters
CN110033008B (en) Image description generation method based on modal transformation and text induction
Lee et al. Query-efficient and scalable black-box adversarial attacks on discrete sequential data via bayesian optimization
CN109344884A (en) The method and device of media information classification method, training picture classification model
CN108459999B (en) Font design method, system, equipment and computer readable storage medium
CN112699247A (en) Knowledge representation learning framework based on multi-class cross entropy contrast completion coding
CN110188195B (en) Text intention recognition method, device and equipment based on deep learning
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN114092742B (en) Multi-angle-based small sample image classification device and method
CN112819063B (en) Image identification method based on improved Focal loss function
CN109670559A (en) Recognition methods, device, equipment and the storage medium of handwritten Chinese character
CN112883931A (en) Real-time true and false motion judgment method based on long and short term memory network
Pan et al. A Novel Combinational Convolutional Neural Network for Automatic Food-Ingredient Classification.
CN115909011A (en) Astronomical image automatic classification method based on improved SE-inclusion-v 3 network model
Das et al. Determining attention mechanism for visual sentiment analysis of an image using svm classifier in deep learning based architecture
Xu et al. Resilient binary neural network
CN110287981B (en) Significance detection method and system based on biological heuristic characterization learning
Kong et al. 3lpr: A three-stage label propagation and reassignment framework for class-imbalanced semi-supervised learning
Lukic et al. Galaxy classifications with deep learning
CN112883930A (en) Real-time true and false motion judgment method based on full-connection network
CN116363138B (en) Lightweight integrated identification method for garbage sorting images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant