CN114332075A

CN114332075A - Rapid structural defect identification and classification method based on lightweight deep learning model

Info

Publication number: CN114332075A
Application number: CN202210073320.4A
Authority: CN
Inventors: 陈柳洁; 姚皓东; 傅继阳
Original assignee: Guangzhou University
Current assignee: Guangzhou University
Priority date: 2022-01-21
Filing date: 2022-01-21
Publication date: 2022-04-12

Abstract

A method for rapidly identifying and classifying structural defects based on a lightweight deep learning model comprises the following steps: performing semantic segmentation on collected defect images with a VGG16-U-Net model to remove image background noise interference; constructing an EfficientNet B0 model and training it to obtain a trained EfficientNet B0 model; and inputting the semantically segmented defect images into the trained EfficientNet B0 model for recognition and outputting the recognition results. The model performs accurately when evaluated on data, and model size and training speed are well balanced. Training speed and recognition speed are increased; the learning rate is adjusted by a stochastic gradient descent algorithm with cosine annealing, which helps keep the model from becoming trapped in a local minimum and lets it search quickly for the global minimum.

Description

Rapid structural defect identification and classification method based on lightweight deep learning model

Technical Field

The invention relates to the technical field of defect identification, in particular to a method for quickly identifying and classifying structural defects based on a lightweight deep learning model.

Background

With rapid economic development, structures such as buildings, bridges, dams and industrial facilities are being built at an ever faster pace. These structures suffer damage and aging over long-term use, so they must be inspected and repaired regularly to prevent safety accidents.

Therefore, how to provide an image defect identification method with high accuracy is a problem that those skilled in the art urgently need to solve.

Disclosure of Invention

The embodiment of the application provides a method for quickly identifying and classifying structural defects based on a lightweight deep learning model, and aims to solve the problem of low accuracy of the existing image defect identification technology.

In a first aspect, the application provides a method for quickly identifying and classifying structural defects based on a lightweight deep learning model, which includes:

performing semantic segmentation processing on the acquired defect image by using a VGG16-U-Net model to remove image background noise interference;

constructing an EfficientNet B0 model, and training the EfficientNet B0 model to obtain a trained EfficientNet B0 model;

and inputting the defect image subjected to semantic segmentation processing into a trained EfficientNet B0 model for recognition, and outputting a recognition result.

In a second aspect, the present application further provides a system for rapidly identifying and classifying structural defects based on a lightweight deep learning model, the system comprising:

the semantic segmentation unit is used for performing semantic segmentation processing on the acquired defect image by using a VGG16-U-Net model to remove image background noise interference;

the model training unit is used for constructing an EfficientNet B0 model, training the EfficientNet B0 model and obtaining a trained EfficientNet B0 model;

and the defect identification unit is used for inputting the defect image subjected to semantic segmentation processing into a trained EfficientNet B0 model for identification and outputting an identification result.

In a third aspect, the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the method for quickly identifying and classifying structural defects based on a lightweight deep learning model according to the first aspect.

In a fourth aspect, the present application further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the processor is caused to execute the method for quickly identifying and classifying a structural defect based on a lightweight deep learning model according to the first aspect.

The method adopts the lightweight EfficientNet B0 model, which is small and fast and is easy to integrate into mobile terminals such as mobile phones or dashboard cameras. The model performs accurately when evaluated on data, and model size and training speed are well balanced. Training speed and recognition speed are increased; the learning rate is adjusted by a stochastic gradient descent algorithm with cosine annealing, which helps keep the model from becoming trapped in a local minimum and lets it search quickly for the global minimum. By using the Softmax function, precision loss is reduced, efficiency is improved by about 15%, training cost is lowered, and the performance of the convolutional neural network is markedly improved.

Drawings

To describe the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a method for quickly identifying and classifying structural defects based on a lightweight deep learning model according to an embodiment of the present application;

Fig. 2 is a structural diagram of a VGG16-U-Net model provided in an embodiment of the present application;

Fig. 3 is a structural diagram of an EfficientNet B0 model provided in an embodiment of the present application;

Fig. 4 is a structural diagram of MBconvk provided in an embodiment of the present application;

Fig. 5 is a block diagram of an SE module provided in an embodiment of the present application.

Detailed Description

The following further describes embodiments of the present invention with reference to the drawings. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

Referring to Fig. 1, a flowchart of the method for quickly identifying and classifying structural defects based on a lightweight deep learning model, the method comprises the following steps:

S101, performing semantic segmentation processing on the acquired defect image by using a VGG16-U-Net model, and removing image background noise interference.

Fig. 2 shows the structure of the VGG16-U-Net model provided by this embodiment.

The VGG16-U-Net model consists of three parts: an encoding layer, a decoding layer and a final convolutional layer. Parameter initialization is completed through transfer learning, using an already trained VGG16-U-Net model. The image enters the encoder through the input layer; the encoder extracts image features, which are passed to the decoder through skip connections; the deconvolution layers of the decoder upsample the features and gradually restore them to the original size of the image; and the final convolutional layer predicts a class for each pixel in the image, with pixels of different classes marked in different colors.
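The encoder and decoder resolutions described above can be traced with a short Python sketch. It is only an illustration under stated assumptions: the channel widths are the standard VGG16 block widths, four pooling stages are assumed, the decoder is assumed to emit as many channels as the skip connection it merges with, and the helper name `unet_shapes` is hypothetical rather than taken from the patent.

```python
# Standard VGG16 convolution block widths (assumed to define the encoder stages).
VGG16_CHANNELS = (64, 128, 256, 512, 512)

def unet_shapes(size, channels=VGG16_CHANNELS):
    """Trace (channels, height, width) through a U-Net style encoder and decoder.

    `size` should be divisible by 2 ** (len(channels) - 1) so that every
    upsampled map lines up with its skip connection.
    """
    encoder = []
    side = size
    for index, width in enumerate(channels):
        encoder.append((width, side, side))
        if index < len(channels) - 1:
            side //= 2  # 2x2 max pooling halves the resolution
    decoder = []
    for width, skip_side, _ in reversed(encoder[:-1]):
        side *= 2  # a deconvolution (transposed convolution) doubles it
        assert side == skip_side, "skip connection must match in resolution"
        decoder.append((width, side, side))
    return encoder, decoder

encoder, decoder = unet_shapes(256)
for stage in encoder:
    print("encoder", stage)
for stage in decoder:
    print("decoder", stage)
```

The last decoder stage has the same height and width as the input, which is what lets the final convolutional layer emit one prediction per input pixel.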

In one embodiment, training of the VGG16-U-Net model is repeated over many rounds until the model fully converges. Each round comprises forward propagation and backward propagation: an image is fed into the model through the input layer, the model outputs a result, the Dice loss is computed between that result and the pixel-level labeled image, the loss is propagated backward through the model, the parameter weights of each layer are updated by stochastic gradient descent, and the next round begins. During training, stochastic gradient descent with cosine annealing is used to adjust the learning rate continuously and keep the model from becoming trapped at a local optimum. The performance of the model is tested after each round of training; once the set number of training rounds is reached, the model that performed best on the test set is saved and used as the trained VGG16-U-Net model.

Defect images contain a large amount of background noise, and handling that noise efficiently is the key to automatic defect identification. The application uses VGG16-U-Net for image semantic segmentation to remove image background noise. Semantic segmentation is an important branch of deep-learning-based computer vision. Unlike a classification task, semantic segmentation must determine the class of every pixel of the image in order to segment it accurately. U-Net is a fully convolutional neural network developed for biomedical image segmentation. It adopts a pyramid structure and uses deconvolution layers to upsample the output of the last convolutional layer back to the size of the original input image, so that each pixel is predicted while the spatial position information of the input image is preserved. Its convolutional encoding and convolutional decoding paths are completely symmetrical. The shallow layers address pixel localization, and the deeper layers address pixel classification. Low-level feature maps are combined into high-level complex features, which enables accurate localization and solves the image segmentation problem; it is currently a fully convolutional model with good extensibility.

In one embodiment, the encoder of the U-Net adopts the first 15 layers of VGG16. VGG16 uses 3 × 3 convolution kernels in place of the larger kernels of conventional convolutional neural networks, which makes the model more compact and efficient. To prevent overfitting caused by an excessive number of model parameters, dropout layers are added between the convolutional layers. To extract features quickly, the parameters of the encoder are initialized from the parameters of a trained VGG16 model through transfer learning, which speeds up convergence of the network and also improves its generalization ability. In the decoder, deconvolution layers upsample the features and gradually restore them to the original size of the image. A 1 × 1 convolutional layer with a sigmoid activation function follows the decoder and generates a prediction for each pixel in the image. The encoder and the decoder are linked by skip connections, which completes the VGG16-U-Net fully convolutional neural network and allows defects such as cracks to be separated quickly from a complex image background. VGG16-U-Net classifies the pixels of an image by predicting a class for each one: pixels belonging to background noise are shown in black and pixels belonging to defects such as cracks are shown in white, so that the defects are separated from the background noise while their shape and orientation are retained.

The U-Net model with VGG16 as its encoder combines the advantages of VGG and the traditional U-Net. It resolves the under-segmentation seen in traditional segmentation networks for defects such as cracks, is more robust, and better identifies the details of small cracks and similar defects that may be overlooked in manual labeling.
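As an illustration of the black and white marking described above, the following Python sketch turns a per-pixel sigmoid output into a binary mask and uses it to blank out the background. The 0.5 threshold and the function names `binarize` and `apply_mask` are assumptions made for this example; the patent does not specify them.

```python
def binarize(prob_map, threshold=0.5):
    """Map per-pixel defect probabilities to white (255) or black (0).

    The 0.5 threshold is an assumed default, not a value from the patent.
    """
    return [[255 if p >= threshold else 0 for p in row] for row in prob_map]

def apply_mask(image, mask):
    """Keep grayscale pixels under white mask pixels and zero out the rest."""
    return [
        [px if m == 255 else 0 for px, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]

# Toy 2x2 example: two pixels are predicted as defect, two as background.
probabilities = [[0.9, 0.2], [0.4, 0.7]]
gray_image = [[10, 20], [30, 40]]
mask = binarize(probabilities)
cleaned = apply_mask(gray_image, mask)
print(mask)
print(cleaned)
```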

S102, constructing an EfficientNet B0 model, and training the EfficientNet B0 model to obtain a trained EfficientNet B0 model.

Fig. 3 shows the structure of the EfficientNet B0 model provided by this embodiment.

The specific structure of MBconvk is shown in Fig. 4. The feature matrix input to MBconvk first passes through a 1 × 1 convolutional layer that raises its dimension, followed by Batch Normalization and swish activation. It then passes through a k × k depthwise separable convolutional layer, again followed by Batch Normalization and swish activation, and then through an SE module, whose structure is shown in Fig. 5. The SE module is followed by a 1 × 1 convolutional layer that takes the output of the SE module as its input and is itself followed by Batch Normalization. Finally, a dropout layer deactivates a certain proportion of units.
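The SE (squeeze-and-excitation) step inside the block can be sketched in plain Python. This is a minimal illustration, assuming the usual SE layout of global average pooling, a reducing fully connected layer with swish, an expanding fully connected layer with sigmoid, and channel-wise rescaling; biases are omitted, and the weight matrices `w_reduce` and `w_expand` are hypothetical inputs, not values from the patent.

```python
import math

def swish(x):
    """swish(x) = x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def squeeze_excite(feature_maps, w_reduce, w_expand):
    """Rescale each channel by a gate computed from global channel statistics.

    feature_maps: list of C channels, each a 2D list of floats.
    w_reduce:     R x C weight rows (squeeze C channels down to R values).
    w_expand:     C x R weight rows (expand back to one gate per channel).
    """
    # Squeeze: global average pooling, one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation: two small fully connected layers (biases omitted here).
    hidden = [swish(sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    return [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]

# With all-zero weights every gate is sigmoid(0) = 0.5, so the input is halved.
maps = [[[2.0, 4.0]], [[6.0, 8.0]]]
scaled = squeeze_excite(maps, w_reduce=[[0.0, 0.0]], w_expand=[[0.0], [0.0]])
print(scaled)
```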

After the EfficientNet B0 model is constructed, it must be trained.

Training of the EfficientNet B0 model is repeated over many rounds until the model fully converges. Each round comprises forward propagation and backward propagation: an image is fed into the model through the input layer, the model outputs a result, the cross entropy is computed between that result and the true label of the image, the resulting cross entropy loss is propagated backward through the model, the parameter weights of each layer are updated by stochastic gradient descent, and the next round begins. During training, stochastic gradient descent with cosine annealing is used to adjust the learning rate continuously and keep the model from becoming trapped at a local optimum. The performance of the model is tested after each round of training; once the set number of training rounds is reached, the model that performed best on the test set is saved and used as the final EfficientNet B0 model.
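The "train, test after each round, keep the best model" procedure can be outlined as a small Python loop. This is a sketch only: `train_one_round` and `evaluate` are hypothetical callables standing in for the forward and backward pass and for the performance test, and the toy dictionary model exists purely so the example runs.

```python
import copy

def fit(model, train_one_round, evaluate, num_rounds):
    """Train for a fixed number of rounds and return the best-scoring snapshot."""
    best_score = float("-inf")
    best_model = None
    for round_index in range(num_rounds):
        train_one_round(model, round_index)  # forward pass, loss, backward pass, update
        score = evaluate(model)              # performance test after this round
        if score > best_score:
            best_score = score
            best_model = copy.deepcopy(model)  # snapshot, so later rounds cannot overwrite it
    return best_model, best_score

# Toy stand-ins: the "model" is a dict and its score peaks in the second round.
toy_scores = {1: 0.6, 2: 0.9, 3: 0.7}

def toy_train(model, round_index):
    model["w"] += 1

def toy_evaluate(model):
    return toy_scores[model["w"]]

best_model, best_score = fit({"w": 0}, toy_train, toy_evaluate, num_rounds=3)
print(best_model, best_score)
```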

In an embodiment, EfficientNet B0 is trained using the acquired defect image data set, which is divided into a training set, a validation set and a test set in a ratio of 6:2:2. The training set is used to train EfficientNet B0 to automatically identify defects such as cracks; the validation set is used to monitor the accuracy with which the model identifies such defects during training; the test set does not take part in training and is used for the final test of model performance. During training, the learning rate is adjusted continuously; the performance of the model is tested after training, and the model that performs best on the test set is saved.
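A 6:2:2 split of this kind might look as follows in Python. The shuffling, the fixed seed and the function name `split_dataset` are assumptions added for the example; the patent states only the ratio.

```python
import random

def split_dataset(items, ratios=(6, 2, 2), seed=0):
    """Shuffle items and split them into training, validation and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # the remainder goes to the test set
    return train, val, test

train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

Integer arithmetic is used for the set sizes so that the counts do not depend on floating-point rounding.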

In the past, researchers had to scale the network depth, network width and image resolution of a CNN up or down manually; with limited computing resources, only a single model dimension could be adjusted at a time, and the best combination of dimensions could not be found. This method uses the EfficientNet B0 model, in which a set of fixed scaling coefficients scales the model dimensions uniformly, so that the model achieves higher accuracy and efficiency while remaining lightweight, and it obtains good results on ImageNet.
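The idea of uniform scaling can be made concrete with a short Python sketch. The coefficients below (1.2 for depth, 1.1 for width, 1.15 for resolution) are the values reported for the EfficientNet family in its original publication and are not stated in this patent; the exponent `phi` and the function names are likewise introduced here for illustration, with `phi = 0` corresponding to the B0 baseline.

```python
# Compound scaling coefficients reported for the EfficientNet family
# (an assumption relative to this patent, which does not list them).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    """Return depth, width and resolution multipliers for compound coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }

def flops_factor(phi):
    """Approximate growth in FLOPs, proportional to depth * width^2 * resolution^2."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi

baseline = compound_scale(0)   # B0: all multipliers equal 1
one_step = flops_factor(1)     # close to 2, so each step roughly doubles FLOPs
print(baseline, one_step)
```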

During training, the model easily becomes trapped at a local optimum. The stochastic gradient descent algorithm with cosine annealing, proposed in 2016, adjusts the learning rate over a set period so that the model can jump out of a local optimum and move toward the global optimum. When stochastic gradient descent would otherwise settle into a local minimum during training, this adjustment helps it escape and find a path leading to the global minimum.

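Therefore, the stochastic gradient descent algorithm with cosine annealing adopted by the method is as follows:

$$\eta_t = \eta_{\min}^{i} + \frac{1}{2}\left(\eta_{\max}^{i} - \eta_{\min}^{i}\right)\left(1 + \cos\left(\frac{T_{cur}}{T_i}\pi\right)\right)$$

where $i$ is the index of the restart; $\eta_{\max}^{i}$ and $\eta_{\min}^{i}$ are the maximum value and the minimum value of the learning rate, respectively; $T_{cur}$ is the number of epochs currently executed; and $T_i$ is the total number of epochs in the $i$-th restart.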
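A cosine annealing schedule with restarts can be sketched in Python as below, assuming the usual form in which the learning rate falls from its maximum to its minimum along half a cosine period over T_i epochs and then jumps back to the maximum. The initial period `t_0` and the period multiplier `t_mult` are assumptions for the example; the patent does not give values for them.

```python
import math

def cosine_annealing_lr(t_cur, t_i, eta_min, eta_max):
    """Learning rate after t_cur of the t_i epochs in the current restart cycle."""
    return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + math.cos(math.pi * t_cur / t_i))

def sgdr_schedule(num_epochs, t_0, t_mult, eta_min, eta_max):
    """List the per-epoch learning rates, restarting whenever a cycle ends."""
    rates = []
    t_cur, t_i = 0, t_0
    for _ in range(num_epochs):
        rates.append(cosine_annealing_lr(t_cur, t_i, eta_min, eta_max))
        t_cur += 1
        if t_cur >= t_i:   # cycle finished: restart at eta_max
            t_cur = 0
            t_i *= t_mult  # optionally lengthen the next cycle
    return rates

rates = sgdr_schedule(num_epochs=6, t_0=2, t_mult=2, eta_min=0.0, eta_max=1.0)
print([round(r, 3) for r in rates])
```

The jump back to the maximum at each restart is what gives the optimizer a chance to leave a local minimum it has settled into.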

The loss functions adopted in this patent are an improved cross entropy loss function and a Dice loss function. In the weight-parameter optimization of deep learning, the loss function evaluates the degree of inconsistency between the model's predicted value and the true value, and it is the objective function optimized in a neural network. Compared with other loss functions, the cross entropy loss function is more robust on noisy data, more accurate on data with little noise, and can converge to a better local minimum. To improve computational efficiency, the output of the last fully connected layer of the neural network is activated by a softmax function, which maps the output into the interval (0, 1), and the result is then processed by the cross entropy loss function.

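The cross entropy loss function is:

$$C = -\frac{1}{n}\sum_{x}\left[y\ln a + (1-y)\ln(1-a)\right]$$

where $n$ is the number of samples, $y$ is the actual label, $a$ is the predicted output, and the sum runs over the samples $x$.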

The softmax activation function is:

$$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{c} e^{z_j}}$$

where $z_i$ is the output value of the $i$-th node and $c$ is the number of output nodes.
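Both functions can be sketched in a few lines of Python. The example assumes the per-sample cross entropy form -[y ln a + (1 - y) ln(1 - a)] averaged over n samples, with a taken as the softmax probability of the positive class; the clipping constant `eps` and the subtraction of the maximum inside `softmax` are numerical safeguards added here and are not part of the patent text.

```python
import math

def softmax(z):
    """Map raw node outputs to probabilities in (0, 1) that sum to 1."""
    shift = max(z)  # subtracting the max leaves the result unchanged but avoids overflow
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(labels, predictions, eps=1e-12):
    """Average of -[y ln a + (1 - y) ln(1 - a)] over all samples."""
    total = 0.0
    for y, a in zip(labels, predictions):
        a = min(max(a, eps), 1.0 - eps)  # keep log() away from 0 (added safeguard)
        total += y * math.log(a) + (1 - y) * math.log(1 - a)
    return -total / len(labels)

# Two samples with two output nodes each; index 1 is treated as the defect class.
logits = [[0.0, 0.0], [0.0, 0.0]]
predicted = [softmax(z)[1] for z in logits]
loss = cross_entropy([1, 0], predicted)
print(predicted, loss)
```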

The Dice loss function is a set similarity measure. It is usually used to calculate the similarity of two samples, has a value range of [0, 1], and is commonly used for semantic segmentation. The Dice loss function is:

$$L_{Dice} = 1 - \frac{2\sum_{i} y_i t_i + \varepsilon}{\sum_{i} y_i + \sum_{i} t_i + \varepsilon}$$

where $y_i$ is the output of the neural network, $t_i$ is the true label value, and $\varepsilon$ is the smoothing coefficient.
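A Dice loss over flattened pixel lists can be sketched as follows. The exact placement of the smoothing coefficient, added to both numerator and denominator, is an assumption that follows a common convention, and the default `eps=1.0` is likewise chosen only for illustration.

```python
def dice_loss(outputs, targets, eps=1.0):
    """1 - (2 * overlap + eps) / (sum of outputs + sum of targets + eps)."""
    overlap = sum(y * t for y, t in zip(outputs, targets))
    denominator = sum(outputs) + sum(targets)
    return 1.0 - (2.0 * overlap + eps) / (denominator + eps)

# A perfect prediction gives zero loss; a completely wrong one gives a large loss.
perfect = dice_loss([1, 0, 1], [1, 0, 1])
disjoint = dice_loss([1, 0], [0, 1])
print(perfect, disjoint)
```

Because the loss depends on overlap rather than on per-pixel accuracy, it stays informative when defect pixels make up only a small fraction of the image.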

S103, inputting the defect image subjected to semantic segmentation into a trained EfficientNet B0 model for recognition, and outputting a recognition result.
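Steps S101 to S103 chain together as a two-stage pipeline, which the following Python sketch outlines. The callables `segment` and `classify` are hypothetical stand-ins for the trained VGG16-U-Net and the trained EfficientNet B0, and the stub models and class names exist only so that the example runs.

```python
def identify_defect(image, segment, classify, class_names):
    """Run segmentation, then classification, and return (label, probability)."""
    segmented = segment(image)            # stands in for the trained VGG16-U-Net
    probabilities = classify(segmented)   # stands in for the trained EfficientNet B0
    best = max(range(len(probabilities)), key=lambda k: probabilities[k])
    return class_names[best], probabilities[best]

# Stub models, for illustration only.
def fake_segment(image):
    return [[255 if px > 128 else 0 for px in row] for row in image]

def fake_classify(segmented):
    white = sum(px == 255 for row in segmented for px in row)
    total = sum(len(row) for row in segmented)
    p_defect = white / total
    return [1.0 - p_defect, p_defect]

label, probability = identify_defect(
    [[200, 10], [220, 240]], fake_segment, fake_classify, ["no defect", "crack"]
)
print(label, probability)
```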

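In an embodiment, the present application further provides a system for rapidly identifying and classifying structural defects based on a lightweight deep learning model, the system comprising:

a semantic segmentation unit, used for performing semantic segmentation processing on the acquired defect image by using a VGG16-U-Net model to remove image background noise interference;

a model training unit, used for constructing an EfficientNet B0 model and training the EfficientNet B0 model to obtain a trained EfficientNet B0 model;

and a defect identification unit, used for inputting the defect image subjected to semantic segmentation processing into the trained EfficientNet B0 model for identification and outputting an identification result.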

In an embodiment, the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method for quickly identifying and classifying a structural defect based on a lightweight deep learning model according to any of the above embodiments is implemented.

In an embodiment, the present application further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the processor is enabled to execute the method for quickly identifying and classifying a structural defect based on a lightweight deep learning model according to any one of the embodiments.

The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the described embodiments. It will be apparent to those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, and the scope of protection is still within the scope of the invention.

Claims

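1. A method for quickly identifying and classifying structural defects based on a lightweight deep learning model is characterized by comprising the following steps:

performing semantic segmentation processing on the acquired defect image by using a VGG16-U-Net model to remove image background noise interference;

constructing an EfficientNet B0 model, and training the EfficientNet B0 model to obtain a trained EfficientNet B0 model;

and inputting the defect image subjected to semantic segmentation processing into the trained EfficientNet B0 model for recognition, and outputting a recognition result.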

2. The method for rapidly identifying and classifying structural defects based on a lightweight deep learning model as claimed in claim 1, wherein the VGG16-U-Net model comprises an encoding layer, a decoding layer and a final convolutional layer.

3. The structural defect rapid identification and classification method based on the lightweight deep learning model as claimed in claim 1, wherein before performing semantic segmentation processing on the acquired defect image by using the VGG16-U-Net model, the method comprises:

completing parameter initialization through transfer learning by using a trained VGG16-U-Net model.

4. The structural defect rapid identification and classification method based on the lightweight deep learning model as claimed in claim 2, wherein the semantic segmentation processing of the acquired defect image by using the VGG16-U-Net model comprises:

the acquired defect image is input into an encoder through an input layer; the encoder extracts image features, which are input into a decoder through skip connections; the deconvolution layer of the decoder upsamples the features and gradually restores them to the original size of the image; and the final convolutional layer predicts a class for each pixel in the image and marks pixels of different classes with different colors, thereby achieving semantic segmentation processing.

5. The structural defect rapid identification and classification method based on the lightweight deep learning model as claimed in claim 4, wherein the training of the EfficientNet B0 model comprises:

dividing the acquired defect image data set into a training set, a validation set and a test set in a ratio of 6:2:2; training the EfficientNet B0 model with the training set to automatically identify cracks; monitoring, with the validation set, the accuracy with which the model identifies cracks during training; and testing the performance of the final model with the test set.

6. The structural defect rapid identification and classification method based on the lightweight deep learning model as claimed in claim 1, wherein the training of the EfficientNetB0 model further comprises:

in the training process, the learning rate is adjusted by using a stochastic gradient descent algorithm with cosine annealing.

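7. The structural defect rapid identification and classification method based on the lightweight deep learning model as claimed in claim 6, wherein the calculation formula of the cosine annealing stochastic gradient descent algorithm is as follows:

$$\eta_t = \eta_{\min}^{i} + \frac{1}{2}\left(\eta_{\max}^{i} - \eta_{\min}^{i}\right)\left(1 + \cos\left(\frac{T_{cur}}{T_i}\pi\right)\right)$$

where $i$ is the index of the restart; $\eta_{\max}^{i}$ and $\eta_{\min}^{i}$ are the maximum value and the minimum value of the learning rate, respectively; $T_{cur}$ is the number of epochs currently executed; and $T_i$ is the total number of epochs in the $i$-th restart.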

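8. A device for quickly identifying and classifying structural defects based on a lightweight deep learning model, characterized by comprising:

a semantic segmentation unit, used for performing semantic segmentation processing on the acquired defect image by using a VGG16-U-Net model to remove image background noise interference;

a model training unit, used for constructing an EfficientNet B0 model and training the EfficientNet B0 model to obtain a trained EfficientNet B0 model;

and a defect identification unit, used for inputting the defect image subjected to semantic segmentation processing into the trained EfficientNet B0 model for identification and outputting an identification result.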

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for rapidly identifying and classifying structural defects based on a lightweight deep learning model according to any one of claims 1 to 7.

10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when executed by a processor, the computer program causes the processor to execute the method for rapidly identifying and classifying structural defects based on a lightweight deep learning model according to any one of claims 1 to 7.