CN116486246A - Intelligent recognition method for bridge underwater image diseases based on convolutional neural network - Google Patents


Info

Publication number
CN116486246A
CN116486246A (application CN202310459591.8A)
Authority
CN
China
Prior art keywords: image, network, target detection, underwater, detection model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310459591.8A
Other languages
Chinese (zh)
Inventor
宋泰毅 (Song Taiyi)
张臣 (Zhang Chen)
刘志洋 (Liu Zhiyang)
王延臣 (Wang Yanchen)
钮晓洋 (Niu Xiaoyang)
黄博 (Huang Bo)
常伟琴 (Chang Weiqin)
王振宇 (Wang Zhenyu)
刘薇云 (Liu Weiyun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Luohe Highway Development Center
CCCC Yuanyang Dalian Bridge Underwater Testing Co., Ltd.
Original Assignee
Luohe Highway Development Center
CCCC Yuanyang Dalian Bridge Underwater Testing Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Luohe Highway Development Center and CCCC Yuanyang Dalian Bridge Underwater Testing Co., Ltd.
Priority: CN202310459591.8A
Publication: CN116486246A
Legal status: Pending

Classifications

    • G06V20/05 — Underwater scenes
    • G06N3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/084 — Backpropagation, e.g. using gradient descent
    • G06V10/764 — Image or video recognition using classification, e.g. of video objects
    • G06V10/774 — Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/776 — Validation; Performance evaluation
    • G06V10/82 — Image or video recognition using neural networks
    • G06V2201/07 — Target detection
    • Y02A40/81 — Aquaculture, e.g. of fish

Abstract

The invention discloses an intelligent recognition method for bridge underwater image diseases based on a convolutional neural network. The method comprises: obtaining an image set of a bridge underwater structure, carrying out image enhancement and expansion on the image set, labeling the image set and dividing it into a training set and a testing set; constructing a YOLOv5 network model, modifying it, taking the modified YOLOv5 network model as a target detection model, and training the target detection model on the training set to obtain a trained target detection model; testing the trained target detection model on the testing set and calculating the detection precision, taking the trained model as the optimal target detection model when the detection precision meets a threshold value and optimizing the model otherwise; and obtaining the recognition result of the underwater image of a bridge to be detected from the optimal target detection model. The method improves detection efficiency and precision, reduces detection cost, and raises the level of automated and intelligent bridge detection.

Description

Intelligent recognition method for bridge underwater image diseases based on convolutional neural network
Technical Field
The invention relates to the technical field of bridge underwater detection, in particular to an intelligent recognition method for bridge underwater image diseases based on a convolutional neural network.
Background
The pier foundation is the main load-bearing structure of a bridge, and its working state directly affects the bearing capacity of the bridge. Pier foundations mostly stand in water, remain in complex hydrogeological conditions for long periods, and are eroded by running water, so their diseases become more numerous and more severe. These disease locations are underwater and difficult to find during routine inspection, which poses a great potential safety hazard for bridge structures. Therefore, comprehensively and systematically understanding and mastering the condition of bridge underwater components provides scientific and reasonable technical data and a decision basis for bridge maintenance and repair, thereby ensuring the safety of the bridge structure; bridge underwater structure detection is thus very important.
At present, bridge underwater detection cases are few and technical schemes are insufficiently prepared; traditional underwater detection is mostly limited to inland bridges, and the traditional mode is to have divers explore underwater. Diver exploration is generally performed only where the flow velocity is less than 0.5 m/s; as the water depth increases, the river becomes relatively turbulent and the underwater conditions complex, so safety control carries certain hidden dangers, the detection result is strongly influenced by the diver's subjectivity, and, the detection means being traditional, the detection efficiency is low.
Disclosure of Invention
The invention provides an intelligent recognition method for bridge underwater image diseases based on a convolutional neural network, aiming to overcome the above technical problems.
An intelligent recognition method for bridge underwater image diseases based on a convolutional neural network comprises,
step one, acquiring an image set of a bridge underwater structure and manually dividing the image set into two classes, namely a normal state and a disease state,
step two, carrying out image enhancement on the disease-state images in the image set, the image enhancement serving to improve the definition of the disease images,
step three, expanding the enhanced image set with a deep convolutional generative adversarial network, labeling the expanded image set, and dividing it into a training set and a testing set,
step four, constructing a YOLOv5 network model, wherein the YOLOv5 network model comprises an input end, a Backbone, a Neck and a Head; modifying the YOLOv5 network model and taking the modified model as the target detection model, the modification comprising replacing the C3 modules of the Backbone with GhostBottleneck networks, the GhostBottleneck network being formed by fusing the C3 module with a GhostNet network, the GhostNet network comprising Conv convolution and a redundant-feature generator; replacing the last Conv convolution of the Backbone with a SENet module; and replacing all Conv convolutions in the YOLOv5 network model except the last Conv convolution of the Backbone with GhostConv convolutions; training the target detection model on the training set to obtain the trained target detection model,
step five, testing the test set with the trained target detection model and calculating the detection precision; when the detection precision meets a threshold value, taking the trained target detection model as the optimal target detection model; when the detection precision does not meet the threshold value, fine-tuning the trained target detection model by a transfer learning method to obtain a fine-tuned target detection model, and optimizing the fine-tuned target detection model by stochastic gradient descent, the optimization comprising frozen iterative training and thawed iterative training, to obtain an optimized target detection model; testing the test set with the optimized target detection model and calculating the detection precision until the detection precision of the optimized target detection model meets the threshold value,
and step six, acquiring an underwater image of the bridge to be detected, and identifying the underwater image of the bridge to be detected according to the optimal target detection model to acquire an identification result.
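The six steps above can be sketched as a minimal control-flow skeleton. Everything below is a hypothetical stand-in (the enhancement, expansion, training and evaluation routines are stubs, not the patent's implementation); it only shows how the retrain-until-threshold loop of step five ties the stages together:

```python
def enhance(img):
    # Step two stand-in: MSRCR + CLAHE enhancement (identity here).
    return img

def augment(images):
    # Step three stand-in: DCGAN expansion, here simply doubling the set.
    return images + [enhance(i) for i in images]

def split(images, ratio=0.8):
    k = int(len(images) * ratio)
    return images[:k], images[k:]

def train(model_state, train_set):
    # Step four stand-in: each training round nudges accuracy upward.
    return {"accuracy": min(1.0, model_state["accuracy"] + 0.25)}

def evaluate(model_state, test_set):
    # Step five stand-in: detection precision on the test set.
    return model_state["accuracy"]

def run_pipeline(images, threshold=0.9):
    images = augment([enhance(i) for i in images])
    train_set, test_set = split(images)
    model = {"accuracy": 0.5}
    while True:  # step five: retrain until precision meets the threshold
        model = train(model, train_set)
        if evaluate(model, test_set) >= threshold:
            return model
```

With the stub numbers above, the loop terminates after two training rounds once the stand-in accuracy crosses the threshold.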
Preferably, generating the expanded image set of the bridge underwater structure with the deep convolutional generative adversarial network comprises,
S11, acquiring the sub-image set in the disease state from the image set, learning the real crack characteristics in the sub-image set with the generation network of the deep convolutional generative adversarial network to acquire the crack characteristics, and generating a virtual underwater crack image set from the crack characteristics and noise,
S12, inputting the sub-image set and the virtual underwater crack image set into the discrimination network of the generative adversarial network for discrimination, obtaining a discrimination result,
S13, when the discrimination network judges a virtual underwater crack image as false, updating the parameters of the generation network according to the update rules of the parameters in the generation network; when the discrimination network judges a virtual underwater crack image as true, updating the parameters of the discrimination network according to the update rules of the parameters in the discrimination network,
S14, repeatedly executing S12 and S13 until the discrimination network can no longer judge the authenticity of the virtual underwater crack images, obtaining the number of images in the sub-image set, and generating the same number of virtual underwater images with the generation network.
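The alternating update rule of S12 to S14 can be illustrated with a deliberately simplified, deterministic skeleton: the two "networks" are stubs holding a single skill score (an assumption for illustration only), and training stops once the discriminator can no longer separate real from virtual images:

```python
class Generator:
    """Stub generator: 'skill' stands in for how realistic its fakes are."""
    def __init__(self): self.skill = 0.0
    def update(self): self.skill += 0.10   # S13: improve G after being caught

class Discriminator:
    """Stub discriminator: flags a fake only while it is ahead of G."""
    def __init__(self): self.skill = 0.25
    def detects_fake(self, gen): return self.skill > gen.skill
    def update(self): self.skill += 0.05   # S13: improve D after being fooled

def adversarial_training(max_rounds=100, tol=0.02):
    g, d = Generator(), Discriminator()
    for _ in range(max_rounds):
        if abs(d.skill - g.skill) <= tol:  # S14: D can no longer tell
            break
        if d.detects_fake(g):              # fake judged false -> update G
            g.update()
        else:                              # fake judged true -> update D
            d.update()
    return g, d
```

The loop mirrors the game-like process: whichever network loses the round is the one that updates, until the two skills converge.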
Preferably, the image enhancement of the disease-state images comprises enhancing the disease-state images with the MSRCR algorithm and the CLAHE algorithm.
Preferably, the GhostConv convolution comprises halving the number of feature channels of the input feature map with a 1×1 convolution kernel with a stride of 1 to obtain a first feature map, performing a depthwise convolution on the first feature map with a 5×5 convolution kernel to obtain a second feature map, and splicing the first feature map and the second feature map into a third feature map.
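A rough parameter-count comparison shows why this construction is cheap. The sketch below is an assumption-laden back-of-the-envelope calculation (bias terms ignored, output channels taken equal to input channels as in the clause above):

```python
def conv_params(c_in, c_out, k):
    # Ordinary k x k convolution: every output channel sees every input channel.
    return c_out * c_in * k * k

def ghostconv_params(c_in, c_out, k_cheap=5):
    half = c_out // 2
    pointwise = half * c_in          # 1x1 convolution producing half the channels
    depthwise = half * k_cheap ** 2  # 5x5 depthwise convolution on that half
    return pointwise + depthwise     # concatenation itself adds no parameters
```

For c_in = c_out = 64 this gives 2,848 parameters for the GhostConv against 36,864 for an ordinary 3×3 convolution.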
Preferably, the GhostBottleneck network comprises halving the number of feature channels of the input feature map with the 1st GhostConv convolution to obtain a fourth feature map, restoring the number of feature channels of the fourth feature map with the 2nd GhostConv convolution to obtain a fifth feature map, performing a 3×3 depthwise convolution on the fifth feature map to obtain a sixth feature map, and fusing the fourth feature map, the fifth feature map and the sixth feature map into a seventh feature map.
The invention provides an intelligent recognition method for bridge underwater image diseases based on a convolutional neural network. An inspector observes and detects the bridge underwater structure from the water surface by controlling a camera; the video data is transmitted to the target detection model through a cable, and the target detection model automatically recognizes the disease type and frames the position of the disease. Compared with the traditional detection mode, the detection result is less influenced by human subjectivity and is more accurate, the detection efficiency is improved, the personnel cost is reduced, the safety of underwater inspection personnel is ensured, and the level of automated and intelligent bridge detection is raised.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the Retinex algorithm of the present invention;
FIG. 3 is a flow chart of the Retinex algorithm of the present invention;
FIG. 4 is a graph showing the contrast of the image enhancement effect of the present invention;
FIG. 5 is a schematic diagram of a DCGAN network of the present invention;
FIG. 6 is a flow chart of the DCGAN network training of the present invention;
FIG. 7 is a disease map generated based on a DCGAN network in accordance with the present invention;
FIG. 8 is a diagram of the construction of the YOLOv5 model of the present invention;
FIG. 9 is a network schematic diagram of Conv convolution and Ghost of the present invention;
FIG. 10 is a diagram of the network structure of the GhostConv of the present invention;
FIG. 11 is a diagram of the improved YOLOv5 model of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
FIG. 1 is a flowchart of the method of the present invention, as shown in FIG. 1, the method of the present embodiment may include:
an intelligent recognition method for bridge underwater image diseases based on a convolutional neural network comprises,
step one, acquiring an image set of a bridge underwater structure and manually dividing the image set into two classes, namely a normal state and a disease state,
step two, carrying out image enhancement on the disease-state images in the image set, the image enhancement serving to improve the definition of the disease images,
step three, expanding the enhanced image set with a deep convolutional generative adversarial network, labeling the expanded image set, and dividing it into a training set and a testing set,
step four, constructing a YOLOv5 network model, wherein the YOLOv5 network model comprises an input end, a Backbone, a Neck and a Head; modifying the YOLOv5 network model and taking the modified model as the target detection model, the modification comprising replacing the C3 modules of the Backbone with GhostBottleneck networks, the GhostBottleneck network being formed by fusing the C3 module with a GhostNet network, the GhostNet network comprising Conv convolution and a redundant-feature generator; replacing the last Conv convolution of the Backbone with a SENet module; and replacing all Conv convolutions in the YOLOv5 network model except the last Conv convolution of the Backbone with GhostConv convolutions; training the target detection model on the training set to obtain the trained target detection model,
step five, testing the test set with the trained target detection model and calculating the detection precision; when the detection precision meets a threshold value, taking the trained target detection model as the optimal target detection model; when the detection precision does not meet the threshold value, fine-tuning the trained target detection model by a transfer learning method to obtain a fine-tuned target detection model, and optimizing the fine-tuned target detection model by stochastic gradient descent, the optimization comprising frozen iterative training and thawed iterative training, to obtain an optimized target detection model; testing the test set with the optimized target detection model and calculating the detection precision until the detection precision of the optimized target detection model meets the threshold value,
the fine tuning refers to performing iterative optimization calculation on the Loss function of the trained target detection model with the aim of fine tuning by taking the value of the minimized Loss function Loss as a fine tuning, and the fine tuning (fine-tune) is a method of transfer learning: the training of the new task model continues on the basis of an already trained model. The fine tuning can accelerate the training speed of the convolutional neural network, and the realization effect is very good when the number of the data sets is not very large.
And step six, acquiring an underwater image of the bridge to be detected, and identifying the underwater image of the bridge to be detected according to the optimal target detection model to acquire an identification result.
Based on the above scheme, with the intelligent recognition method for bridge underwater image diseases based on a convolutional neural network provided by the invention, an inspector observes and detects the bridge underwater structure from the water surface by controlling a camera; the video data is transmitted to the target detection model through a cable, and the target detection model automatically recognizes the disease types and frames the positions of the diseases. Inspectors can therefore detect the bridge underwater structure from a survey vessel by operating the detection equipment on board, which avoids personnel diving in person and makes the detection work safer and more efficient. The underwater crack images of the bridge are enhanced with the MSRCR algorithm and the CLAHE algorithm, so that the target detection model learns the detailed characteristics of the bridge disease images better, improving the detection precision of the model. Compared with traditional image augmentation, the quality of the augmented data set is better, which improves the accuracy of the bridge underwater disease target detection model. Combining and recombining the GhostNet network with the YOLOv5 network builds a lightweight target detection network; compared with other network models, the number of network parameters is reduced and the detection speed is improved, so the network can be deployed on equipment with limited computing resources to detect bridge underwater diseases, saving economic cost.
In step one, acquiring the image set of the bridge underwater structure means that an inspector controls a camera from the water surface and photographs the bridge underwater structure; the image set is manually divided into two classes, a normal state and a disease state, where the normal state is the normal condition of the bridge and the disease state covers crack diseases of the bridge. To facilitate later image processing, all pictures are unified in size and format: ACDSee software is used to set all pictures to 416×416 pixels in jpg format.
the data set has high sample quality and clear imaging, the more the image data characteristic information which can be learned by the neural network is, the better the training effect of the target detection model is, the better the generalization capability of the model is, and the bridge underwater diseases can be identified more accurately. In order to solve the problem that the underwater disease image is not clear, the second step is implemented to carry out image enhancement on the disease state image, wherein the image enhancement on the disease state image comprises image enhancement on the disease state image according to an MSRCR algorithm and a CLAHE algorithm.
The Retinex algorithm is often used for underwater image enhancement. The theory holds that the color of an object as observed by the human eye is determined by the light-reflecting ability of the object's surface, not by the absolute intensity of the reflected light, so the observed color is not changed by uneven illumination; color therefore has constancy. As shown in fig. 2, light emitted by the sun irradiates the surface of an object, the light reflected by the surface is received by the image acquisition device, and an image of the object is finally obtained; the expression is as follows:
S(x,y) = R(x,y) · L(x,y)    (1)
wherein S represents the acquired original image, R represents the reflectance image, L represents the incident-light image, and (x,y) represents a pixel in the image acquired by the device. R characterizes the object's ability to reflect light, while L determines the range of pixel values in the acquired image. The core idea of Retinex is to remove L from the acquired image S to obtain the reflectance image R, thereby removing the influence of uneven illumination intensity and finally achieving the effect of enhancing the underwater image. The flow chart of the algorithm is shown in fig. 3.
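The decomposition S = R·L can be demonstrated on a one-dimensional "scanline" of pixel values. The sketch below is an assumption-level toy (single-scale Retinex, not the MSRCR implementation): it estimates the illumination L by locally averaging S and subtracts log L from log S:

```python
import math

def retinex_1d(scanline, win=5):
    """Single-scale Retinex on a 1-D list of positive pixel values:
    log R = log S - log L, with L estimated by a moving-average blur of S."""
    n = len(scanline)
    log_s = [math.log(v) for v in scanline]
    log_r = []
    for i in range(n):
        lo, hi = max(0, i - win), min(n, i + win + 1)
        blur = sum(scanline[lo:hi]) / (hi - lo)  # local illumination estimate
        log_r.append(log_s[i] - math.log(blur))
    return log_r
```

For a purely linear illumination ramp with constant reflectance, the interior of the output is zero: the illumination component has been removed, which is exactly the "remove L from S" idea above.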
MSRCR is the third-generation algorithm of the Retinex series. Its improvement over the previous generation is the addition of a color recovery factor, which effectively avoids the image color distortion caused by increasing image contrast; however, the algorithm still has a shortcoming in that the contrast of the enhanced image is low. To remedy this defect of the MSRCR algorithm, the invention embeds the CLAHE algorithm on top of it.
The CLAHE algorithm is a contrast-limited adaptive histogram equalization method; it spreads the gray values within a region of the image over the whole image range so as to improve the contrast of the image.
The implementation flow of the CLAHE algorithm is as follows:
(1) Partition the input image evenly into gray-scale regions;
(2) Compute the histogram of each region and set a clipping threshold;
(3) Clip the histogram bins that exceed the threshold and redistribute the excess evenly over the whole underwater image;
(4) Perform the gray-level equalization operation on each region.
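The clipping-and-redistribution idea of steps (2) and (3) can be sketched in isolation. The function below is a global clip-limited equalization, a simplification: real CLAHE applies this per region and interpolates between regions, and any excess left over after integer redistribution is simply dropped here:

```python
def clipped_equalize(pixels, levels=256, clip=2.0):
    """Histogram equalization with a clip limit (global variant of CLAHE)."""
    n = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    limit = max(1, int(clip * n / levels))
    excess = 0
    for i in range(levels):            # step (2)+(3): clip, collect the excess
        if hist[i] > limit:
            excess += hist[i] - limit
            hist[i] = limit
    bonus = excess // levels           # redistribute excess evenly over all bins
    hist = [h + bonus for h in hist]
    cdf, total = [], 0
    for h in hist:                     # step (4): equalize via the clipped CDF
        total += h
        cdf.append(total)
    scale = (levels - 1) / cdf[-1]
    return [round(cdf[p] * scale) for p in pixels]
```

Clipping bounds how much any single gray level can dominate the mapping, which is what limits the contrast amplification (and hence noise) relative to plain histogram equalization.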
The enhancement effect of the algorithm provided by the invention on underwater bridge cracks is obvious, as shown for example in fig. 4, which includes the original pictures, the pictures obtained with the MSRCR algorithm, and the pictures after the improved algorithm.
To improve the prediction precision of the later network training, a large number of photographs of bridge underwater diseases are needed, so the image set of the bridge underwater structure is expanded with a deep convolutional generative adversarial network (DCGAN). The DCGAN is an unsupervised learning technique consisting mainly of two parts: a generation network and a discrimination network. The generation network learns the characteristics of real disease samples and converts one-dimensional input data into a crack image through fractionally-strided convolutions; a schematic diagram is shown in fig. 5. The discrimination network is a conventional convolutional neural network that identifies whether an input image is an underwater crack image.
Generating the expanded image set of the bridge underwater structure with the deep convolutional generative adversarial network comprises,
S11, acquiring the sub-image set in the disease state from the image set, learning the real underwater crack characteristics in the sub-image set with the generation network to acquire the characteristics of the underwater cracks, and generating a virtual underwater crack image set from the characteristics and noise,
S12, inputting the sub-image set and the virtual underwater crack image set into the discrimination network for discrimination, obtaining a discrimination result,
S13, when the discrimination network judges a virtual underwater crack image as false, updating the parameters of the generation network according to the update rules of the parameters in the generation network; when the discrimination network judges a virtual underwater crack image as true, updating the parameters of the discrimination network according to the update rules of the parameters in the discrimination network,
S14, repeatedly executing S12 and S13 until the discrimination network can no longer determine the authenticity of the virtual underwater crack images, acquiring the number of images in the sub-image set, and generating the same number of virtual underwater images with the generation network. The expanded image set is then labeled and divided into a training set and a testing set.
the DCGAN network training process diagram is shown in fig. 6, and the generating network and the discriminating network train continuously, similar to the game process. And importing the bridge underwater image data set into a DCGAN network, and learning and training the input real disease data set by the DCGAN network to generate more false and spurious bridge underwater disease images, so that the purpose of expanding the data set is finally achieved. Setting 500 iterations of network training, and generating disease pictures by the DCGAN network according to different iteration times as shown in figure 7.
Labeling the expanded image set and dividing it into a training set and a testing set specifically comprises combining the enhanced and expanded high-quality data samples with the original data samples to obtain a disease database, and dividing the data set into a training set, a validation set and a testing set in the ratio 8:1:1. LabelImg software is used to label each bridge underwater disease image (that is, the disease type and position in the picture are marked with a rectangular frame) and generate a label file; each file corresponds to the information of one underwater crack picture, including the name and storage location of the picture, the position of the crack in the image, the labeling frame, and so on.
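The 8:1:1 split described above can be sketched as follows (the seed and the in-memory list of items are illustrative assumptions; the patent's actual tooling is ACDSee/LabelImg based):

```python
import random

def split_dataset(items, ratios=(8, 1, 1), seed=42):
    """Shuffle and split a labelled image list into train/val/test at 8:1:1."""
    rng = random.Random(seed)     # fixed seed makes the split reproducible
    items = items[:]
    rng.shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test
```

Shuffling before splitting matters here because enhanced, DCGAN-generated and original samples are merged into one database and should be spread across all three subsets.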
In step four, a YOLOv5 network model is constructed, the YOLOv5 network model comprising an input end, a Backbone, a Neck and a Head.
YOLOv5 is a version of the YOLO target detection algorithm that performs relatively well in all respects, such as detection accuracy and detection speed. The model network structure consists of four major parts: the input end, the Backbone, the Neck and the Head. At the input end, to enrich the data set and improve the generalization ability and robustness of the network, Mosaic image enhancement is applied: 4 pictures are randomly scaled and adjusted in color space and then spliced into an enhanced sample. The Backbone is composed of C3 modules, Conv modules and an SPPF module. Borrowing the design of CSPNet, the C3 module divides the input feature map into two parts, processes each in its own stage, and then merges them to realize richer gradient combinations, reducing the computational cost while maintaining accuracy; whether a residual network is present is controlled by the true/false value of shortcut. The SPPF module obtains a feature map with twice the number of channels through 3 progressive pooling operations, greatly improving the feature extraction ability of the model and helping to detect target objects of different sizes in the image. The Neck layer adopts an FPN+PAN structure to enhance the whole pyramid feature map; PAN is based on Mask R-CNN and the FPN framework, and the feature extractor of the network adopts a new FPN structure with an enhanced bottom-up path, transferring salient low-level feature information to the upper layers and retaining richer feature information. The Head layer predicts the features of the target and applies anchor boxes on the target feature map to generate the final output vectors with class probabilities and target boxes. The model structure is shown in fig. 8.
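The Mosaic step at the input end can be reduced to its stitching core. The sketch below joins four equal-sized tiles into a 2×2 grid over nested lists; the random scaling and color-space adjustment that precede the splice in YOLOv5 are omitted:

```python
def mosaic(tiles):
    """Stitch four equally-sized 2-D image tiles into one 2x2 mosaic.
    tiles = [top_left, top_right, bottom_left, bottom_right], each a
    list of rows; a simplified version of the Mosaic augmentation."""
    tl, tr, bl, br = tiles
    top = [a + b for a, b in zip(tl, tr)]        # concatenate rows side by side
    bottom = [a + b for a, b in zip(bl, br)]
    return top + bottom                          # stack the two halves
```

The point of the augmentation is that a single training sample then contains objects from four images at once, at varied positions and scales.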
The YOLOv5 network model has a large number of convolution layers. Excessive convolution layers increase the computational load and the number of network parameters, so powerful GPU resources are required for training and prediction, making deployment on embedded equipment difficult. Classical lightweight networks such as MobileNet and ShuffleNet replace traditional convolution with depthwise separable convolution and introduce channel-shuffling operations to reduce computation, but they do not fundamentally solve the problem of redundant target features.
The GhostNet convolution module replaces the traditional convolution layer with a combination of traditional convolution and a lightweight redundant-feature generator, so that the network parameters and computation are relatively smaller and the module is easier to deploy to a terminal. Compared with traditional convolution, the main part of the Ghost convolution module is divided into two parts: first, a set of feature maps with fewer channels is obtained by normal convolution; then more feature maps are obtained from these by simple linear operations; finally, the two groups of feature maps are spliced into the new output. The principles of Conv convolution and the Ghost network are shown in fig. 9. The ratio of Conv convolution to Ghost convolution computation can be expressed by formula (2):
X ∈ R^(h×w×c) denotes the input feature map with height h, width w and c channels; Y ∈ R^(h′×w′×n) denotes the n output feature maps of height h′ and width w′; Y′ ∈ R^(h′×w′×m) denotes the m intrinsic feature maps of size h′ × w′ output by the primary Conv step of the Ghost module, with n = m·s, where each intrinsic feature map yields s feature maps through a series of inexpensive linear operations, and s ≪ c.
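Formula (2) itself is not reproduced in this text. Under the standard GhostNet formulation with the symbols defined above (writing k for the primary kernel size and d for the cheap-operation kernel size), the speed-up ratio is r_s = (n·h′·w′·c·k²) / ((n/s)·h′·w′·c·k² + (s−1)·(n/s)·h′·w′·d²) ≈ s·c/(s + c − 1) ≈ s. A minimal numeric check of this ratio (the values are illustrative, not from the patent):

```python
def standard_conv_flops(n, hp, wp, c, k):
    """Multiply-accumulates of a plain k x k convolution producing
    n feature maps of size hp x wp from c input channels."""
    return n * hp * wp * c * k * k

def ghost_flops(n, hp, wp, c, k, d, s):
    """Ghost module: a primary conv makes n/s intrinsic maps, then
    (s - 1) cheap d x d linear ops per intrinsic map make the rest."""
    m = n // s
    return m * hp * wp * c * k * k + (s - 1) * m * hp * wp * d * d

n, hp, wp, c, k, d, s = 128, 40, 40, 128, 3, 5, 2
ratio = standard_conv_flops(n, hp, wp, c, k) / ghost_flops(n, hp, wp, c, k, d, s)
# The ratio approaches s (here s = 2) because s << c.
```

With s = 2 the Ghost module roughly halves the computation, which matches the "cheap linear operations" intuition in the text.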
The YOLOv5 network model is modified, and the modified model is denoted the target detection model. The modification comprises: replacing the C3 module of the backbone with a GhostBottleneck network, the GhostBottleneck network being formed by fusing the C3 module with the GhostNet network, which comprises Conv convolution and a redundant-feature generator; replacing the Conv convolution of the last layer of the backbone with a SENet module; and replacing all remaining Conv convolutions of the YOLOv5 network with GhostConv convolutions. In the GhostConv convolution, a 1 × 1 convolution kernel with a step length of 1 halves the number of feature channels of the input feature map to obtain a first feature map; a 5 × 5 depthwise convolution of the first feature map gives a second feature map; and the first and second feature maps are spliced into a third feature map. In the GhostBottleneck network, the 1st GhostConv convolution halves the number of input feature channels to obtain a fourth feature map; the 2nd GhostConv convolution restores the channel count of the fourth feature map to obtain a fifth feature map; a 3 × 3 depthwise convolution of the fifth feature map gives a sixth feature map; and the fourth, fifth and sixth feature maps are fused into a seventh feature map,
The GhostNet network is integrated into the C3 module to form a new C3Ghost module, which compresses the model, reduces computation and improves the operation speed. The model structure is shown in fig. 10. The first Conv of GhostConv adopts a 1 × 1 convolution kernel with a step length of 1 to halve the number of input feature channels; the resulting feature map then undergoes a depthwise convolution with a 5 × 5 convolution kernel, and finally the two are spliced. The improved GhostBottleneck halves the number of input feature channels through the 1st GhostConv; the 2nd GhostConv then restores the channel count; finally the result is fused by addition with the feature subjected to a 3 × 3 depthwise convolution. Replacing the Bottleneck in the original C3 module with this GhostBottleneck yields the C3Ghost module.
GhostConv convolution layers are also used to replace the Conv convolutions in YOLOv5, further reducing the number of network parameters and improving the detection speed. The modified model structure is shown in fig. 11.
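A hedged PyTorch sketch of the GhostConv layer described above (the layer names, batch normalization and the Mish activation placement are assumptions modelled on common Ghost-module implementations, not the patent's exact code):

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Sketch of a Ghost convolution: a primary conv produces half of the
    output channels; a cheap 5x5 depthwise conv generates the other half,
    and the two halves are concatenated along the channel dimension."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.Mish())
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.Mish())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```

A GhostBottleneck then chains two such layers (halve, then restore the channel count) and fuses the result with a 3 × 3 depthwise branch, as the text describes.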
The Mish activation function replaces the Swish activation function of the original network. Mish is a non-monotonic neural activation function; compared with Swish, its smoothness allows the neural network to extract more effective feature information, and its non-saturating behavior eliminates extreme gradient cases, improving the generalization capability of the network and yielding better accuracy.
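A minimal pure-Python sketch of the Mish function, x · tanh(softplus(x)) (the numerically guarded softplus is an implementation convenience, not from the patent):

```python
import math

def softplus(x):
    # Guard against overflow of exp(x) for large x, where softplus(x) ~= x.
    return x if x > 20 else math.log1p(math.exp(x))

def mish(x):
    """Mish activation: smooth and non-monotonic, unbounded above and
    bounded below, so positive activations do not saturate."""
    return x * math.tanh(softplus(x))
```

For large positive inputs mish(x) approaches x (no saturation), while large negative inputs are squashed toward zero from below.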
A lightweight attention mechanism, the SENet module, is introduced into the backbone layer of YOLOv5. A conventional network model treats every channel of a feature layer as equally important, yet some channels actually carry little information, and computing all channels with equal weight increases the computational load of the network. The SE architecture spends the limited resources on the important channels, improving network performance. The SENet module mainly comprises two parts, compression (squeeze) and excitation: through these two steps the importance of each feature channel of an image is obtained, and a weight is assigned to each channel accordingly, so that the network model can focus on specific channels, emphasizing useful feature channels and suppressing useless ones, maximizing network performance.
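The squeeze-and-excitation steps can be sketched in NumPy as follows (the two weight matrices w1 and w2 stand in for the learned fully connected layers of a real SE block; their shapes and the reduction ratio are illustrative assumptions):

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-Excitation over a feature map x of shape (C, H, W).

    w1 has shape (C//r, C) and reduces; w2 has shape (C, C//r) and
    expands. In a trained SE layer these weights are learned."""
    z = x.mean(axis=(1, 2))                  # squeeze: global average pool -> (C,)
    a = np.maximum(w1 @ z, 0.0)              # excitation: FC + ReLU
    a = 1.0 / (1.0 + np.exp(-(w2 @ a)))      # FC + sigmoid -> per-channel weights
    return x * a[:, None, None]              # reweight each channel of x
```

Channels whose learned weight approaches 0 are suppressed, while channels near 1 pass through almost unchanged, which is the "important attention" behavior described above.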
Training the target detection model according to the training set to obtain a trained target detection model,
step five, testing the test set with the trained target detection model and calculating the detection precision; when the detection precision meets a threshold value, taking the trained target detection model as the optimal target detection model; when the detection precision does not meet the threshold value, fine-tuning the trained target detection model according to a transfer learning method to obtain a fine-tuned target detection model, optimizing the fine-tuned target detection model according to a stochastic gradient descent method, wherein the optimization comprises frozen iterative training and unfrozen iterative training, obtaining an optimized target detection model, and testing the test set with the optimized target detection model and calculating the detection precision until the detection precision of the optimized target detection model meets the threshold value,
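A hedged PyTorch sketch of the two-phase (frozen, then unfrozen) fine-tuning with stochastic gradient descent; the "backbone" and "head" modules and the learning rates are stand-ins, not the patent's actual configuration:

```python
import torch
import torch.nn as nn

# Stand-in modules; a real run would load the pretrained detector here.
backbone = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.Mish())
head = nn.Conv2d(8, 2, 1)
model = nn.Sequential(backbone, head)

def set_requires_grad(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

# Phase 1: freeze the pretrained backbone and train only the head with SGD.
set_requires_grad(backbone, False)
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-2, momentum=0.9)
# ... run the frozen iterative training epochs here ...

# Phase 2: unfreeze everything, rebuild the optimizer over all parameters,
# and continue at a lower learning rate.
set_requires_grad(backbone, True)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
# ... run the unfrozen iterative training epochs here ...
```

Rebuilding the optimizer after unfreezing ensures the newly trainable backbone parameters are actually registered with SGD.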
and step six, acquiring an underwater image of the bridge to be detected, and identifying the underwater image of the bridge to be detected according to the optimal target detection model to acquire an identification result.
Overall beneficial effects:
the invention provides an intelligent recognition method for bridge underwater image diseases based on a convolutional neural network. An inspector on the water surface observes and inspects the bridge underwater structure by controlling a camera; the video data are transmitted to the target detection model through a cable, and the model automatically recognizes the disease type and marks the position of the disease. Compared with the traditional detection mode, the detection result is less affected by human subjectivity and more accurate, the detection efficiency is improved, the cost of inspection personnel is reduced, the safety of underwater inspectors is ensured, and the automated, intelligent detection level of bridges is improved.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (5)

1. An intelligent recognition method for bridge underwater image diseases based on a convolutional neural network, characterized by comprising the following steps,
step one, acquiring an image set of the bridge underwater structure and manually dividing the image set into two types, a normal state and a disease state,
step two, carrying out image enhancement on the images in the disease state in the image set, wherein the image enhancement improves the definition of the disease images,
step three, expanding the enhanced image set according to a deep convolutional generative adversarial network, labeling the expanded image set, and dividing the expanded image set into a training set and a testing set,
step four, constructing a YOLOv5 network model, wherein the YOLOv5 network model comprises an input end, a backbone, a neck and a head; modifying the YOLOv5 network model and denoting the modified model the target detection model, wherein the modification comprises replacing the C3 module of the backbone with a GhostBottleneck network, the GhostBottleneck network being formed by fusing the C3 module with a GhostNet network, the GhostNet network comprising Conv convolution and a redundant-feature generator, replacing the last Conv convolution of the backbone with a SENet module, and replacing all Conv convolutions in the YOLOv5 network model except the last Conv convolution of the backbone with GhostConv convolutions; and training the target detection model on the training set to obtain a trained target detection model,
step five, testing the test set with the trained target detection model and calculating the detection precision; when the detection precision meets a threshold value, taking the trained target detection model as the optimal target detection model; when the detection precision does not meet the threshold value, fine-tuning the trained target detection model according to a transfer learning method to obtain a fine-tuned target detection model, optimizing the fine-tuned target detection model according to a stochastic gradient descent method, wherein the optimization comprises frozen iterative training and unfrozen iterative training, obtaining an optimized target detection model, and testing the test set with the optimized target detection model and calculating the detection precision until the detection precision of the optimized target detection model meets the threshold value,
and step six, acquiring an underwater image of the bridge to be detected, and identifying the underwater image of the bridge to be detected according to the optimal target detection model to acquire an identification result.
2. The intelligent recognition method for bridge underwater image diseases based on the convolutional neural network according to claim 1, wherein expanding the image set of the bridge underwater structure according to the deep convolutional generative adversarial network comprises,
s11, acquiring a sub-image set in the disease state from the image set; learning the real crack characteristics in the sub-image set with the generation network of the deep convolutional generative adversarial network to obtain crack characteristics; and generating a virtual underwater crack image set from the crack characteristics and noise,
s12, inputting the sub-image set and the virtual underwater crack image set into the discrimination network of the generative adversarial network for discrimination to obtain a discrimination result,
s13, when the discrimination network judges a virtual underwater crack image to be false, updating the parameters of the generation network according to the update rules for the parameters of the generation network; when the discrimination network judges a virtual underwater crack image to be true, updating the parameters of the discrimination network according to the update rules for the parameters of the discrimination network,
s14, repeating s12 and s13 until the discrimination network cannot judge the authenticity of the virtual underwater crack images; obtaining the number of images in the sub-image set, and generating the same number of virtual underwater images with the generation network.
3. The intelligent recognition method for bridge underwater image diseases based on the convolutional neural network according to claim 1, wherein the image enhancement of the images in the disease state comprises enhancing the images in the disease state according to an MSRCR algorithm and a CLAHE algorithm.
4. The intelligent recognition method for bridge underwater image diseases based on the convolutional neural network, characterized in that the GhostConv convolution comprises halving the number of feature channels of the input feature map through a 1 × 1 convolution kernel with a step length of 1 to obtain a first feature map, then carrying out depthwise convolution on the first feature map with a 5 × 5 convolution kernel with a step length of 1 to obtain a second feature map, and splicing the first feature map and the second feature map into a third feature map.
5. The intelligent recognition method for bridge underwater image diseases based on the convolutional neural network, characterized in that the GhostBottleneck network comprises halving the number of feature channels of the input feature map through the 1st GhostConv convolution to obtain a fourth feature map, restoring the number of feature channels of the fourth feature map to the original channel count through the 2nd GhostConv convolution to obtain a fifth feature map, carrying out a 3 × 3 depthwise convolution on the fifth feature map to obtain a sixth feature map, and fusing the fourth feature map, the fifth feature map and the sixth feature map into a seventh feature map.
CN202310459591.8A 2023-04-26 2023-04-26 Intelligent recognition method for bridge underwater image diseases based on convolutional neural network Pending CN116486246A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310459591.8A CN116486246A (en) 2023-04-26 2023-04-26 Intelligent recognition method for bridge underwater image diseases based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310459591.8A CN116486246A (en) 2023-04-26 2023-04-26 Intelligent recognition method for bridge underwater image diseases based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN116486246A true CN116486246A (en) 2023-07-25

Family

ID=87221000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310459591.8A Pending CN116486246A (en) 2023-04-26 2023-04-26 Intelligent recognition method for bridge underwater image diseases based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN116486246A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117351320A (en) * 2023-08-25 2024-01-05 中铁大桥局集团第二工程有限公司 Bolt step-by-step detection method based on deep learning
CN117422696A (en) * 2023-11-08 2024-01-19 河北工程大学 Belt wear state detection method based on improved YOLOv8-Efficient Net


Similar Documents

Publication Publication Date Title
CN111292264B (en) Image high dynamic range reconstruction method based on deep learning
CN116486246A (en) Intelligent recognition method for bridge underwater image diseases based on convolutional neural network
CN110210608A (en) The enhancement method of low-illumination image merged based on attention mechanism and multi-level features
CN109671071A (en) A kind of underground piping defect location and grade determination method based on deep learning
CN109840483A (en) A kind of method and device of landslide fissure detection and identification
CN111179196B (en) Multi-resolution depth network image highlight removing method based on divide-and-conquer
CN110826411B (en) Vehicle target rapid identification method based on unmanned aerial vehicle image
CN112614136A (en) Infrared small target real-time instance segmentation method and device
CN110135446A (en) Method for text detection and computer storage medium
CN112348762A (en) Single image rain removing method for generating confrontation network based on multi-scale fusion
CN116342536A (en) Aluminum strip surface defect detection method, system and equipment based on lightweight model
CN117437201A (en) Road crack detection method based on improved YOLOv7
CN115861756A (en) Earth background small target identification method based on cascade combination network
CN113569981A (en) Power inspection bird nest detection method based on single-stage target detection network
CN114463176B (en) Image super-resolution reconstruction method based on improved ESRGAN
CN116664446A (en) Lightweight dim light image enhancement method based on residual error dense block
CN113570540A (en) Image tampering blind evidence obtaining method based on detection-segmentation architecture
CN111126155A (en) Pedestrian re-identification method for generating confrontation network based on semantic constraint
CN117495718A (en) Multi-scale self-adaptive remote sensing image defogging method
CN117557487A (en) Smooth object highlight removing method and system based on pix2pixHD and defect detecting device
CN117011168A (en) Transparent smooth object highlight removing method and system combining deep V &amp; lt3+ &amp; gt and LaMa model
CN116452965A (en) Underwater target detection and recognition method based on acousto-optic fusion
CN116309545A (en) Single-stage cell nucleus instance segmentation method for medical microscopic image
CN116229528A (en) Living body palm vein detection method, device, equipment and storage medium
CN116862868B (en) Intelligent detection method for apparent diseases of low-luminosity concrete bridge

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination