CN111242955B

CN111242955B - Road surface crack image segmentation method based on a fully convolutional neural network

Info

Publication number: CN111242955B
Application number: CN202010070397.7A
Authority: CN
Inventors: 苏超; 董义佳; 王文君; 袁荣耀
Original assignee: Hohai University HHU
Current assignee: Hohai University HHU
Priority date: 2020-01-21
Filing date: 2020-01-21
Publication date: 2023-04-21
Anticipated expiration: 2040-01-21
Also published as: CN111242955A

Abstract

The invention discloses a road surface crack image segmentation method based on a fully convolutional neural network, comprising the following steps: photographing pavement cracks with a camera, constructing an image data set, and dividing the images in the data set into two classes, cracked and crack-free; annotating the cracked images in the data set with segmentation labels using Labelme software; dividing the constructed data set into a training set and a test set; constructing a fully convolutional neural network for image segmentation; training the network on the training set and optimizing the relevant parameters until a globally optimal solution is obtained; and feeding the crack images in the test set to the network and outputting the image segmentation results.

Description

Road surface crack image segmentation method based on a fully convolutional neural network

Technical Field

The invention belongs to the field of road quality and safety monitoring, and relates to a road surface crack image segmentation method based on a fully convolutional neural network.

Background

Pavement crack detection is an important traffic maintenance task for ensuring driving safety and a crucial step in road management; its purpose is to obtain information on the maintenance condition of the pavement. Cracks are the most common type of pavement distress. Because automatic crack detection systems offer high safety, low cost, high efficiency and objectivity, research on them has attracted wide attention.

Image-based crack detection algorithms have been widely discussed and applied over the past few decades. Early studies relied on traditional digital image processing techniques, or combinations and refinements of them, such as thresholding, mathematical morphology and edge detection. These methods generally rest on photometric and geometric assumptions about crack images. However, because they operate on individual pixels, they are very sensitive to noise. Crack detection therefore remains challenging, owing to the non-uniformity of cracks and the complexity of the background, for example low contrast with the surrounding pavement and shadows of similar intensity.

In recent years, deep convolutional neural networks (CNNs) have found widespread use in computer image processing.

Deep neural networks have performed impressively in many computer vision tasks, demonstrating the effectiveness of learned deep features. Constructing a deep fully convolutional network (FCN) model to perform semantic segmentation of pavement crack images enables automatic detection of pavement cracks and is of great significance for pavement condition monitoring.

Disclosure of Invention

Purpose of the invention: to address the shortcomings of the prior art, the invention provides a road surface crack image segmentation method based on a fully convolutional neural network, so as to achieve crack detection, improve crack detection efficiency and reduce detection cost.

Technical solution: the invention provides a road surface crack image segmentation method based on a fully convolutional neural network, comprising the following steps:

S1, photographing pavement cracks with a camera, constructing an image data set, and dividing the images in the data set into two classes, cracked and crack-free;

S2, annotating the cracked images in the data set with segmentation labels using Labelme software;

S3, dividing the constructed data set into a training set (3/5 of the data set) and a test set (2/5 of the data set);

S4, constructing a fully convolutional neural network for image segmentation;

S5, training the fully convolutional neural network on the training set, and optimizing the relevant parameters until a globally optimal solution is obtained;

S6, feeding the crack images in the test set to the fully convolutional neural network, and outputting the image segmentation results.
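The 3/5 to 2/5 division in step S3 can be sketched as follows. This is a minimal illustration only: the patent does not say whether the images are shuffled before splitting, so the seeded shuffle, the function name `split_dataset` and the file names are all assumptions.

```python
import random


def split_dataset(items, train_parts=3, total_parts=5, seed=0):
    """Shuffle the items and split them into (train, test) lists.

    With the defaults, 3/5 of the items go to training and 2/5 to testing.
    Shuffling with a fixed seed is an assumption made for reproducibility.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_parts // total_parts
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for the labeled crack images.
image_names = [f"crack_{i:04d}.png" for i in range(3000)]
train_set, test_set = split_dataset(image_names)
print(len(train_set), len(test_set))  # 1800 1200
```

For 3000 images this gives 1800 training and 1200 test images, with no image in both sets.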

In step S2, the images with cracks are labeled at the pixel level.

The constructed fully convolutional neural network is specifically as follows:

the fully convolutional neural network comprises 1 convolution layer, 16 expansion convolution layers, 1 pyramid pooling layer, 2 decoder layers and 1 semantic output layer.

In step S4, the structure and parameters of the fully convolutional neural network model are specifically as follows:

the expansion module of the fully convolutional neural network is structured as follows:

a. When the stride is 1: the input passes through a 1x1 pointwise convolution (ReLU6 activation), a 3x3 depthwise convolution (ReLU6 activation) and a 1x1 pointwise convolution (linear activation); the result is added to the input to give the output.

b. When the stride is 2: the input passes through a 1x1 pointwise convolution (ReLU6 activation), a 3x3 depthwise convolution (ReLU6 activation) and a 1x1 pointwise convolution (linear activation), and the result is output directly.
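The two cases a and b can be illustrated with a small pure-Python sketch. It is not the patent's implementation: feature maps are nested lists indexed as `fmap[row][col][channel]`, the depthwise convolution uses zero padding of 1, and the block's input and output channel counts are assumed equal so that the addition in case a is well defined. All function names and weights are hypothetical.

```python
def relu6(x):
    """ReLU6 activation: clip the value to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def pointwise_conv(fmap, weights, activation=None):
    """1x1 convolution; weights[c_out][c_in] mixes channels at every pixel."""
    out = []
    for row in fmap:
        out_row = []
        for pixel in row:
            values = [sum(w * v for w, v in zip(w_row, pixel)) for w_row in weights]
            out_row.append([activation(v) for v in values] if activation else values)
        out.append(out_row)
    return out


def depthwise_conv3x3(fmap, kernels, stride=1, activation=None):
    """3x3 depthwise convolution; kernels[c] is the 3x3 filter of channel c.

    Zero padding of 1 is assumed, so stride 1 keeps the spatial size.
    """
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = []
    for i in range(0, h, stride):
        out_row = []
        for j in range(0, w, stride):
            pixel = []
            for ch in range(c):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        y, x = i + di, j + dj
                        if 0 <= y < h and 0 <= x < w:
                            acc += kernels[ch][di + 1][dj + 1] * fmap[y][x][ch]
                pixel.append(activation(acc) if activation else acc)
            out_row.append(pixel)
        out.append(out_row)
    return out


def expansion_block(fmap, w_expand, dw_kernels, w_project, stride):
    """Pointwise (ReLU6) -> depthwise 3x3 (ReLU6) -> pointwise (linear)."""
    x = pointwise_conv(fmap, w_expand, relu6)
    x = depthwise_conv3x3(x, dw_kernels, stride, relu6)
    x = pointwise_conv(x, w_project)  # linear: no activation
    if stride == 1:
        # Case a: add the block input to the result.
        x = [[[a + b for a, b in zip(p_out, p_in)]
              for p_out, p_in in zip(r_out, r_in)]
             for r_out, r_in in zip(x, fmap)]
    return x  # Case b (stride 2): the result is output directly.


# Toy weights: expand 2 channels to 4, identity depthwise filters, project back to 2.
fmap = [[[1.0, 2.0] for _ in range(4)] for _ in range(4)]
w_expand = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
dw_kernels = [identity] * 4
w_project = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

out_s1 = expansion_block(fmap, w_expand, dw_kernels, w_project, stride=1)
out_s2 = expansion_block(fmap, w_expand, dw_kernels, w_project, stride=2)
print(len(out_s1), len(out_s1[0]), out_s1[0][0])
print(len(out_s2), len(out_s2[0]), out_s2[0][0])
```

With stride 1 the spatial size is preserved and the input is added back; with stride 2 the map is halved in each dimension and no addition takes place.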

The pyramid pooling module of the fully convolutional neural network has 4 paths: one average-pooling path with a window side length of 10, which averages over the whole feature map, and three atrous (dilated) convolution paths with dilation rates of 1, 2 and 4, which extract information from the feature map with different receptive fields; the outputs of the 4 paths are scaled to the same size and then combined by stacking.
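To see why rates of 1, 2 and 4 give different receptive fields, the sketch below computes the span covered by a dilated kernel. The kernel size of 3 is an assumption, since the patent does not state the kernel size of these convolutions, and the function names are illustrative.

```python
def effective_kernel_size(kernel_size, dilation):
    """Side length covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_tap_offsets(kernel_size, dilation):
    """Offsets, relative to the center, sampled along one axis."""
    half = kernel_size // 2
    return [dilation * t for t in range(-half, half + 1)]


# Assumed 3x3 kernels; the rates 1, 2 and 4 are the ones given in the text.
for rate in (1, 2, 4):
    print(rate, effective_kernel_size(3, rate), dilated_tap_offsets(3, rate))
# 1 3 [-1, 0, 1]
# 2 5 [-2, 0, 2]
# 4 9 [-4, 0, 4]
```

Under this assumption the three paths cover windows of side 3, 5 and 9 while using the same number of weights.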

In the fully convolutional neural network, decoder 0 is connected to expansion convolution 3#, takes 16 feature maps as input and outputs 32 feature maps; decoder 1 is connected to the convolution layer, takes 8 feature maps as input and outputs 32 feature maps.

The parameters of the fully convolutional neural network are optimized with a momentum optimizer whose momentum value is 0.9. The initial learning rate is 0.0001, and a decaying learning-rate schedule is used: every 2000 steps, the learning rate is reduced to 0.9 times its previous value. During training, two images are taken at a time to compute the gradient of the error with respect to the network parameters, and the network parameters are then updated.
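The learning-rate schedule described here can be written as a one-line function. The sketch assumes a staircase decay, in which the rate drops in discrete jumps every 2000 steps rather than continuously; the function and parameter names are illustrative.

```python
def learning_rate(step, initial_lr=1e-4, decay_rate=0.9, decay_steps=2000):
    """Staircase exponential decay: multiply by decay_rate every decay_steps steps."""
    return initial_lr * decay_rate ** (step // decay_steps)


for step in (0, 1999, 2000, 4000, 20000):
    print(step, learning_rate(step))
```

Steps 0 to 1999 use 0.0001, steps 2000 to 3999 use roughly 0.00009, and so on.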

The objective function of the fully convolutional neural network for image segmentation is the mean, over all pixels, of the cross-entropy between the given label (label) and the prediction of the neural network (prediction):

[formula image not reproduced in the source text]
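One plausible reading of this objective is L = -(1/N) * sum over pixels i of log(prediction_i[label_i]), where N is the number of pixels. The sketch below follows that reading and assumes two classes (background and crack), with a probability vector as the network output at each pixel. The names are illustrative and the exact form used in the patent may differ.

```python
import math


def mean_pixel_cross_entropy(labels, predictions, eps=1e-12):
    """Average cross-entropy over all pixels.

    labels[i][j] is the class index of pixel (i, j);
    predictions[i][j] is the list of predicted class probabilities.
    """
    total, count = 0.0, 0
    for label_row, pred_row in zip(labels, predictions):
        for label, probs in zip(label_row, pred_row):
            # eps guards against log(0) for a confidently wrong prediction.
            total += -math.log(max(probs[label], eps))
            count += 1
    return total / count


# 2x2 toy image: class 0 = background, class 1 = crack (assumed encoding).
labels = [[0, 1], [1, 0]]
uncertain = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]
perfect = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]

loss_uncertain = mean_pixel_cross_entropy(labels, uncertain)
loss_perfect = mean_pixel_cross_entropy(labels, perfect)
print(loss_uncertain, loss_perfect)
```

A prediction of 0.5 everywhere gives a loss of ln 2, and a perfect prediction gives 0.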

Beneficial effects: compared with the prior art, the road surface crack image segmentation method based on a fully convolutional neural network has the following beneficial effects:

1. High recognition efficiency: the invention constructs an image data set and divides its images into two classes, cracked and crack-free; annotates the cracked images with segmentation labels using Labelme software; divides the data set into a training set and a test set; constructs a fully convolutional neural network for image segmentation; and trains the network on the training set, optimizing the relevant parameters until a globally optimal solution is obtained, which effectively improves the efficiency of image recognition.

2. Once trained, the crack detection model can be applied to detection tasks on images of road surfaces of the same material, and thus has wide applicability.

3. In the crack identification method, an inspection vehicle equipped with a camera drives along the road to acquire the original images, and the crack condition is then evaluated by neural network processing, which reduces the demand for manpower, material and financial resources.

Drawings

FIG. 1 is an overall flow chart of the method of the present invention;

FIG. 2 is a structural diagram of the fully convolutional neural network of the present invention;

FIG. 3 shows road surface crack image segmentation results obtained with the method of the present invention.

Detailed Description

The present invention is further illustrated below with reference to the accompanying drawings and specific embodiments. It should be understood that these embodiments merely illustrate the invention and do not limit its scope; after reading the present disclosure, modifications of the invention in various equivalent forms by those skilled in the art all fall within the scope defined by the appended claims.

In this embodiment, road surface crack images are segmented in order to detect cracks. First, the data set is prepared: campus roads are photographed with a camera to acquire original crack images, from which the data set is constructed. The acquired images are then cropped on a computer to a size of 320×320; after data augmentation, the data set contains 6000 images. There are 3000 images with cracks and 3000 without, and 1800 and 1200 images are used for model training and testing, respectively. Labelme software is then used to annotate the cracked images in the data set with segmentation labels. Model construction and training are carried out in the TensorFlow framework. The batch size is set to 2, a momentum optimizer is used to optimize the cost function, the initial learning rate is set to 0.0001, and a decaying learning-rate schedule is used in which the learning rate is reduced to 0.9 times its previous value every 2000 steps. During training, two images are taken at a time to compute the gradient of the error with respect to the network parameters, which are then updated. All images input to the neural network have a uniform size of 320×320. The crack image segmentation results are shown in FIG. 3.
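The cropping step can be sketched as a tiling of each photograph into 320×320 patches. The embodiment does not say how the crops are positioned, so the non-overlapping grid, the camera resolution and the function name below are assumptions for illustration only.

```python
def crop_boxes(image_width, image_height, crop_size=320):
    """Return (left, top, right, bottom) boxes of non-overlapping square tiles.

    Tiles that would extend past the image border are skipped.
    """
    boxes = []
    for top in range(0, image_height - crop_size + 1, crop_size):
        for left in range(0, image_width - crop_size + 1, crop_size):
            boxes.append((left, top, left + crop_size, top + crop_size))
    return boxes


# Hypothetical 1280x960 photograph: 4 columns by 3 rows of 320x320 crops.
boxes = crop_boxes(1280, 960)
print(len(boxes))           # 12
print(boxes[0], boxes[-1])  # (0, 0, 320, 320) (960, 640, 1280, 960)
```

Each box could then be passed to an image library's crop routine to produce the 320×320 network inputs.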

The experimental platform is configured as follows:

Intel(R) Core(TM) [email protected], NVIDIA GTX 1080 Ti graphics card, 16 GB DDR4 memory (2400 MHz), 11 GB video memory (1.4 GHz). The entire training process is accelerated on the GPU.

Referring to FIG. 1, the specific steps of the road surface crack image segmentation method of the present invention are as follows:

S1, photographing pavement cracks with a camera, constructing a data set, and dividing the images in the data set into two classes, cracked and crack-free;

S4, constructing a fully convolutional neural network for image segmentation;

the fully convolutional neural network consists of 1 convolution layer, 16 expansion convolution layers, 1 pyramid pooling layer, 2 decoder layers and 1 semantic output layer.

In step S4, the structure and parameters of the fully convolutional neural network model are specifically as follows:

S5, training the fully convolutional neural network on the training set, and optimizing the relevant parameters until a globally optimal solution is obtained;

S6, feeding the crack images in the test set to the fully convolutional neural network, and outputting the image segmentation results;

The pyramid pooling module of the fully convolutional neural network has 4 paths: one average-pooling path with a window side length of 10, which averages over the whole feature map, and three atrous (dilated) convolution paths with dilation rates of 1, 2 and 4, which extract information from the feature map with different receptive fields. The outputs of the 4 paths are scaled to the same size and then combined by stacking.
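The final "scale to the same size, then stack" step can be sketched as follows. The resizing method is not specified in the text, so nearest-neighbor resizing is an assumption. The three convolution paths are represented here by plain copies of the input, purely as placeholders, and all names are illustrative. Feature maps are nested lists indexed as `fmap[row][col][channel]`.

```python
def global_average(fmap):
    """Average each channel over the whole feature map; returns a 1x1xC map."""
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    means = [sum(fmap[i][j][ch] for i in range(h) for j in range(w)) / (h * w)
             for ch in range(c)]
    return [[means]]


def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbor resize (an assumed choice of interpolation)."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [[list(fmap[i * in_h // out_h][j * in_w // out_w])
             for j in range(out_w)] for i in range(out_h)]


def stack_channels(fmaps):
    """Concatenate same-sized feature maps along the channel axis."""
    h, w = len(fmaps[0]), len(fmaps[0][0])
    return [[[v for f in fmaps for v in f[i][j]] for j in range(w)]
            for i in range(h)]


# Toy 10x10 feature map with 2 channels holding the row and column index.
fmap = [[[float(i), float(j)] for j in range(10)] for i in range(10)]

pooled = resize_nearest(global_average(fmap), 10, 10)
# Placeholders for the outputs of the three convolution paths.
conv_paths = [fmap, fmap, fmap]
merged = stack_channels([pooled] + conv_paths)
print(len(merged), len(merged[0]), len(merged[0][0]))  # 10 10 8
print(merged[3][7])
```

The merged map keeps the 10×10 spatial size and holds the channels of all four paths side by side.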

The parameters of the fully convolutional neural network are optimized with a momentum optimizer whose momentum value is 0.9. The initial learning rate is 0.0001, and a decaying learning-rate schedule is used: every 2000 steps, the learning rate is reduced to 0.9 times its previous value. During training, two images are taken at a time to compute the gradient of the error with respect to the network parameters, and the network parameters are then updated.
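A single parameter update of this kind can be sketched as below. It assumes the common momentum formulation (the accumulator is `momentum * accumulator + gradient`, and the parameters move by the learning rate times the accumulator) and assumes that the gradients of the two images are averaged. The gradients and parameter values are made up for illustration.

```python
def average_gradients(per_image_grads):
    """Average the gradients computed on each image of the batch."""
    n = len(per_image_grads)
    return [sum(g[k] for g in per_image_grads) / n
            for k in range(len(per_image_grads[0]))]


def momentum_step(params, velocity, grads, lr, momentum=0.9):
    """One momentum update: v <- momentum * v + g, then w <- w - lr * v."""
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    new_params = [w - lr * v for w, v in zip(params, new_velocity)]
    return new_params, new_velocity


params = [0.5, -0.2]
velocity = [0.0, 0.0]
# Hypothetical gradients from the two images of one batch.
grads_two_images = [[1.0, 2.0], [3.0, 0.0]]

batch_grad = average_gradients(grads_two_images)
params, velocity = momentum_step(params, velocity, batch_grad, lr=1e-4)
print(batch_grad, velocity, params)
```

Repeating the step with the same gradient makes the accumulator grow (2.0, then 3.8 for the first parameter), which is how momentum speeds up movement along a consistent direction.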

It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and these are also intended to fall within the scope of protection of the present invention. Components not explicitly described in this embodiment can be implemented using the prior art.

Claims

1. A road surface crack image segmentation method based on a fully convolutional neural network, characterized by comprising the following steps:

S2, labeling the images with cracks, wherein the labeling is pixel-level labeling;

S3, dividing the constructed data set into a training set and a test set;

S4, constructing a fully convolutional neural network for image segmentation;

the constructed fully convolutional neural network is specifically as follows:

the fully convolutional neural network comprises 1 convolution layer, 16 expansion convolution layers, 1 pyramid pooling layer, 2 decoder layers and 1 semantic output layer;

a. when the stride is 1: the input passes through a 1x1 pointwise convolution (ReLU6 activation), a 3x3 depthwise convolution (ReLU6 activation) and a 1x1 pointwise convolution (linear activation), and the result is added to the input to give the output;

b. when the stride is 2: the input passes through a 1x1 pointwise convolution (ReLU6 activation), a 3x3 depthwise convolution (ReLU6 activation) and a 1x1 pointwise convolution (linear activation), and the result is output directly;

the pyramid pooling module of the fully convolutional neural network has 4 paths: one average-pooling path with a window side length of 10, which averages over the whole feature map, and three atrous (dilated) convolution paths with dilation rates of 1, 2 and 4, which extract information from the feature map with different receptive fields; the outputs of the 4 paths are scaled to the same size and then combined by stacking;

2. The road surface crack image segmentation method based on a fully convolutional neural network according to claim 1, characterized in that: in the fully convolutional neural network, decoder 0 is connected to expansion convolution 3#, takes 16 feature maps as input and outputs 32 feature maps; decoder 1 is connected to the convolution layer, takes 8 feature maps as input and outputs 32 feature maps.

3. The road surface crack image segmentation method based on a fully convolutional neural network according to claim 1, characterized in that: the parameters of the fully convolutional neural network are optimized with a momentum optimizer whose momentum value is 0.9; the initial learning rate is 0.0001, and a decaying learning-rate schedule is used in which the learning rate is reduced to 0.9 times its previous value every 2000 steps; during training, two images are taken at a time to compute the gradient of the error with respect to the network parameters, and the network parameters are then updated.

4. The road surface crack image segmentation method based on a fully convolutional neural network according to claim 1, characterized in that: the objective function of the fully convolutional neural network for image segmentation is the mean, over all pixels, of the cross-entropy between the given label (label) and the prediction of the neural network (prediction):

[formula image not reproduced in the source text]