CN111275684A - Strip steel surface defect detection method based on multi-scale feature extraction - Google Patents

Strip steel surface defect detection method based on multi-scale feature extraction

Info

Publication number
CN111275684A
Authority
CN
China
Prior art keywords
defect
strip steel
data set
category
prior
Prior art date
Legal status
Pending
Application number
CN202010063720.8A
Other languages
Chinese (zh)
Inventor
黄成�
吴贇
Current Assignee
Donghua University
Original Assignee
Donghua University
Priority date
Filing date
Publication date
Application filed by Donghua University filed Critical Donghua University
Priority to CN202010063720.8A priority Critical patent/CN111275684A/en
Publication of CN111275684A publication Critical patent/CN111275684A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30108 Industrial image inspection
    • G06T 2207/30136 Metal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30204 Marker

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Investigating Materials By The Use Of Optical Means Adapted For Particular Applications (AREA)

Abstract

The invention provides a strip steel surface defect detection method based on multi-scale feature extraction, which uses feature maps of different levels and different scales in a detection network to extract defect features: shallow large-scale feature maps detect small defects, and deep small-scale feature maps detect large defects, thereby detecting defects of different types and different sizes. The invention mainly comprises the following steps: acquiring a data set of strip steel surface defect images; preprocessing the acquired data set; training a strip steel surface defect detection network offline; and detecting defect images with the trained network. Compared with existing strip steel surface defect detection methods, the method has good robustness, strong generalization capability, and high detection speed and precision, and new defect categories can be added on the existing basis for training and detection.

Description

Strip steel surface defect detection method based on multi-scale feature extraction
Technical Field
The invention relates to a strip steel surface defect detection method based on multi-scale feature extraction, and belongs to the technical field of pattern recognition and defect detection.
Background
Strip steel is an important raw material in the manufacturing industry, and its quality directly influences the quality and performance of downstream products. However, the production process of strip steel is complex, places high demands on the production environment, and easily produces defects under the influence of machinery, humans or the environment. At present, strip steel surface defect detection mainly falls into traditional nondestructive detection methods and traditional machine learning detection methods. The traditional nondestructive detection methods mainly include manual visual inspection, eddy current inspection, magnetic flux leakage inspection, infrared inspection and the like. These methods each have shortcomings: manual visual inspection suffers from incomplete detection, lack of quantification, low confidence and high labor intensity, making it difficult to guarantee the surface quality of the steel plate; eddy current inspection has high requirements on the surface state, iron scale greatly affects the detection effect, and its low detection speed makes it unsuitable for detecting surface defects of high-speed rolled steel plate; magnetic flux leakage inspection is easily affected by environmental factors, and its detection precision and speed hardly meet industrial requirements; infrared inspection has many limiting conditions and a limited range of detectable defect types, and is generally only used for offline small-range detection.
The traditional machine learning detection method generally completes the defect detection task through image preprocessing, image segmentation, feature extraction and image classification, and a good feature extraction method usually requires a large number of experiments and manual selection and verification. Common feature extraction methods include wavelet decomposition, multi-scale geometric analysis, LBP features, HOG features and the like. Because the defects to be detected in practice are various, adopting the same feature extraction method for all of them yields greatly different results across defect types, while adopting a different feature extraction method for each defect type multiplies the complexity of the algorithm; the resulting algorithms are unstable, lack robustness, and cannot achieve real-time detection.
In recent years, deep learning has developed rapidly and achieved many breakthrough results in the field of pattern recognition. Unlike traditional machine learning detection methods, deep learning completes implicit feature extraction by designing a large-scale neural network, training on a large number of samples, and updating the parameters by backward gradients, greatly improving the applicability of the algorithm.
Strip steel surface defect detection is a composite task of localization and classification, and it is difficult for traditional machine learning detection methods to take both the accuracy and the speed of detection into account. Against this background, the invention adopts a deep learning approach and provides a strip steel surface defect detection method based on multi-scale feature extraction.
Disclosure of Invention
The invention aims to solve the technical problem of how to detect the surface defects of the strip steel based on multi-scale feature extraction.
In order to solve the above technical problem, the method for detecting strip steel surface defects based on multi-scale feature extraction realizes accurate localization of defect positions in the image and accurate classification of defect categories, and is characterized by high detection accuracy, high detection speed and good robustness. The specific steps are as follows:
step 1: and acquiring a data set of the strip steel surface defect image.
Step 2: and preprocessing the acquired data set, including two steps of off-line enhancing the data set and making an LMDB format data set. The method specifically comprises the following steps:
step 2.1: Enhance the data set offline. Data enhancement is particularly important for improving detection accuracy across scales (especially for small targets). Two data enhancement modes are used in the invention: pixel content transformation, comprising randomly changing the image brightness, randomly changing the contrast, chromaticity and saturation, and randomly adding illumination noise; and spatial geometric transformation, comprising image expansion, random cropping and random mirroring. The enhanced data set is 6 times the size of the original, and all images have a uniform resolution of 300 x 300.
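The two augmentation families above can be sketched as follows. This is a minimal illustration rather than the patent's actual pipeline; the probability of each transform and the parameter ranges (a ±32 brightness shift, a 0.8 to 1.2 contrast scale) are assumptions.

```python
import random
import numpy as np

def augment(img, rng=random.Random(0)):
    """Sketch of the two enhancement families: pixel-content transforms
    (brightness/contrast jitter) and a spatial geometric transform (mirror).
    `img` is an HxWx3 uint8 array; ranges and probabilities are illustrative."""
    out = img.astype(np.float32)
    if rng.random() < 0.5:                 # random brightness shift
        out += rng.uniform(-32, 32)
    if rng.random() < 0.5:                 # random contrast scale
        out *= rng.uniform(0.8, 1.2)
    out = np.clip(out, 0, 255)
    if rng.random() < 0.5:                 # random horizontal mirror
        out = out[:, ::-1, :]
    # A real pipeline would also expand/crop and resample to 300x300.
    return out.astype(np.uint8)
```

A full implementation would add chromaticity/saturation jitter, illumination noise, expansion and random cropping, as listed in the step above.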
Step 2.2: Make an LMDB format data set. First, manually mark the defect positions in the acquired strip steel surface defect images with the labelImg software, generating a corresponding xml file for each image. Then write a python script that randomly divides the defect images into a trainval data set and a test data set at an 8:2 ratio. Use the create_list.sh script to generate trainval.txt and test.txt holding image and label information, and test_name_size.txt holding image size information. Finally, modify the labelmap file storing the category information so that it contains the N defect categories plus the background category, N + 1 categories in total, and generate the LMDB format data set through the create_data.sh script.
Step 3: Train the strip steel surface defect detection network offline. Input the LMDB format data set made in step 2 into the network for training until the model converges. The method specifically comprises the following steps:
step 3.1: First, construct the strip steel surface defect detection network. The strip steel surface defect detection network constructed by the invention comprises the following layers: conv1_1, conv1_2, pool1, conv2_1, conv2_2, pool2, conv3_1, conv3_2, conv3_3, pool3, conv4_1, conv4_2, conv4_3, pool4, conv5_1, conv5_2, conv5_3, pool5, conv6, conv7, conv8_1, conv8_2, conv9_1, conv9_2, conv10_1, conv10_2, conv11_1, conv11_2.
Step 3.2: Input the prepared training set into the network. It propagates in turn through the convolutional layers conv4_3, conv7, conv8_2, conv9_2, conv10_2, conv11_2 and their respective classifiers to obtain, for each prior box, a predicted position value and a class confidence for each defect category. For each prior box only the maximum class confidence is kept; prior boxes whose class confidence is below a threshold, as well as those belonging to the background category, are suppressed; non-maximum suppression then yields the final prediction boxes containing defect position values and class confidence information. During training these are compared with the real target boxes containing the manually labeled defect categories and positions to compute the loss function value. The output error is propagated backwards with stochastic gradient descent and the error back-propagation algorithm, apportioning the error to all the units of each layer so as to obtain each unit's error signal and adjust the weights. The model has converged when the loss function value converges.
The loss function is as follows:

L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)

where x is the input image, c is the class confidence, l is the final prediction box, g is the real target box, N is the number of matched prior boxes, and α is a constant weight.

The position loss adopts the smooth L1 loss, as follows:

L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^{k} \, \mathrm{smooth}_{L1}\left(l_i^m - \hat{g}_j^m\right)

where x_{ij}^{k} \in \{0, 1\} indicates whether the i-th final prediction box and the j-th real target box match for category k (1 for a match, 0 for a mismatch), l_i^m is the m-th coordinate of the i-th final prediction box, and \hat{g}_j^m is the m-th coordinate of the j-th real target box.

The classification loss function adopts the softmax loss, as follows:

L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_i^{p}\right) - \sum_{i \in Neg} \log\left(\hat{c}_i^{0}\right), \qquad \hat{c}_i^{p} = \frac{\exp(c_i^{p})}{\sum_{p} \exp(c_i^{p})}

where \hat{c}_i^{p} is the predicted probability that the i-th final prediction box belongs to category p.
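As a hedged sketch, the composite loss described above (smooth L1 position loss plus softmax classification loss, weighted by α and normalized by the number N of matched prior boxes) can be written in plain numpy. Box matching and hard-negative mining are omitted for brevity, so only the positive-box terms are covered; all function names here are illustrative.

```python
import numpy as np

def smooth_l1(x):
    """Element-wise smooth L1, as used by the position loss."""
    ax = np.abs(x)
    return np.where(ax < 1, 0.5 * x**2, ax - 0.5)

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def detection_loss(pred_loc, gt_loc, pred_conf, gt_label, alpha=1.0):
    """L = (1/N) * (L_conf + alpha * L_loc) over N matched prior boxes.
    pred_loc/gt_loc: (N, 4) offsets; pred_conf: (N, C) raw scores;
    gt_label: (N,) matched category indices.  Negatives are omitted."""
    n = len(gt_label)
    loc_loss = smooth_l1(pred_loc - gt_loc).sum()
    probs = softmax(pred_conf)
    conf_loss = -np.log(probs[np.arange(n), gt_label]).sum()
    return (conf_loss + alpha * loc_loss) / n
```

With perfect localization and confident correct scores the loss approaches zero, matching the convergence behaviour described in step 3.2.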
Step 4: Input the strip steel surface defect image (300 × 300) to be detected into the trained detection network, obtaining in turn the feature maps output by the conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 layers.
Step 5: Set prior boxes for the feature maps output by the conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 layers of the detection network in step 4. The prior box setting mainly comprises two aspects, scale and aspect ratio. Taking the feature maps output by the conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 layers respectively as the k-th layer feature map (k = 1, ..., 6), the prior box scale is set as the following formula:

s_k = s_{min} + \frac{s_{max} - s_{min}}{5}(k - 1), \quad k \in \{1, \dots, 6\}

In the above formula, s_k is the prior box scale (as a proportion of the image size), s_{min} = 0.25 and s_{max} = 0.9. The prior box aspect ratios are set to {1, 2, 3, 1/2, 1/3, 1'}, where 1' denotes an additional prior box with aspect ratio 1 and scale

s'_k = \sqrt{s_k s_{k+1}}

so that each layer's feature map has two square prior boxes of aspect ratio 1 but different scales. Each layer of the feature map therefore has 6 prior boxes in total, except that the conv4_3, conv10_2 and conv11_2 layers use only 4 prior boxes, omitting the aspect ratios 3 and 1/3. The feature maps generated by the six convolutional layers conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 have sizes 38 × 38, 19 × 19, 10 × 10, 5 × 5, 3 × 3 and 1 × 1 respectively. Each n × n feature map has n × n center points, each center point generating t prior boxes, with t = 4, 6, 6, 6, 4, 4 for the six layers. The six feature maps therefore generate 38 × 38 × 4 + 19 × 19 × 6 + 10 × 10 × 6 + 5 × 5 × 6 + 3 × 3 × 4 + 1 × 1 × 4 = 8732 prior boxes in total.
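The linear scale formula above, with s_min = 0.25 and s_max = 0.9 over six detection layers, fixes the scale of each layer's prior boxes; a small sketch for checking the values (the function name is illustrative):

```python
def prior_scales(m=6, s_min=0.25, s_max=0.9):
    """Prior-box scale s_k for each of the m detection feature maps,
    following the linear formula in step 5 (scales are fractions of
    the 300x300 input size), rounded to two decimals for display."""
    step = (s_max - s_min) / (m - 1)
    return [round(s_min + step * (k - 1), 2) for k in range(1, m + 1)]
```

Layer 1 (conv4_3) thus gets the smallest scale, 0.25, and layer 6 (conv11_2) the largest, 0.9, matching the shallow-for-small, deep-for-large design.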
Step 6: Pass all prior boxes in the k-th layer (k = 1, ..., 6) feature map from step 5 through a classifier to obtain each prior box's predicted position value and class confidence for each defect category. Specifically, all prior boxes in the k-th layer feature map are convolved with two 3 × 3 convolution kernels in the classifier; each prior box obtains its predicted position value and a class confidence for each defect category, the predicted position value comprising a center coordinate x offset, a center coordinate y offset, a height h offset and a width w offset.
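The patent states only that the predicted position value consists of x, y, h and w offsets relative to a prior box. One common SSD-style way to decode such offsets into an absolute box is sketched below; the exact encoding (center offsets scaled by prior size, log-space size offsets) is an assumption, not taken from the patent.

```python
import math

def decode_box(prior, offsets):
    """Turn a prior box (cx, cy, w, h) plus four predicted offsets
    (dx, dy, dw, dh) into an absolute box.  Hypothetical encoding."""
    pcx, pcy, pw, ph = prior
    dx, dy, dw, dh = offsets
    cx = pcx + dx * pw          # center offsets scaled by prior size
    cy = pcy + dy * ph
    w = pw * math.exp(dw)       # log-space width/height offsets
    h = ph * math.exp(dh)
    return (cx, cy, w, h)
```

With all-zero offsets the decoded box is the prior box itself, which is the identity the encoding is built around.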
Step 7: Keep only the maximum class confidence for each prior box, and suppress the prior boxes whose class confidence is less than 0.5, as well as those belonging to the background category.
Step 8: Divide the prior boxes remaining from step 7 into m (m = 1, 2, 3, ...) prior box groups, where each prior box group consists of mutually intersecting prior boxes, each holding a confidence for the same defect category.
Step 9: For the m-th (m = 1, 2, 3, ...) prior box group, select through the non-maximum suppression algorithm the predicted position value of the prior box with the maximum confidence for the same defect category as the final defect position value, obtaining the final defect localization and defect classification result.
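Steps 7 to 9 amount to confidence filtering followed by per-class greedy non-maximum suppression. A minimal sketch with corner-format boxes follows; the 0.5 overlap threshold is an assumption chosen for illustration (only the 0.5 confidence threshold appears in the text), and the function names are illustrative.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS over one defect class: keep the highest-scoring box of
    each group of mutually intersecting boxes, as in steps 8 and 9."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < thresh]
    return keep
```

Running this once per defect category over the boxes surviving step 7 yields the final localization and classification result.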
Compared with the prior art, the invention has the advantages that:
1. the detection method of the invention utilizes the advantages of deep learning in feature extraction and utilizes the convolutional neural network as a feature extraction part. The different levels of features encode different levels of information, taking into account the hierarchical structure of the convolutional neural network. The low-level features contain more detailed information, the high-level features focus more on semantic information, and more complex high-level concepts are captured. The defect characteristics are extracted by utilizing the characteristic diagrams of different levels and different scales in the detection network, the shallow large-scale characteristic diagram is used for detecting small defects, and the deep small-scale characteristic diagram is used for detecting large defects, so that the detection of different types of defects with different sizes is realized.
2. Compared with existing strip steel surface defect detection methods, the method has high detection precision, and defect feature extraction and defect detection are completed in a single network in one pass, greatly improving the detection speed. Defects are detected by loading the trained model; the detection speed is high, the robustness good, and the generalization capability strong.
3. The defect type adaptability is strong: the method is not limited to detecting one specific defect type, can detect multiple kinds of defects simultaneously, and allows new defect categories to be added on the existing basis for training and detection.
Drawings
FIG. 1 is a flow chart of a method for detecting defects on the surface of a strip steel based on multi-scale feature extraction according to the present invention;
FIG. 2 is a hierarchy diagram of the strip steel surface defect detection network of the present invention;
FIG. 3 shows three common strip steel surface defect images obtained by the present invention;
FIG. 4 is a flow chart of the offline data enhancement of the present invention;
FIG. 5 is a diagram illustrating manual defect labeling according to an embodiment of the present invention;
FIG. 6 is a graph of loss function values as a function of iteration number for an embodiment of the present invention;
FIG. 7 is a diagram illustrating the detection effect of the surface defects of three common strip steels obtained by the embodiment of the invention;
FIG. 8 is a diagram illustrating various defect detection effects according to an embodiment of the present invention.
Detailed Description
In order to make the invention more comprehensible, the invention is described in detail below with reference to the accompanying drawings.
The flow chart of the strip steel surface defect detection method based on multi-scale feature extraction disclosed by the invention is shown in FIG. 1, and the specific steps are as follows:
step 1: Acquire a data set of strip steel surface defect images. In the experiment, images of three common strip steel surface defect types are obtained from a relevant database, namely patches, pocks (pitted surfaces) and inclusions, with 300 images of each defect, and the strip steel surface defect image data set is established from them. The three common strip steel surface defect images obtained by the invention are shown in FIG. 3.
Step 2: and preprocessing the acquired data set, including two steps of off-line enhancing the data set and making an LMDB format data set.
Step 2.1: the data set is enhanced offline. Data enhancement is particularly important for improving detection accuracy of various scales (especially small targets), and two data enhancement modes are used in the invention: the pixel content transformation comprises the steps of randomly changing the brightness of an image, randomly changing the contrast, the chromaticity and the saturation and randomly changing the illumination noise; and spatial geometric transformation, including image expansion, random cropping and random mirroring. The enhanced data set was 6 times the size of the previous data set, i.e., 5400 defect images in total, with a uniform 300 x 300 image resolution. The offline data enhancement flow diagram is shown in fig. 4.
Step 2.2: Make an LMDB format data set. First, manually mark the defect positions in the acquired strip steel surface defect images with the labelImg software, generating a corresponding xml file for each image. Then write a python script that randomly divides the defect images into a trainval data set and a test data set at an 8:2 ratio. Use the create_list.sh script to generate trainval.txt and test.txt holding image and label information, and test_name_size.txt holding image size information. Finally, modify the labelmap file storing the category information so that it contains the 3 defect categories plus the background category, 4 categories in total, and generate the LMDB format data set through the create_data.sh script. The manual labeling of a defect is illustrated in FIG. 5.
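The 8:2 trainval/test division mentioned above can be sketched in a few lines of python; file naming is illustrative, and with the 5400 enhanced images this split gives 4320 trainval and 1080 test images.

```python
import random

def split_dataset(image_names, ratio=0.8, seed=0):
    """Randomly split the labelled defect images into a trainval set and
    a test set at the 8:2 ratio described above.  A sketch of the python
    script the patent mentions; the seed makes the split reproducible."""
    names = list(image_names)
    random.Random(seed).shuffle(names)
    cut = int(len(names) * ratio)
    return names[:cut], names[cut:]
```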
Step 3: Train the strip steel surface defect detection network offline. Input the LMDB format data set made in step 2 into the network for training until the model converges. The method specifically comprises the following steps:
step 3.1: First, construct the strip steel surface defect detection network. The hierarchy of the strip steel surface defect detection network constructed by the invention is shown in FIG. 2.
Step 3.2: Input the prepared training set into the network. It propagates in turn through the convolutional layers conv4_3, conv7, conv8_2, conv9_2, conv10_2, conv11_2 and their respective classifiers to obtain, for each prior box, a predicted position value and a class confidence for each defect category. For each prior box only the maximum class confidence is kept; prior boxes whose class confidence is below a threshold, as well as those belonging to the background category, are suppressed; non-maximum suppression then yields the final prediction boxes containing defect position values and class confidence information. During training these are compared with the real target boxes containing the manually labeled defect categories and positions to compute the loss function value. The output error is propagated backwards with stochastic gradient descent and the error back-propagation algorithm, apportioning the error to all the units of each layer so as to obtain each unit's error signal and adjust the weights. The model has converged when the loss function value converges.
The loss function is as follows:

L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)

where x is the input image, c is the class confidence, l is the final prediction box, g is the real target box, N is the number of matched prior boxes, and the weight α is set to 1.

The position loss adopts the smooth L1 loss, as follows:

L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^{k} \, \mathrm{smooth}_{L1}\left(l_i^m - \hat{g}_j^m\right)

where x_{ij}^{k} \in \{0, 1\} indicates whether the i-th final prediction box and the j-th real target box match for category k (1 for a match, 0 for a mismatch), l_i^m is the m-th coordinate of the i-th final prediction box, and \hat{g}_j^m is the m-th coordinate of the j-th real target box.

The classification loss function adopts the softmax loss, as follows:

L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_i^{p}\right) - \sum_{i \in Neg} \log\left(\hat{c}_i^{0}\right), \qquad \hat{c}_i^{p} = \frac{\exp(c_i^{p})}{\sum_{p} \exp(c_i^{p})}

where \hat{c}_i^{p} is the predicted probability that the i-th final prediction box belongs to category p.
Fig. 6 is a graph showing the change of the Loss function value with the iteration number in the training process of the method provided by the present invention, and the Loss function value (Train Loss) is continuously decreased as the iteration number (Train Iterations) is increased. When the iteration times reach about 25000 times, the loss function value basically tends to be stable.
Step 4: Input the strip steel surface defect image (300 × 300) to be detected into the trained detection network, obtaining in turn the feature maps output by the conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 layers.
Step 5: Set prior boxes for the feature maps output by the conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 layers of the detection network in step 4. The prior box setting mainly comprises two aspects, scale and aspect ratio. Taking the feature maps output by the conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 layers respectively as the k-th layer feature map (k = 1, ..., 6), the prior box scale is set as the following formula:

s_k = s_{min} + \frac{s_{max} - s_{min}}{5}(k - 1), \quad k \in \{1, \dots, 6\}

In the above formula, s_k is the prior box scale (as a proportion of the image size), s_{min} = 0.25 and s_{max} = 0.9. The prior box aspect ratios are set to {1, 2, 3, 1/2, 1/3, 1'}, where 1' denotes an additional prior box with aspect ratio 1 and scale

s'_k = \sqrt{s_k s_{k+1}}

so that each layer's feature map has two square prior boxes of aspect ratio 1 but different scales. Each layer of the feature map therefore has 6 prior boxes in total, except that the conv4_3, conv10_2 and conv11_2 layers use only 4 prior boxes, omitting the aspect ratios 3 and 1/3. The feature maps generated by the six convolutional layers conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 have sizes 38 × 38, 19 × 19, 10 × 10, 5 × 5, 3 × 3 and 1 × 1 respectively. Each n × n feature map has n × n center points, each center point generating t prior boxes, with t = 4, 6, 6, 6, 4, 4 for the six layers. The six feature maps therefore generate 38 × 38 × 4 + 19 × 19 × 6 + 10 × 10 × 6 + 5 × 5 × 6 + 3 × 3 × 4 + 1 × 1 × 4 = 8732 prior boxes in total.
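A one-line check of the prior-box count quoted above, where each n × n feature map contributes n·n·t boxes (the function name is illustrative):

```python
def total_priors(sizes=(38, 19, 10, 5, 3, 1),
                 boxes_per_point=(4, 6, 6, 6, 4, 4)):
    """Total prior boxes across the six detection feature maps: each
    n x n map with t boxes per center point contributes n*n*t boxes."""
    return sum(n * n * t for n, t in zip(sizes, boxes_per_point))
```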
Step 6: Pass all prior boxes in the k-th layer (k = 1, ..., 6) feature map from step 5 through a classifier to obtain each prior box's predicted position value and class confidence for each defect category. Specifically, all prior boxes in the k-th layer feature map are convolved with two 3 × 3 convolution kernels in the classifier; each prior box obtains its predicted position value and a class confidence for each defect category, the predicted position value comprising a center coordinate x offset, a center coordinate y offset, a height h offset and a width w offset.
Step 7: Keep only the maximum class confidence for each prior box, and suppress the prior boxes whose class confidence is less than 0.5, as well as those belonging to the background category.
Step 8: Divide the prior boxes remaining from step 7 into m (m = 1, 2, 3, ...) prior box groups, where each prior box group consists of mutually intersecting prior boxes, each holding a confidence for the same defect category.
Step 9: For the m-th (m = 1, 2, 3, ...) prior box group, select through the non-maximum suppression algorithm the predicted position value of the prior box with the maximum confidence for the same defect category as the final defect position value, obtaining the final defect localization and defect classification result.
Fig. 7 and fig. 8 show the effect of the method of the present invention on detecting strip steel surface defects; it can be seen that the detection results include both defect localization and defect classification. The detection conditions of 180 defect images (60 each of patches, pocks and inclusions) are shown in the following table:

Defect type    Number of images    Detection accuracy
patches        60                  100%
pocks          60                  100%
inclusions     60                  96.7%
overall        180                 98.9%
in this example, the computer configuration and environment used for the experiment are shown in the following table:
Processor         Graphics card    Memory      Hard disk    System                 Framework
Intel i7-3770S    GTX 1080Ti       16G DDR3    512G SSD     Ubuntu 18.04, 64-bit   Caffe
The test results show that the method provided by the invention has a good detection effect on the three defect types patches, pocks and inclusions, and can accurately locate the positions of defects in the image, with detection accuracy rates of 100%, 100% and 96.7% respectively, an overall detection accuracy of 98.9%, and a detection speed of 33 FPS. Compared with existing strip steel surface defect detection methods, the method has good robustness, strong generalization capability, and high detection speed and precision, and new defect categories can be added on the existing basis for training and detection.

Claims (3)

1. A strip steel surface defect detection method based on multi-scale feature extraction is characterized by comprising the following steps:
step 1: acquiring a data set of a strip steel surface defect image;
step 2: preprocessing the acquired data set, including an offline enhanced data set and making an LMDB format data set;
step 3: training the strip steel surface defect detection network offline, inputting the LMDB format data set made in step 2 into the detection network for training until the model converges;
step 4: inputting the strip steel surface defect image to be detected, with a resolution of 300 × 300, into the trained detection network, and obtaining in turn the feature maps output by the conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 layers;
step 5: setting prior boxes for the feature maps output by the detection network layers conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 in step 4, wherein the prior box setting comprises two aspects of scale and aspect ratio, the feature maps output by conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 being taken respectively as the k-th layer feature map with k ranging from 1 to 6, and the prior box scale being set as shown in the following formula:

s_k = s_{min} + \frac{s_{max} - s_{min}}{5}(k - 1)

where k \in \{1, \dots, 6\}, s_{min} = 0.25, s_{max} = 0.9;
Step 6: obtaining a predicted position value of each prior frame and a category confidence degree related to each defect category by all prior frames in the k-th layer feature map in the step 5 through a classifier, specifically, performing convolution on all prior frames in the k-th layer feature map and two 3 × 3 convolution kernels in the classifier, wherein each prior frame obtains a predicted position value of the prior frame and a category confidence degree related to each defect category, and the predicted position value comprises a central coordinate x offset, a central coordinate y offset, a height h offset and a width w offset;
and 7: only keeping the maximum class confidence coefficient for each prior frame, and inhibiting the prior frames with the class confidence coefficient smaller than 0.5 and belonging to the background class in all the prior frames;
step 8: dividing the prior boxes remaining after step 7 into m prior-box groups, each group consisting of mutually intersecting prior boxes that all carry a confidence for the same defect category;
step 9: for each of the m prior-box groups, selecting, through a non-maximum suppression algorithm, the predicted position value of the prior box with the maximum confidence for that defect category as the final defect position value, thereby obtaining the final defect localization and defect classification results.
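The prior-box scale schedule of step 5 can be sketched in Python; the function name is hypothetical, and the linear spacing follows the scale formula with k from 1 to 6:

```python
def prior_box_scales(num_layers=6, s_min=0.25, s_max=0.9):
    """Prior-box scales s_k = s_min + (s_max - s_min)/(num_layers - 1) * (k - 1)."""
    step = (s_max - s_min) / (num_layers - 1)
    return [round(s_min + step * (k - 1), 4) for k in range(1, num_layers + 1)]

scales = prior_box_scales()
print(scales)  # [0.25, 0.38, 0.51, 0.64, 0.77, 0.9]
```

The first layer thus receives the smallest prior boxes (scale 0.25) for small defects, and the last layer the largest (scale 0.9).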
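Steps 7–9 amount to confidence thresholding followed by per-category non-maximum suppression. A minimal sketch, under the assumptions that boxes are given as (x1, y1, x2, y2) corner coordinates and that "mutually intersecting" is judged by an IoU threshold of 0.5 (the patent does not fix these details):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, conf_thresh=0.5, iou_thresh=0.5):
    """For one defect category: drop low-confidence boxes, then keep only the
    highest-scoring box within each group of mutually intersecting boxes."""
    order = sorted((i for i, s in enumerate(scores) if s >= conf_thresh),
                   key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in kept):
            kept.append(i)
    return kept
```

For example, two heavily overlapping detections of the same defect collapse to the higher-scoring one, while a detection elsewhere on the strip survives: `nms([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], [0.9, 0.8, 0.7])` returns `[0, 2]`.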
2. The method for detecting strip steel surface defects based on multi-scale feature extraction as claimed in claim 1, wherein step 2 comprises the following steps:
the first step: offline enhancement of the data set. Data enhancement is particularly important for improving detection accuracy at various scales and comprises two modes: pixel-content transformation, which includes randomly changing the image brightness and randomly changing the contrast, hue, saturation and illumination noise; and spatial geometric transformation, which includes image expansion, random cropping and random mirroring. The enhanced data set is 6 times the original size, and the image size is 300 × 300;
the second step: production of the LMDB-format data set. First, the defect positions in the acquired strip steel surface defect images are manually annotated with the labelImg software, generating corresponding xml files. A python script then randomly divides the defect images into a training data set and a test data set at a ratio of 8:2. A data-set index generation script next produces a text file storing the image and label file name information and a text file with the image size information. Finally, the labelmap file storing the category information is modified to contain the N defect categories plus the background category, N + 1 categories in total, and the LMDB-format data set is generated by script.
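The two enhancement families described in the first step above can be sketched with numpy; the parameter ranges and the choice of brightness, contrast and mirroring as representatives are illustrative assumptions, not the patent's exact augmentation policy:

```python
import random
import numpy as np

def augment(img):
    """Apply one pixel-content and one spatial transform to an HxWx3 uint8 image."""
    out = img.astype(np.float32)
    # pixel-content transformation: random brightness shift and contrast scale
    out += random.uniform(-32, 32)
    out *= random.uniform(0.8, 1.2)
    out = np.clip(out, 0, 255).astype(np.uint8)
    # spatial geometric transformation: random horizontal mirror
    if random.random() < 0.5:
        out = out[:, ::-1, :]
    return out
```

Applying several independently randomized variants of such a function to each image yields the six-fold enlarged data set mentioned above.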
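The random 8:2 division into training and test sets described in the second step can be sketched as follows (the function name, id format and fixed seed are hypothetical):

```python
import random

def split_dataset(image_ids, train_ratio=0.8, seed=42):
    """Randomly split image ids into train/test lists at the given ratio."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]

train, test = split_dataset([f"defect_{i:04d}" for i in range(100)])
```

With 100 defect images this gives 80 training and 20 test ids, with no image in both sets.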
3. The method for detecting strip steel surface defects based on multi-scale feature extraction as claimed in claim 1, wherein step 3 specifically comprises the following steps:
the first step: constructing the strip steel surface defect detection network. The strip steel surface defect detection network constructed by the invention comprises the following layers: conv1_1, conv1_2, pool1, conv2_1, conv2_2, pool2, conv3_1, conv3_2, conv3_3, pool3, conv4_1, conv4_2, conv4_3, pool4, conv5_1, conv5_2, conv5_3, pool5, conv6, conv7, conv8_1, conv8_2, conv9_1, conv9_2, conv10_1, conv10_2, conv11_1, conv11_2;
the second step: inputting the prepared training set into the network and propagating it in turn through the convolutional layers conv4_3, conv7, conv8_2, conv9_2, conv10_2 and conv11_2 and their respective classifiers to obtain, for each prior box, the category confidence for every defect category and the predicted position value. Only the maximum category confidence is retained for each prior box, and prior boxes whose category confidence is below the threshold or which belong to the background category are suppressed; non-maximum suppression then yields the final prediction box containing the defect position value and category confidence information, which is compared with the real target box containing the manually annotated defect category and position information to compute the loss function value. The output error is propagated backwards by stochastic gradient descent and the error back-propagation algorithm, distributing the error to all units of every layer so as to obtain each unit's error signal and adjust the weights accordingly. The model has converged when the loss function value converges. The loss function is as follows:
$$ L(x, c, l, g) = \frac{1}{N}\bigl(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\bigr) $$

where x is the input image, c is the category confidence, l is the final prediction box, g is the real target box, N is the number of matched prior boxes and α is a weighting constant. The loss function comprises two parts, the position loss of the final prediction box and the category confidence loss. The position loss function is as follows:
$$ L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^{k}\, \mathrm{smooth}_{L1}\bigl(l_{i}^{m} - \hat{g}_{j}^{m}\bigr) $$

The position loss adopts the smooth $L_1$ loss:

$$ \mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^{2}, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases} $$

where $x_{ij}^{k} \in \{0, 1\}$ indicates whether the i-th final prediction box and the j-th real target box are matched with respect to category k (1 for a match, 0 for a mismatch), $l_{i}^{m}$ is coordinate m of the i-th final prediction box, and $\hat{g}_{j}^{m}$ is coordinate m of the j-th real target box;
the classification loss function is as follows:

$$ L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\bigl(\hat{c}_{i}^{p}\bigr) - \sum_{i \in Neg} \log\bigl(\hat{c}_{i}^{0}\bigr) $$

where $\hat{c}_{i}^{p}$, the predicted probability that the i-th final prediction box corresponds to category p, is given by the softmax function:

$$ \hat{c}_{i}^{p} = \frac{\exp\bigl(c_{i}^{p}\bigr)}{\sum_{p} \exp\bigl(c_{i}^{p}\bigr)} $$

that is, the softmax loss is adopted.
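The two terms of the training loss described in claim 3 can be sketched numerically; this is an illustrative restatement (a smooth L1 position term and a softmax confidence term, evaluated for one prior box), not the patent's training code:

```python
import math

def smooth_l1(x):
    """smooth_L1(x) = 0.5 x^2 if |x| < 1, else |x| - 0.5."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(c - m) for c in logits]
    total = sum(exps)
    return [e / total for e in exps]

def conf_loss(logits, target):
    """Softmax (cross-entropy) loss for one prior box: -log(c_hat_target)."""
    return -math.log(softmax(logits)[target])
```

For a position residual of 0.5 the smooth L1 term is 0.125 (quadratic regime); for a residual of 2.0 it is 1.5 (linear regime), which is what makes the loss robust to large localization errors.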
CN202010063720.8A 2020-01-20 2020-01-20 Strip steel surface defect detection method based on multi-scale feature extraction Pending CN111275684A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010063720.8A CN111275684A (en) 2020-01-20 2020-01-20 Strip steel surface defect detection method based on multi-scale feature extraction


Publications (1)

Publication Number Publication Date
CN111275684A 2020-06-12

Family

ID=71001799


Country Status (1)

Country Link
CN (1) CN111275684A (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109544522A * 2018-11-12 2019-03-29 北京科技大学 A steel plate surface defect detection method and system
CN110660040A (en) * 2019-07-24 2020-01-07 浙江工业大学 Industrial product irregular defect detection method based on deep learning
CN110633739A (en) * 2019-08-30 2019-12-31 太原科技大学 Polarizer defect image real-time classification method based on parallel module deep learning
CN110660046A (en) * 2019-08-30 2020-01-07 太原科技大学 Industrial product defect image classification method based on lightweight deep neural network
CN110660052A (en) * 2019-09-23 2020-01-07 武汉科技大学 Hot-rolled strip steel surface defect detection method based on deep learning

Non-Patent Citations (2)

Title
Wei Liu et al.: "SSD: Single Shot MultiBox Detector", arXiv *
Zhang Zhenyu et al.: "Exploration of identifying hot-rolled steel coil edge defects based on deep learning models", Metallurgical Automation *

Cited By (6)

Publication number Priority date Publication date Assignee Title
WO2022160167A1 (en) * 2021-01-28 2022-08-04 东莞职业技术学院 Strip steel surface defect detection method based on convolutional neural network model training
CN115151942A (en) * 2021-01-28 2022-10-04 东莞职业技术学院 Strip steel surface defect detection method based on convolutional neural network model training
CN115151942B (en) * 2021-01-28 2024-02-13 东莞职业技术学院 Strip steel surface defect detection method based on convolutional neural network model training
CN113192021A (en) * 2021-04-26 2021-07-30 深圳中科飞测科技股份有限公司 Detection method and device, detection equipment and storage medium
CN113838208A (en) * 2021-09-02 2021-12-24 桂林电子科技大学 Strip steel defect detection method based on improved residual shrinkage network
CN113838208B (en) * 2021-09-02 2024-02-02 桂林电子科技大学 Strip steel defect detection method based on improved residual shrinkage network

Similar Documents

Publication Publication Date Title
CN110097543B (en) Hot-rolled strip steel surface defect detection method based on generation type countermeasure network
CN108765412B (en) Strip steel surface defect classification method
CN109509187B (en) Efficient inspection algorithm for small defects in large-resolution cloth images
CN110349126A A steel plate surface defect detection method based on convolutional neural networks with labels
CN107230203B (en) Casting defect identification method based on human eye visual attention mechanism
CN107423760A A deep learning object detection method based on pre-segmentation and regression
CN111275684A (en) Strip steel surface defect detection method based on multi-scale feature extraction
CN105354568A (en) Convolutional neural network based vehicle logo identification method
CN112329588B (en) Pipeline fault detection method based on Faster R-CNN
CN110310259A A wood knot defect detection method based on an improved YOLOv3 algorithm
CN108846415A Target recognition device and method for an industrial sorting robot
CN112686894B (en) FPCB (flexible printed circuit board) defect detection method and device based on generative countermeasure network
CN109461141A A workpiece material-starvation defect detection method
CN111860171A (en) Method and system for detecting irregular-shaped target in large-scale remote sensing image
CN111833237A (en) Image registration method based on convolutional neural network and local homography transformation
CN108876765A Target positioning device and method for an industrial sorting robot
CN111507976A (en) Defect detection method and system based on multi-angle imaging
CN113392930B (en) Traffic sign target detection method based on multi-level division control network
Li et al. A weak supervision machine vision detection method based on artificial defect simulation
Wang et al. Attention-based deep learning for chip-surface-defect detection
Ning et al. Research on surface defect detection algorithm of strip steel based on improved YOLOV3
CN112819748A (en) Training method and device for strip steel surface defect recognition model
CN112541884A (en) Defect detection method and apparatus, and computer-readable storage medium
CN115731237A (en) Hot rolled steel strip surface defect detection method based on semi-supervised transfer learning
CN111754502A (en) Method for detecting surface defects of magnetic core based on fast-RCNN algorithm of multi-scale feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200612