CN112700444A

CN112700444A - Bridge bolt detection method based on self-attention and central point regression model

Info

Publication number: CN112700444A
Application number: CN202110188973.2A
Authority: CN
Inventors: 鞠晓臣; 赵欣欣; 肖鑫; 郭辉; 左照坤; 陈令康; 王丽; 刘晓光
Original assignee: China Academy of Railway Sciences Corp Ltd CARS; Railway Engineering Research Institute of CARS; China State Railway Group Co Ltd
Current assignee: China Academy of Railway Sciences Corp Ltd CARS; Railway Engineering Research Institute of CARS; China State Railway Group Co Ltd
Priority date: 2021-02-19
Filing date: 2021-02-19
Publication date: 2021-04-23
Anticipated expiration: 2041-02-19
Also published as: CN112700444B

Abstract

The invention discloses a bridge bolt detection method based on a self-attention and central point regression model, comprising the following steps: acquiring a bridge bolt image to be detected; detecting the bridge bolt image with a pre-trained self-attention and central point regression model; and determining, from the detection result of the model, whether the bolts in the bridge bolt image have a disease. The self-attention and central point regression model comprises a semantic segmentation network and a detection network. The two networks share the middle-layer features extracted by a convolutional neural network. The semantic segmentation network segments the rectangular regions where the bolts are located, and the features used for semantic segmentation are connected with the features of the detection module so that the two together locate the bolts. Applied to bolt disease detection and identification, the method can overcome the shortcomings of traditional bridge disease image detection and identification technology and can address the problems of efficiency, cost and safety in bolt disease detection and identification.

Description

Bridge bolt detection method based on self-attention and central point regression model

Technical Field

The invention belongs to the technical field of computer vision, relates to the application of vision technology to bridge disease detection, and in particular relates to a bridge bolt detection method based on a self-attention and central point regression model.

Background

High-strength bolts are one of the main connection methods for large steel structures such as bridges. The steel used for high-strength bridge bolts in China has developed from 40B to 20MnTiB and 35VB, and more than forty years of engineering practice show that high-strength bolts made of these two materials meet service requirements. In recent years, however, the probability of delayed fracture of high-strength bolts has increased due to various factors. In operation, a railway bridge is subjected over long periods to design service loads, train speed increases and overloaded trains; it is exposed to natural disasters such as earthquakes, floods and debris flows and to harmful agents such as wind, rain, ice and harmful ions; and it bears accidental impacts from trains or ships. As a result, piers and foundations are damaged to different degrees and various types of diseases arise, among which bolt diseases occupy a main position. Therefore, timely detection of bridge bolt diseases is an important guarantee for avoiding bridge accidents and protecting personal safety and national property.

Existing methods mostly rely on manual inspection supplemented by auxiliary tools: bridge maintenance personnel inspect the bridge regularly by visual inspection, assisted by equipment such as ultrasonic detectors and cable force meters. Take the ultrasonic detector as an example: the surface and internal quality of the bolt are checked by ultrasonic flaw detection. Compared with X-ray flaw detection, ultrasonic flaw detection offers higher sensitivity, a shorter cycle, lower cost, flexibility, convenience, high efficiency and no harm to the human body. However, the ultrasonic detector requires a smooth bolt surface, and only experienced inspectors can identify the defect type, so the flaw detection result is not intuitive.

In the prior art, bridge bolt diseases are determined mainly by the professional knowledge and subjective judgment of the inspectors. This is time-consuming, labor-intensive and inefficient, the detection accuracy is affected by the inspectors' subjective factors, and the work often exposes bridge maintenance staff to safety hazards. The requirements of being lightweight, automated and real-time cannot be met. The difficulties are as follows:

(1) the bolt target is small and the density is high;

(2) when the precision of the auxiliary tool is reduced or the auxiliary tool is damaged, the auxiliary tool is inconvenient to replace and has higher cost;

(3) automation cannot be realized and a large amount of manual participation is required.

With the deepening application of deep convolutional neural networks in computer vision, the YOLO algorithm and region-proposal-based algorithms such as Fast R-CNN achieve good detection results in industrial and practical application scenes, but they face the following challenges:

(1) both kinds of methods enumerate all potential regions and then classify each region separately, which is wasteful and inefficient and requires additional post-processing (e.g., non-maximum suppression). This post-processing is difficult to differentiate and train, so most current detectors cannot truly be trained end to end.

(2) Algorithms such as YOLO and Fast R-CNN treat all extracted candidate regions as contributing equally to the final detection, whereas in real scenes the surroundings of the object to be detected often carry complex and rich semantic information, and each region contributes differently to the final detection.

(3) When constructing a keypoint heat map (heatmap), if the center-point pixel is directly set to 1 and the remaining pixels are set to 0, the error is difficult to compute because the optimization target is discontinuous, and the whole model is difficult to optimize by error back-propagation. Some methods use a Gaussian kernel to spread the center point over the entire heat map, which helps to some extent, but because the computed coordinate range is the entire heat map, the foreground of the generated heat map cannot effectively reflect the shape of the bolt.

Therefore, practitioners of the same industry need to solve the problem of how to provide a bridge bolt detection method with high recognition rate and low cost.

Disclosure of Invention

The main purpose of the invention is to provide a bridge bolt detection method based on a self-attention and central point regression model that at least partially solves the above technical problems. The method combines deep neural network technology with traditional image detection and identification technology and is applied to bolt disease detection and identification; it can overcome the shortcomings of traditional bridge disease image detection and identification technology and can address the problems of efficiency, cost and safety in bolt disease detection and identification.

In order to achieve the purpose, the invention adopts the technical scheme that:

the embodiment of the invention provides a bridge bolt detection method based on a self-attention and central point regression model, which comprises the following steps:

acquiring a bridge bolt image to be detected;

detecting the bridge bolt image through a self-attention and central point regression model obtained through pre-training; the self-attention and central point regression model is obtained by training a plurality of groups of training data of a bolt detection scene data set; each group of data of the multiple groups of training data comprises multiple types of bridge bolt images and bolt marking frames in the bridge bolt images; the self-attention and center point based regression model comprises: a convolutional neural network, and a self-attention semantic segmentation network and a detection network based on object center point regression respectively connected with the convolutional neural network;

and determining whether the bolts in the bridge bolt image have a disease according to the detection result of the self-attention and central point regression model.

Further, the construction process of the bolt detection scene data set comprises the following steps:

selecting shot bolt scene images of a plurality of bridges; the scene images comprise a plurality of types of images, and each image is provided with a bolt marking frame;

scaling, cropping and flipping the scene images for data enhancement to obtain an enlarged sample set;

performing mask image processing on the images of the sample set, and taking all pixels in the bolt marking frames as the foreground of the mask image;

and constructing object-size-sensitive bolt center point heat maps from the sample set data after mask image processing.

Further, performing object size sensitive bolt center point heat map construction, comprising:

let $I \in \mathbb{R}^{W \times H \times 3}$ denote the three-channel color image to be detected, with width $W$ and height $H$, where $\mathbb{R}$ denotes the real numbers; according to the bolt marking frames, generating a central point heat map of the same size as the original image $I$; this central point heat map has only 0 and 1 pixel values;

since object detection based on convolutional networks reduces the size of the generated feature map through down-sampling, the purpose of the pre-processing is to generate a central point heat map of size $\frac{W}{a} \times \frac{H}{a}$, where $a$ is the scaling factor caused by the down-sampling of the convolutional network;

the true heat map of the original image $I$ is $Y \in [0,1]^{\frac{W}{a} \times \frac{H}{a}}$, and the pixel value of each coordinate in $Y$ is calculated according to a preset formula;

when the marking frame regions of several bolts in the same image partially overlap, the value of each overlapping pixel in the generated central point heat map $Y$ is the largest Gaussian spread result.

Further, the detection process based on the self-attention and central point regression model includes:

extracting middle-layer features of the bridge bolt image to be detected through a convolutional neural network;

inputting the extracted middle-layer features into a self-attention semantic segmentation network, and outputting a semantic segmentation feature map of the bridge bolt image;

inputting the extracted middle-layer features into a detection network based on object center point regression, and outputting a detection feature map of the bridge bolt image;

and connecting the semantic segmentation feature map with the detection feature map, and positioning the bolt through the detection network based on object center point regression.

Further, inputting the extracted middle-layer features into a self-attention semantic segmentation network, and outputting a semantic segmentation feature map of the bridge bolt image, wherein the semantic segmentation feature map comprises:

assuming that the extracted middle-layer features are the input and that the output is the annotated mask image $M$ corresponding to the original image $I$;

using a four-layer dilated convolutional network to complete the mapping from the middle-layer features to the annotated mask image $M$, thereby obtaining the semantic segmentation features used to predict $M$;

using a convolution kernel to predict the annotated mask image $M$ from the semantic segmentation features;

assuming that the semantic segmentation result image generated by the self-attention semantic segmentation network for the original image $I$ is $\hat{M}$;

then the optimization objective of the self-attention semantic segmentation network is to make the predicted segmentation result image $\hat{M}$ approximate the annotated mask image $M$, for which an error loss function is defined;

and outputting a semantic segmentation feature map according to the defined error loss function.

Further, an error loss function is defined as follows:

（2）

In formula (2), $M_{xy}$ indicates whether the pixel of the annotated mask image $M$ with abscissa $x$ and ordinate $y$ lies in a marked bolt rectangular region: it is 1 if so and 0 otherwise; $\hat{M}_{xy}$ is the value of the predicted segmentation result image $\hat{M}$ at abscissa $x$ and ordinate $y$, which is 1 if the pixel is predicted to lie in a bolt region and 0 otherwise.

Further, connecting the semantic segmentation feature map with the detection feature map, and positioning the bolt through the detection network based on object center point regression, including:

connecting the semantic segmentation feature map with the detection feature map to obtain the complete features;

making the bolt size predicted from the complete features as close as possible to the actual bolt size, using the L1 loss function as the optimization objective function for bolt size prediction:

$$L_{size} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{s}_k - s_k \right| \qquad (3)$$

In formula (3), $k$ is the bolt serial number; $N$ is the total number of bolts; $\hat{s}_k$ is the predicted bolt size; $s_k$ is the bolt size actually used as network supervision information;

the detection network based on object center point regression automatically learns the offset between the center point of the bolt marking frame on the down-sampled feature map and its real position, which is formulated as optimizing the following:

$$L_{off} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{o}_k - \left( \frac{p_k}{a} - \left\lfloor \frac{p_k}{a} \right\rfloor \right) \right| \qquad (4)$$

In formula (4), $a$ is the scaling factor caused by the down-sampling of the convolutional network, $p_k$ is the center point of the $k$-th bolt marking frame in the original image, and $\hat{o}_k$ is the local displacement predicted for each object center point;

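predicting whether each coordinate position of the complete features is the center of a bolt; with the central point heat map predicted by the detection network denoted $\hat{Y}$ and the true heat map denoted $Y$, a logistic regression loss function in the form of Focal loss is taken as the optimization target:

$$L_{center} = -\frac{1}{N} \sum_{x,y} \begin{cases} (1-\hat{Y}_{xy})^{\alpha} \log \hat{Y}_{xy}, & Y_{xy}=1 \\ (1-Y_{xy})^{\beta}\, \hat{Y}_{xy}^{\alpha} \log (1-\hat{Y}_{xy}), & \text{otherwise} \end{cases} \qquad (5)$$

In formula (5), $\alpha$ and $\beta$ are the hyper-parameters of the focal loss, and $N$ is the number of bolts in the current image;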

for the complete features, the detection network based on object center point regression predicts 5 values at each position to realize bolt positioning; the 5 values are the probability of belonging to a center point, the offsets of the center point in the two dimensions, and the width and height of the object.
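As an illustration of how the 5 values at one position can be turned back into a bolt frame, the following is a minimal sketch and not the patented implementation. It assumes that the width and height are predicted in feature-map units (because the supervision is reduced by the down-sampling factor), and the names `decode_bolt` and `detect`, the score threshold of 0.5 and the factor `scale=4` are hypothetical choices made only for this example.

```python
def decode_bolt(x, y, offset, size, scale=4):
    """Convert the prediction at feature-map cell (x, y) into a frame
    (x1, y1, x2, y2) in original-image pixels.

    offset: predicted (dx, dy) local displacement of the center point.
    size:   predicted (w, h), assumed to be in feature-map units.
    scale:  down-sampling factor of the network (hypothetical value).
    """
    dx, dy = offset
    w, h = size
    cx = (x + dx) * scale          # center mapped back to the original image
    cy = (y + dy) * scale
    half_w, half_h = w * scale / 2, h * scale / 2
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h


def detect(predictions, threshold=0.5, scale=4):
    """predictions maps (x, y) -> (score, dx, dy, w, h).
    Keep the cells whose center-point probability reaches the threshold
    (the threshold is an assumption of this sketch)."""
    return [decode_bolt(x, y, (dx, dy), (w, h), scale)
            for (x, y), (score, dx, dy, w, h) in predictions.items()
            if score >= threshold]


boxes = detect({
    (6, 4): (0.9, 0.0, 0.5, 8.0, 4.0),   # confident center point
    (1, 1): (0.1, 0.0, 0.0, 2.0, 2.0),   # background cell, discarded
})
```

Because each bolt is read directly from one position, this decoding needs no enumeration of candidate regions.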

Further, the overall optimization objective function of the self-attention and center point regression model is as follows:

（6）

where the hyperparameters in formula (6) control the contribution of the different loss functions to the overall model.

Compared with the prior art, the invention has the following beneficial effects:

the invention provides a bridge bolt detection method based on a self-attention and central point regression model, comprising the following steps: acquiring a bridge bolt image to be detected; detecting the bridge bolt image with a pre-trained self-attention and central point regression model; and determining, from the detection result of the model, whether the bolts in the bridge bolt image have a disease. The self-attention and central point regression model comprises a semantic segmentation network and a detection network. The two networks share the middle-layer features extracted by the convolutional neural network. The semantic segmentation network segments the rectangular regions where the bolts are located, and the features used for semantic segmentation are connected with the features of the detection module so that the two together locate the bolts. The two branch tasks promote and enhance each other: the semantic segmentation network provides prior guidance on object position for the detection network, and the detection network can correct the semantic segmentation result.

In addition, the detection network adopts a detection method based on object center point regression, which avoids much of the detection time consumed by the traditional anchor box method without affecting detection precision.

Drawings

Fig. 1 is a flowchart of a bridge bolt detection method based on a self-attention and center point regression model according to an embodiment of the present invention;

FIG. 2 is a block diagram of an overall algorithm based on a self-attention and center point regression model according to an embodiment of the present invention;

fig. 3 is a flowchart of a bolt detection scene data set construction provided in the embodiment of the present invention;

FIG. 4 is an exemplary illustration of a constructed bolt detection dataset image sample;

FIG. 5 is a schematic diagram of a data preprocessing process according to an embodiment of the present invention;

FIG. 6 is a diagram of a detection process based on a self-attention and center point regression model according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of bolt position prediction according to an embodiment of the present invention;

FIG. 8 is a diagram illustrating the detection results of different models.

Detailed Description

In order to make the technical means, the creation characteristics, the achievement purposes and the effects of the invention easy to understand, the invention is further described with the specific embodiments.

In the description of the present invention, it should be noted that the terms "upper", "lower", "inner", "outer", "front", "rear", "both ends", "one end", "the other end", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.

In the description of the present invention, it is to be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "disposed," "connected," and the like are to be construed broadly, such as "connected," which may be fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.

The invention provides a bridge bolt detection method based on a self-attention and central point regression model, as shown in figure 1, wherein the method comprises the following steps:

s10, acquiring a bridge bolt image to be detected;

s20, detecting the bridge bolt image through a self-attention and central point regression model obtained through pre-training; the self-attention and central point regression model is obtained by training a plurality of groups of training data of a bolt detection scene data set; each group of data of the multiple groups of training data comprises multiple types of bridge bolt images and bolt marking frames in the bridge bolt images; the self-attention and center point based regression model comprises: a convolutional neural network, and a self-attention semantic segmentation network and a detection network based on object center point regression respectively connected with the convolutional neural network;

and S30, determining whether the bolts in the bridge bolt image have a disease according to the detection result of the self-attention and central point regression model.

In the embodiment, the deep neural network technology is combined with the traditional image detection and identification technology, the method is applied to the field of bolt disease detection and identification, the defects of the traditional bridge disease image detection and identification technology can be overcome, the problems of efficiency, cost, safety and the like in bolt disease detection and identification can be well solved, and the development of the field of bridge safety monitoring in China is promoted.

The overall algorithm framework is shown in fig. 2, and for the convenience of subsequent expression, the self-attention semantic segmentation network is called a semantic segmentation module, and the detection network based on object center point regression is called a detection module.

The invention provides a high-performance bolt detection algorithm based on a self-attention mechanism and central point regression model (SACPR), with the following features:

(1) unlike methods based on region extraction, the method directly predicts the target as a point and proposes a new bolt center point heat map construction method, so that the target size of the bolt region is accurately predicted; the operation requires no non-maximum suppression post-processing and is simpler, faster and more accurate than detection algorithms based on region extraction;

(2) in order to detect the bolt region accurately, the invention adopts a self-attention semantic segmentation algorithm which, based on the target region highlighted in the semantic segmentation result, focuses attention on the target bolt region and suppresses the background region, thereby improving detection performance.

In order to realize the high-performance bridge bolt area detection, the invention constructs a high-quality bridge bolt missing scene data set based on a real scene. Then, a Keras deep learning framework is adopted to realize the SACPR algorithm, and a bolt detection experiment is carried out. In order to solve the problems in the prior art, the technical scheme provided by the invention is realized by adopting the following technical means and measures:

the construction of the bolt detection scene data set in step S20 described above, as shown in fig. 3, includes:

s21, selecting shooting bolt scene images of a plurality of bridges; the scene images comprise a plurality of types of images, and each image is provided with a bolt marking frame;

s22, scaling, cropping and flipping the scene images for data enhancement to obtain an enlarged sample set;

s23, performing mask image processing on the images of the sample set, and taking all pixels in the bolt marking frames as the foreground of the mask image;

and S24, constructing object-size-sensitive bolt center point heat maps from the sample set data after mask image processing.

When the data set is constructed in step S21, several bridges are selected as shooting targets; for example, the bolt scenes may be shot with a mobile phone or another conventional mobile device. To ensure the diversity of the data, multiple images of each target area are shot under different angles, focal lengths, illumination and other conditions. Valid images are then screened manually and the bolts in each image are marked with frames; marked images are shown in fig. 4. The bolt detection data set preliminarily constructed through these steps contains, for example, 4205 images in total.

Step S22 data enhancement:

to further increase the diversity of the training data, data enhancement is used to increase the samples. The enhancement method comprises the following steps:

1) scaling: first scale the short edge to 224 (the input image size for classification and detection tasks is often 224 x 224) and scale the long edge by the same ratio;

2) cropping: then randomly crop a 224 x 224 region from the scaled picture;

3) flipping: then apply random horizontal flipping, random color change and random affine transformation to the cropped image to increase the diversity of the training set pictures.
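The geometry of the scaling, cropping and flipping steps can be sketched as follows. This is only an illustration with hypothetical helper names (`scale_short_edge`, `random_crop_box`, `flip_box_horizontally`); it computes sizes and coordinates rather than touching pixel data, and it assumes that bolt marking frames are stored as (x1, y1, x2, y2).

```python
import random


def scale_short_edge(width, height, target=224):
    """New (width, height) after scaling the short edge to `target`;
    the long edge is scaled by the same ratio."""
    factor = target / min(width, height)
    return round(width * factor), round(height * factor)


def random_crop_box(width, height, crop=224, rng=random):
    """Random crop window (left, top, right, bottom) of size crop x crop."""
    left = rng.randint(0, width - crop)
    top = rng.randint(0, height - crop)
    return left, top, left + crop, top + crop


def flip_box_horizontally(box, image_width):
    """Mirror a bolt marking frame when the image is flipped left-right,
    so that the annotation stays aligned with the flipped picture."""
    x1, y1, x2, y2 = box
    return image_width - x2, y1, image_width - x1, y2


new_w, new_h = scale_short_edge(4000, 3000)            # a 4000 x 3000 photo
crop = random_crop_box(new_w, new_h, rng=random.Random(0))
flipped = flip_box_horizontally((10, 20, 50, 60), image_width=224)
```

The marking frames have to be transformed together with the image, which is why the flip helper operates on frame coordinates.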

Step S23-S24 data preprocessing

(A) Masking image

As shown in fig. 2, the algorithm framework proposed by the invention includes a detection module based on object center point regression and a semantic segmentation module. Semantic segmentation requires a mask image of the object; all pixels inside the object rectangles marked during data set construction are directly used as the foreground of the mask image. As shown in (2) of fig. 5, the bolt regions of the generated mask image are represented by 1 and the other regions by 0.

FIG. 5 is a schematic diagram of the data preprocessing process, in which (1) is an original picture labeled during data set construction; (2) is the mask picture used for semantic segmentation; (3) is the heat map for center point regression produced by an existing method; and (4) is the heat map for center point regression produced by the invention.
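The mask construction described above can be sketched in a few lines. The function name `build_mask` and the convention that marking frames are (x1, y1, x2, y2) with inclusive pixel bounds are assumptions made for this example.

```python
def build_mask(width, height, boxes):
    """Binary mask as a list of rows (indexed [y][x]): 1 inside any bolt
    marking frame, 0 elsewhere. Frames are (x1, y1, x2, y2) with inclusive
    bounds (an assumption of this sketch) and are clipped to the image."""
    mask = [[0] * width for _ in range(height)]
    for x1, y1, x2, y2 in boxes:
        for y in range(max(0, y1), min(height, y2 + 1)):
            for x in range(max(0, x1), min(width, x2 + 1)):
                mask[y][x] = 1
    return mask


mask = build_mask(8, 6, [(1, 1, 3, 2), (5, 3, 6, 4)])
foreground = sum(map(sum, mask))   # number of foreground pixels
```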

(B) Object size sensitive bolt center point heat map construction method

In the detection module of the present invention, the object is represented by its center point, and the width and height of the object can be represented by the relative distance from the center point. Based on the representation of the central point, by using a relevant method in human body posture key point prediction literature for reference, the invention regards object detection as a key point prediction task, so a key point heat map (heatmap) needs to be constructed.

The invention provides a method for constructing a heat map of a bolt center point with sensitive object size, which is introduced in detail as follows:

Let $I \in \mathbb{R}^{W \times H \times 3}$ denote the three-channel color image to be detected, with width $W$ and height $H$, where $\mathbb{R}$ denotes the real numbers. Firstly, a central point heat map of the same size as the original image $I$, with pixel values of only 0 and 1, is generated according to the labeling information. Secondly, since object detection based on convolutional networks yields a smaller feature map due to down-sampling, the goal of the pre-processing is to generate a central point heat map of size $\frac{W}{a} \times \frac{H}{a}$, where $a$ is the scaling factor caused by the down-sampling of the convolutional network. Through these steps, a bolt whose center point in the original image $I$ is $p$ has coordinates $\tilde{p} = \lfloor p / a \rfloor$ in the generated central point heat map. Subsequently, the true heat map of the original image $I$ is $Y \in [0,1]^{\frac{W}{a} \times \frac{H}{a}}$, and the pixel value of each coordinate in $Y$ is calculated as follows:

$$Y_{xy} = \begin{cases} \exp\left( -\dfrac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2\sigma_p^2} \right), & x_0 \le x \le x_0 + w,\ y_0 \le y \le y_0 + h \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

In formula (1), $\sigma_p$ is the Gaussian kernel radius calculated from the dimensions of the bolt; $w$ and $h$ are respectively the width and height, in the generated central point heat map, of the bolt whose center point is located at $\tilde{p}$; $x_0$ and $y_0$ are respectively the coordinates, in the generated central point heat map, of the upper-left corner of the bolt rectangle marked during data set construction; $Y_{xy}$ is the pixel value of the true heat map $Y$ at abscissa $x$ and ordinate $y$. If the rectangular regions of several bolts partially overlap, the value of each overlapping pixel in the generated central point heat map is the maximum Gaussian spread result. The heat map for center point regression generated by the algorithm of the invention is shown in (4) of fig. 5.

Comparing fig. 5 (3) with fig. 5 (4), it can be seen that this method has the following two advantages:

(1) in the heat map for center point regression, the method of the invention gives 0 everywhere outside the marked bolt rectangles, whereas the existing method gives non-zero values outside them;

(2) the shape of the foreground in the heat map obtained by the method follows the shape of the bolt, whereas the existing method generates a round foreground for a bolt of any shape.
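The two properties above can be checked with a small sketch of a Gaussian heat map that is confined to each bolt rectangle and takes the maximum where rectangles overlap. The rule used here for the Gaussian radius (one sixth of the shorter side, at least 1) and the factor `scale=4` are hypothetical, since the text only states that the radius is calculated from the bolt dimensions.

```python
import math


def box_limited_heatmap(width, height, boxes, scale=4):
    """Heat map of size (height // scale) rows x (width // scale) columns.
    Each bolt frame (x1, y1, x2, y2), given in original-image pixels,
    contributes a Gaussian centered on the bolt, but only inside its frame."""
    hm_w, hm_h = width // scale, height // scale
    heat = [[0.0] * hm_w for _ in range(hm_h)]
    for x1, y1, x2, y2 in boxes:
        # frame and center point in heat-map coordinates
        left, top = x1 // scale, y1 // scale
        right, bottom = x2 // scale, y2 // scale
        cx, cy = (x1 + x2) // 2 // scale, (y1 + y2) // 2 // scale
        # hypothetical radius rule derived from the bolt size
        sigma = max(1.0, min(right - left, bottom - top) / 6.0)
        for y in range(max(0, top), min(hm_h, bottom + 1)):
            for x in range(max(0, left), min(hm_w, right + 1)):
                value = math.exp(-((x - cx) ** 2 + (y - cy) ** 2)
                                 / (2 * sigma ** 2))
                # overlapping frames keep the largest Gaussian value
                heat[y][x] = max(heat[y][x], value)
    return heat


# two partially overlapping bolt frames in a 64 x 64 image
heat = box_limited_heatmap(64, 64, [(8, 8, 40, 24), (32, 16, 56, 48)])
```

Every cell outside both frames stays exactly 0, and the non-zero foreground is rectangular, following the marked frame rather than a fixed circle.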

The training process of the self-attention and central point regression model by using the above data set is the prior art, and the details thereof are not described in this embodiment.

In one embodiment, referring to fig. 6, the detection process based on the self-attention and center point regression model includes:

s61, extracting middle-layer features of the bridge bolt image to be detected through a convolutional neural network;

s62, inputting the extracted middle-layer features into a self-attention semantic segmentation network, and outputting a semantic segmentation feature map of the bridge bolt image;

s63, inputting the extracted middle-layer features into a detection network based on object center point regression, and outputting a detection feature map of the bridge bolt image;

and S64, connecting the semantic segmentation feature map with the detection feature map, and positioning the bolt through the detection network based on object center point regression.

The following description will be made from two aspects of self-attention semantic segmentation and detection based on object center point regression.

1. Self-attention semantic segmentation

As shown in FIG. 2, the input to the semantic segmentation module is the middle-layer features extracted by the convolutional neural network, and the output is the annotated mask image $M$ corresponding to the original image $I$. In order to better capture the contextual semantic information of the image, the invention uses a four-layer dilated convolution network to complete the mapping from the middle-layer features to $M$. The numbers of output channels of the four dilated convolutional layers are (48, 64, 96, 128) respectively, so the semantic segmentation features used to predict $M$ have 128 channels; a convolution kernel is then used to predict $M$ from these features. Suppose the semantic segmentation result image generated by the semantic segmentation module for the image $I$ is $\hat{M}$; the optimization goal of the semantic segmentation network is then to make the predicted $\hat{M}$ and $M$ as equal as possible, and the error loss is defined as follows:

（2）

In the above formula, $M_{xy}$ indicates whether the pixel of the annotated mask image $M$ with abscissa $x$ and ordinate $y$ is in a marked bolt rectangular region: it is 1 if so and 0 otherwise.

Based on the target region highlighted in the semantic segmentation result, attention is focused on the target bolt region and the background region is suppressed, so that the detection performance can be improved.

2. Object center point regression-based detection

As shown in FIG. 2, the input to the bolt detection module is still the middle-layer features extracted by the convolutional neural network. Consistent with the semantic segmentation module, the same four-layer dilated convolution network is used to obtain the detection feature maps. It should be noted that these feature maps are not directly used for predicting the bolt position: the complete feature for detection is the connection of the feature map of the detection module with the feature map of the semantic segmentation module, and it is this complete feature that is used to predict the bolt position.

Let $(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)})$ denote the rectangular area of annotated bolt $k$, where $(x_1^{(k)}, y_1^{(k)})$ and $(x_2^{(k)}, y_2^{(k)})$ are the coordinates of its upper-left and lower-right corners, respectively. The center point of bolt $k$ is then $\left(\frac{x_1^{(k)} + x_2^{(k)}}{2}, \frac{y_1^{(k)} + y_2^{(k)}}{2}\right)$, and its width and height are $\left(x_2^{(k)} - x_1^{(k)},\ y_2^{(k)} - y_1^{(k)}\right)$. Because the feature map obtained by the detection module is reduced by a factor of $a$ relative to the original image, the width and height actually used as network supervision information are reduced by the same factor. To make the bolt size that the detection module predicts from the features as close as possible to this supervised size, embodiments of the present invention use the L1 loss function as the optimization objective for bolt size (width and height) prediction:

（3）
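A minimal sketch of the size supervision and an L1 size loss follows. The function names are hypothetical, and averaging the loss over the number of bolts is an assumption borrowed from common center-point detectors, since formula (3) itself is not reproduced here.

```python
def bolt_targets(box, a):
    """Return the bolt center in image pixels and its size reduced by the factor a."""
    x1, y1, x2, y2 = box
    center = ((x1 + x2) / 2, (y1 + y2) / 2)
    size = ((x2 - x1) / a, (y2 - y1) / a)  # width and height at feature-map scale
    return center, size


def l1_size_loss(pred_sizes, true_sizes):
    """Mean absolute error over (width, height) pairs; the mean is an assumption."""
    if not true_sizes:
        return 0.0
    total = sum(
        abs(pw - tw) + abs(ph - th)
        for (pw, ph), (tw, th) in zip(pred_sizes, true_sizes)
    )
    return total / len(true_sizes)


center, size = bolt_targets((8, 16, 24, 48), a=4)
print(center, size)
print(l1_size_loss([(4.5, 7.0)], [size]))
```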

Although optimizing the above formula can bring the predicted object center point close to the real center point of the object, the feature map is reduced by a factor of $a$, so an $a \times a$ pixel area of the original picture corresponds to a single pixel on the feature map. Therefore, if the center point obtained by optimizing formula (3) is taken directly as the real position, a certain offset error occurs. To solve this problem, the present embodiment lets the detection network learn the offset automatically, expressed as the following formula:

(4)

where $a$ is the scale factor due to the convolutional network downsampling, and the learned offset term is the local displacement of each object center point.
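The quantization error discussed above can be made concrete with a short sketch. It assumes the offset is the fractional part lost when a center coordinate is divided by $a$ and rounded down, which is the usual convention in center-point detectors; the exact form of formula (4) is not reproduced here, and the function names are hypothetical.

```python
import math


def center_offset(center, a):
    """Split a center (image pixels) into its feature-map cell and the sub-cell offset."""
    cx, cy = center
    fx, fy = cx / a, cy / a
    cell = (math.floor(fx), math.floor(fy))
    offset = (fx - cell[0], fy - cell[1])
    return cell, offset


def recover_center(cell, offset, a):
    """Map a feature-map cell plus its offset back to image coordinates."""
    return ((cell[0] + offset[0]) * a, (cell[1] + offset[1]) * a)


cell, offset = center_offset((18.0, 33.0), a=4)
print(cell, offset)
print(recover_center(cell, offset, a=4))
```

Without the offset, the recovered center would snap to the cell corner and could be off by nearly $a$ pixels in each dimension.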

Whether each coordinate position of the feature map is the center of a bolt also needs to be predicted. Let $\hat{Y}$ denote the center point heat map predicted by the detection module. One optimization objective of the present embodiment is then to make $\hat{Y}$ as close as possible to the heat map $Y$ obtained by data preprocessing. Because center points are generally sparse, the present embodiment adopts a logistic regression loss function in the form of focal loss as the optimization target, to alleviate the imbalance between positive and negative samples:

(5)

The above formula contains a hyper-parameter of the focal loss, and $N$ is the number of bolts in the current image.
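For orientation, a focal-loss-style heat map loss can be sketched as below. This follows the penalty-reduced focal loss commonly used by center-point detectors and is not confirmed to be the patent's formula (5); the exponents `alpha` and `beta`, their default values, and the function name are all assumptions.

```python
import math


def center_focal_loss(pred, target, n_bolts, alpha=2.0, beta=4.0, eps=1e-12):
    """Focal-style loss between predicted and target heat maps (nested lists)."""
    loss = 0.0
    for pred_row, target_row in zip(pred, target):
        for p, y in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            if y == 1.0:
                # Positive location: down-weight already confident predictions.
                loss += (1.0 - p) ** alpha * math.log(p)
            else:
                # Negative location: penalty shrinks as the target nears 1.
                loss += (1.0 - y) ** beta * p ** alpha * math.log(1.0 - p)
    return -loss / max(n_bolts, 1)


target = [[1.0, 0.0], [0.0, 0.0]]
good = [[0.9, 0.1], [0.1, 0.1]]
bad = [[0.2, 0.8], [0.8, 0.8]]
print(center_focal_loss(good, target, n_bolts=1))
print(center_focal_loss(bad, target, n_bolts=1))
```

Dividing by the number of bolts keeps the loss comparable between images with few and many bolts.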

From the above analysis, the detection module needs to predict 5 values for each position of the feature map: the probability that the position is a center point, the offset of the center point in each of its two dimensions, and the width and height of the object. This embodiment predicts all 5 values simultaneously with one convolutional layer: as shown in FIG. 7, five groups of convolution kernels produce five output channels, one for each of the five classes of values to be predicted.
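One way to turn the five output channels into bounding boxes is sketched below. The channel order (probability, x offset, y offset, width, height), the fixed score threshold, and the function name are assumptions for illustration; picking local maxima of the heat map is omitted for brevity.

```python
def decode_detections(head, a, threshold=0.5):
    """Decode a 5-channel [channel][row][col] output into (x1, y1, x2, y2, score) boxes."""
    prob, off_x, off_y, width, height = head  # assumed channel order
    boxes = []
    for j, row in enumerate(prob):
        for i, p in enumerate(row):
            if p < threshold:
                continue
            # Undo the downsampling: cell index plus offset, scaled by a.
            cx = (i + off_x[j][i]) * a
            cy = (j + off_y[j][i]) * a
            w = width[j][i] * a
            h = height[j][i] * a
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, p))
    return boxes


head = [
    [[0.1, 0.9], [0.2, 0.1]],    # center probability
    [[0.0, 0.5], [0.0, 0.0]],    # x offset
    [[0.0, 0.25], [0.0, 0.0]],   # y offset
    [[0.0, 4.0], [0.0, 0.0]],    # width at feature-map scale
    [[0.0, 8.0], [0.0, 0.0]],    # height at feature-map scale
]
print(decode_detections(head, a=4))
```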

The overall optimization objective function of the model provided by the invention is as follows:

（6）

where the weighting coefficients are hyper-parameters that control how much each loss function contributes to the overall model. In the experiments, the three hyper-parameters were set to 0.1, 1 and 0.5, respectively. Each hyper-parameter may be chosen from a range of real values.
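The weighted combination can be sketched generically as follows. Which of the three reported values (0.1, 1, 0.5) multiplies which loss term is not recoverable from the text above, so the pairing in this example is purely hypothetical, as are the loss names and the sample loss values.

```python
def total_loss(losses, weights):
    """Weighted sum of named loss terms; a term without a weight counts with weight 1."""
    return sum(weights.get(name, 1.0) * value for name, value in losses.items())


# Hypothetical pairing of the reported hyper-parameter values with loss terms.
weights = {"size": 0.1, "offset": 1.0, "segmentation": 0.5}
# Made-up per-term loss values, used only to exercise the function.
losses = {"center": 2.0, "size": 10.0, "offset": 0.5, "segmentation": 4.0}
print(total_loss(losses, weights))
```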

Finally, the detection result of the bridge bolt detection method based on the self-attention and center point regression model provided by the embodiment of the present invention is shown in FIG. 8, where the rightmost result is that of the model of the present invention.

Compared with the detection results of YOLO, Faster R-CNN and RetinaNet, the detection result of the model of the present invention shows higher identification accuracy.

The main features of the bridge bolt detection method based on the self-attention and center point regression model provided by the embodiment of the invention are as follows:

(1) The invention provides an object-size-sensitive method for constructing the bolt center point heat map. A bolt target is predicted directly as a point, after which the size of the bolt area is accurately predicted. This requires no non-maximum suppression post-processing, so the method is simpler, faster and more accurate than detection algorithms based on region extraction.

(2) The self-attention semantic segmentation algorithm adopted by the invention uses the target area highlighted in the semantic segmentation result to focus attention on the target bolt area and suppress the background area, thereby improving detection performance.

(3) The invention provides an algorithm based on a self-attention mechanism and center point regression model (SACPR), in which the semantic segmentation features and the detection module features are concatenated to locate the bolt jointly. The two branch tasks promote and reinforce each other: the semantic segmentation features provide prior guidance on object position for the detection module, and the detection module can correct the semantic segmentation result.

Compared with the prior art, the method has the following advantages:

(1) Automation: the high-performance bridge bolt detection technology based on the self-attention mechanism and center point regression model comprises both a semantic segmentation module and a detection module, realizes automatic bolt detection, and frees up manual labor;

(2) Lightweight: data collection only requires taking pictures, and personnel do not need to carry multiple tools to the site.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A bridge bolt detection method based on a self-attention and center point regression model, characterized by comprising the following steps:

acquiring a bridge bolt image to be detected;

2. The bridge bolt detection method based on the self-attention and center point regression model according to claim 1, characterized in that: the construction process of the bolt detection scene data set comprises the following steps:

3. The bridge bolt detection method based on the self-attention and center point regression model according to claim 2, characterized in that: performing object-size-sensitive bolt center point heat map construction comprises:

letting $I$ denote a three-channel color image to be detected, of width $W$ and height $H$;

the center point heat map has only the pixel values 0 and 1;

where $a$ is the scaling factor due to the convolutional network downsampling;

the true heat map of the original image $I$ is

，

4. The bridge bolt detection method based on the self-attention and center point regression model according to claim 3, characterized in that: the detection process based on the self-attention and central point regression model comprises the following steps:

5. The bridge bolt detection method based on the self-attention and center point regression model according to claim 4, characterized in that: inputting the extracted middle-layer features into the self-attention semantic segmentation network and outputting a semantic segmentation feature map of the bridge bolt image comprises the following steps:

taking the extracted middle-layer features as the input, with the output being the annotated mask image $M$ corresponding to the original image $I$;

using a four-layer dilated convolutional network to obtain the semantic segmentation features;

using a convolutional layer with a $1 \times 1$ kernel to predict the annotated mask image from the semantic segmentation features;

defining an error loss function so that the predicted mask image approximates the annotated mask image $M$;

6. The bridge bolt detection method based on the self-attention and center point regression model according to claim 5, characterized in that: the error loss function is defined as follows:

（2）

In formula (2), the indicator term refers to the pixel at abscissa $x$ and ordinate $y$ of the predicted segmentation result image; it is 1 if that pixel lies in a bolt rectangular area, and 0 otherwise.

7. The bridge bolt detection method based on the self-attention and center point regression model according to claim 6, characterized in that: connecting the semantic segmentation feature map with the detection feature map and positioning the bolt through the detection network based on object center point regression comprises the following steps:

；

Characterizing said integrity feature

（3）

is the predicted bolt size;

the bolt size actually used as network supervision information;

(4)

in formula (4), $a$ is the scale factor due to the convolutional network downsampling, and the learned offset term is the local displacement of each object center point;

predicting integrity features

formula (5) contains a hyper-parameter of the focal loss, and $N$ is the number of bolts in the current image;

detection network for complete features based on object center point regression

8. The bridge bolt detection method based on the self-attention and center point regression model according to claim 7, characterized in that: the total optimization objective function of the self-attention and central point regression model is as follows:

（6）

wherein,