CN116843649B - Intelligent defect detection method for power transmission line based on improved YOLOv network - Google Patents
- Publication number: CN116843649B (application CN202310809176.0A)
- Authority: CN (China)
- Prior art keywords: network, transmission line, target detection, improved, power transmission
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/0004 — Industrial image inspection
- G06N3/045 — Combinations of networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/048 — Activation functions
- G06N3/08 — Learning methods
- G06V10/26 — Segmentation of patterns in the image field
- G06V10/40 — Extraction of image or video features
- G06V10/52 — Scale-space analysis, e.g. wavelet analysis
- G06V10/764 — Recognition using classification, e.g. of video objects
- G06V10/765 — Classification using rules for classification or partitioning the feature space
- G06V10/806 — Fusion of extracted features
- G06V10/82 — Recognition using neural networks
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- Y04S10/50 — Systems or methods supporting the power network operation or management
Abstract
The invention discloses an intelligent defect detection method for power transmission lines based on an improved YOLOv5 network, comprising the following steps: improving the YOLOv5 network to construct a target detection network comprising a backbone network, a path fusion network and an output module; constructing a data set from a transmission line database to train the target detection network, and saving the trained network; and inputting the transmission line image to be detected into the trained target detection network and outputting the corresponding target detection result. The disclosed detection method addresses the high cost of collecting industrial data by providing a data enhancement algorithm, and greatly reduces the miss rate and the false detection rate. By adding a ghost convolution module, the cross-stage local network of YOLOv5 is improved, which effectively increases the network's prediction speed; an attention mechanism is added to the backbone network, improving defect detection precision and small-target detection capability in complex power transmission scenes.
Description
Technical Field
The invention relates to the field of intelligent inspection of power transmission lines, and in particular to an intelligent defect detection method for power transmission lines based on an improved YOLOv5 network.
Background
In recent years, the contradiction between the growing amount of new-energy station equipment and the shortage of operation and maintenance personnel has become increasingly prominent; operation and inspection teams face a severe situation in which the total amount of equipment grows while qualified personnel are lacking, which challenges the intelligent inspection of new-energy stations. To improve working efficiency and recognition performance, intelligent defect detection of transmission line components based on deep learning has gradually become a research hotspot and has achieved substantial results. The performance of a deep learning algorithm depends not only on the quantity and quality of the model's training data; the network structure of the model also influences the effectiveness of the intelligent detection method.
Before the rise of deep learning, defect detection of key transmission line components (such as insulators, typical hardware fittings and bolts) relied mainly on traditional image processing methods and manually designed feature engineering. However, feature extraction in traditional methods depends on hand-crafted extractors, requiring expert knowledge and a complex parameter tuning process, and each method, being tailored to a specific application, generalizes poorly and lacks robustness. The YOLO series, as a representative of one-stage algorithms, redefines object detection as a regression problem and achieves gains in both precision and speed. In practice, however, YOLO, like most deep learning object detection algorithms, still falls short on small targets. To address these problems, target detection research needs to improve the detection precision for small targets while maintaining a high detection speed.
Disclosure of Invention
In order to solve the above technical problems, the invention provides an intelligent defect detection method for power transmission lines based on an improved YOLOv5 network, so as to achieve both a higher detection speed and higher small-target detection precision.
In order to achieve the above purpose, the technical scheme of the invention is as follows:
An intelligent defect detection method for power transmission lines based on an improved YOLOv5 network comprises the following steps:
(1) Improving the YOLOv5 network to construct a target detection network, wherein the target detection network comprises a backbone network, a path fusion network and an output module; the backbone network comprises a Focus module, CBL convolution blocks, an improved cross-stage local network 1, a CBAM module and a spatial pyramid pooling module; the path fusion network uses the feature pyramid FPN+PAN structure: the FPN layer transmits and fuses high-level and low-level feature information by up-sampling, and the PAN layer splices low-level features with high-level features so that the high-resolution low-level features are passed to the upper layers; the output module comprises a first output classifier, a second output classifier and a third output classifier;
(2) Constructing a data set from a power transmission line database to train the target detection network, and saving the trained target detection network;
(3) Inputting the transmission line image to be detected into the trained target detection network and outputting the corresponding target detection result, wherein the target detection result comprises the positions of target areas in the image and the category of each target area, the category being either normal or a damage defect.
In the above scheme, the CBL convolution block consists of a convolution layer, a batch normalization layer and an activation layer, and the convolution used is ghost convolution.
In the above scheme, the improvement to the cross-stage local network 1 is to replace its residual modules with a ghost network.
In the above scheme, the CBAM module applies channel attention and spatial attention to the input feature layer in turn.
In the above scheme, the spatial pyramid pooling module processes the 19×19 input feature map with four maximum pooling layers (13×13, 9×9, 5×5 and 1×1) and splices their outputs to obtain the pooled feature map output.
In the above scheme, the first output classifier receives the 19×19 fusion features output by the backbone network, the second output classifier receives the 38×38 fusion features, and the third output classifier receives the 76×76 fusion features, so that different backbone layers aggregate features for different detection layers.
In the above scheme, the path fusion network adopts a top-down and bottom-up bidirectional fusion network.
In the above scheme, during training the target detection network sets the number of epochs to 100, the batch_size to 16 and the image size to 640×640.
Through the technical scheme, the intelligent defect detection method for the power transmission line based on the improved YOLOv network has the following beneficial effects:
Compared with the traditional YOLO method, the invention achieves higher precision at a higher speed. The insulator defect detection method can effectively distinguish the working state of insulators on a transmission line and detects defects such as missing or damaged insulators well. Once the detection network is mature, it can be adjusted to the characteristics of the detection device and extended to defect detection of other transmission line components, serving power grid safety maintenance, drive protection and navigation. By providing a data enhancement algorithm, the problem of high industrial data collection cost is solved, and the miss rate and false detection rate are greatly reduced. Improving the YOLOv5 convolution module effectively increases the network's prediction speed, and adding an attention mechanism to the backbone network improves defect detection precision and small-target detection capability in complex power transmission scenes.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a schematic flow diagram of the power transmission line intelligent defect detection method based on an improved YOLOv5 network disclosed by the invention;
FIG. 2 is the original YOLOv5 network framework;
FIG. 3 is a diagram of the finally modified YOLOv5 network of the present invention;
FIG. 4 shows the original cross-stage local network structure and the improved structure with the modified convolution;
Fig. 5 shows the CBAM channel attention and spatial attention superposition process.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
The invention provides an intelligent defect detection method for power transmission lines based on an improved YOLOv5 network, which, as shown in fig. 1, comprises the following steps:
1. Improved YOLOv5 network construction
As shown in fig. 2, the original YOLOv5 model consists mainly of a backbone network and a head; the backbone network serves as the feature extraction module, while the head comprises a neck network for extracting fusion features and an output module. First, the input end performs random cropping, scaling and splicing on the input image and computes the most suitable adaptive anchor boxes, improving the localization accuracy for small defect targets such as insulator damage. The backbone network then aggregates the image at different granularities and extracts feature maps at three scales. Next, the backbone enlarges the receptive field of the feature map through a spatial pyramid pooling structure with three groups of multi-scale maximum pooling layers; feature fusion is realized by splicing up-sampled high-level feature layers with low-level ones to obtain new feature maps, improving the propagation of low-level features, and the path fusion network then passes features bottom-up so that the feature layers achieve further fusion; combining the two strengthens the network's feature fusion capability. Finally, the output end is the prediction part of the network: it comprises three output classifiers (a first, a second and a third head classifier), which, assisted by the loss function, predict image features, generate bounding boxes and predict categories.
According to the invention, the improved YOLOv5 network structure is built on the original YOLOv5 network to meet the requirements of transmission line inspection under complex backgrounds. As shown in fig. 3, the improved target detection network comprises a backbone network, a path fusion network and an output module; the backbone network comprises a Focus module, CBL convolution blocks, an improved cross-stage local network 1, a CBAM module and a spatial pyramid pooling module; the path fusion network uses the feature pyramid FPN+PAN structure: the FPN layer transmits and fuses high-level and low-level feature information by up-sampling, and the PAN layer splices low-level features with high-level features so that the high-resolution low-level features are passed to the upper layers; the output module includes a first, a second and a third output classifier.
(1) Focus module: YOLOv5 takes a 3×640×640 input by default. A slicing operation takes every other pixel in each spatial dimension, similar to adjacent down-sampling, producing four complementary 3×320×320 slices that together retain all of the information. The slices are concatenated along the depth dimension into a 12×320×320 tensor, so the W and H information is concentrated into the channel space and the input channels are expanded fourfold: relative to the original RGB three-channel image, the spliced image has 12 channels. A convolution with 32 kernels then produces a 32×320×320 output, which is passed through batch normalization and LeakyReLU before being fed to the next convolution layer. The final result is a twice down-sampled feature map obtained without information loss.
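The slicing operation described above can be sketched in a few lines of NumPy; `focus_slice` is an illustrative name, not taken from the patent, and shows how a 3×640×640 input becomes 12×320×320 with no information lost:

```python
import numpy as np

def focus_slice(img):
    """Focus slicing: take every other pixel in both spatial dimensions,
    producing four complementary sub-images that are concatenated along
    the channel axis -- no pixel is discarded."""
    c, h, w = img.shape
    assert h % 2 == 0 and w % 2 == 0
    slices = [
        img[:, 0::2, 0::2],  # even rows, even columns
        img[:, 1::2, 0::2],  # odd rows, even columns
        img[:, 0::2, 1::2],  # even rows, odd columns
        img[:, 1::2, 1::2],  # odd rows, odd columns
    ]
    return np.concatenate(slices, axis=0)  # shape (4c, h/2, w/2)

x = np.random.rand(3, 640, 640).astype(np.float32)
y = focus_slice(x)
print(y.shape)  # (12, 320, 320)
```

Because the four slices partition the pixels exactly, the total mass of the tensor is unchanged, which is a quick way to confirm the "no information loss" claim.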
(2) Cross-stage local network module: the cross-stage local network structure splits the original input into two branches for separate convolution operations, namely a small amount of ordinary convolution and simple linear operations. The invention makes the following improvement: the residual modules in the cross-stage local network 1 of the YOLOv5 backbone are replaced by a ghost network, which uses fewer parameters to generate the same number of feature maps as ordinary convolution layers. The structure that reduces the parameter count and computation in the ghost network is the ghost module; fig. 4 is a schematic diagram of the cross-stage local network improvement.
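A back-of-the-envelope comparison illustrates why the ghost module is cheaper. The accounting below follows the standard ghost-module scheme (a primary convolution produces a fraction of the channels, cheap depthwise linear operations generate the rest); the ratio s = 2 and cheap-kernel size d = 3 are assumed example values, not figures from the patent:

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def ghost_params(c_in, c_out, k, s=2, d=3):
    """Ghost module: a primary k x k convolution produces c_out/s
    intrinsic feature maps; cheap d x d depthwise linear operations
    generate the remaining (s-1) ghost maps per intrinsic map."""
    primary = c_in * (c_out // s) * k * k
    cheap = (c_out // s) * (s - 1) * d * d
    return primary + cheap

std = conv_params(128, 256, 3)     # 294912
ghost = ghost_params(128, 256, 3)  # 147456 + 1152 = 148608
print(std, ghost, round(std / ghost, 2))  # roughly a 2x parameter saving
```

With s = 2 the parameter count is nearly halved, which matches the stated goal of raising prediction speed.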
(3) Spatial pyramid pooling module: the input to the spatial pyramid pooling structure is 512×20×20; a 1×1 convolution layer outputs 256×20×20, three parallel maximum pooling layers then down-sample it, the results are concatenated with the initial features to output 1024×20×20, and finally a convolution with 512 kernels restores 512×20×20. The specific steps are: feature extraction, pooling, padding and merging, which extract the most important features after enlarging the receptive field without affecting detection speed.
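The parallel-pooling-and-concatenate step can be sketched in NumPy, assuming stride-1 "same"-padded max pooling with the kernel sizes 5, 9 and 13 commonly used in YOLOv5's SPP (function names are illustrative):

```python
import numpy as np

def maxpool_same(x, k):
    """k x k max pooling, stride 1, 'same' padding: spatial size kept."""
    p = k // 2
    c, h, w = x.shape
    xp = np.pad(x, ((0, 0), (p, p), (p, p)), constant_values=-np.inf)
    out = np.full_like(x, -np.inf)
    for i in range(k):
        for j in range(k):
            out = np.maximum(out, xp[:, i:i + h, j:j + w])
    return out

def spp(x, kernels=(5, 9, 13)):
    """SPP: concatenate the input with its multi-scale max-pooled
    versions along the channel axis; spatial size is unchanged."""
    return np.concatenate([x] + [maxpool_same(x, k) for k in kernels], axis=0)

x = np.random.rand(256, 20, 20).astype(np.float32)
y = spp(x)
print(y.shape)  # (1024, 20, 20), matching the 256 -> 1024 channel growth above
```

Since each window contains its own center pixel, every pooled channel is an upper envelope of the input, which is why only the receptive field (not the resolution) changes.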
(4) Path fusion network: the feature extractor of the network adopts a new FPN structure that enhances the bottom-up path, improving the propagation of low-level features. Each stage of this bottom-up path takes the feature map of the previous stage as input and processes it with a 3×3 convolution layer; the output is added through a lateral connection to the feature map of the same stage in the top-down path, providing information for the next stage. This lateral connection, called a shortcut connection, helps to shorten the path. Meanwhile, adaptive feature pooling recovers the broken information path between each candidate region and all feature layers, aggregating every candidate region on every feature layer to avoid arbitrary assignment. For Mask-RCNN, an FCN can preserve spatial information and reduce the number of parameters in the network, but since its parameters are shared across all spatial locations, it does not actually learn to use pixel location for prediction, whereas an FC layer is location-sensitive and can adapt to different spatial locations.
(5) Output loss function: to improve detection precision and the confidence of the detection boxes, the output end uses the CIoU function to prune output boxes. Unlike the plain ratio of the ordinary intersection-over-union (IoU), CIoU considers the distance between the target and candidate boxes, the overlap rate, the scale and a penalty term, making target box regression more stable and avoiding problems such as divergence during training that occur with IoU. The penalty factor takes into account how well the aspect ratio of the predicted box fits the aspect ratio of the target box.
The CIoU formula is as follows:

$$\mathrm{CIoU} = \mathrm{IoU} - \frac{\rho^2(b, b^{gt})}{c^2} - \alpha v, \qquad v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2, \qquad \alpha = \frac{v}{(1-\mathrm{IoU}) + v}$$

wherein IoU denotes the intersection-over-union of the predicted box and the ground-truth box, ρ(·) is the Euclidean distance between their center points b and b^{gt}, c is the diagonal length of their minimum enclosing rectangle, v measures the consistency of their aspect ratios, and α is a weight coefficient.
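The formula can be checked with a small pure-Python implementation; boxes are given as (x1, y1, x2, y2) corner coordinates, and the 1e-9 term is a numerical guard against division by zero, an implementation detail rather than part of the formula:

```python
import math

def ciou(box_a, box_b):
    """CIoU between two boxes: IoU minus the normalized center distance
    and the aspect-ratio penalty alpha * v."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection-over-union
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)
    # squared Euclidean distance between the box centers (rho^2)
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + \
           ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # squared diagonal of the smallest enclosing rectangle (c^2)
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + \
         (max(ay2, by2) - min(ay1, by1)) ** 2
    # aspect-ratio consistency term v and its weight alpha
    v = (4 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1)) -
                              math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1 - iou + v + 1e-9)
    return iou - rho2 / c2 - alpha * v

print(ciou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0 for identical boxes
```

Identical boxes score 1.0, while disjoint boxes score below 0 because the center-distance penalty still applies when IoU is zero; this is exactly the gradient signal that plain IoU lacks.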
(6) Attention mechanism
The main part of the final improved network still consists of the original backbone, neck and output parts, but the improvement is that, besides the Focus module, the CBL blocks, cross-stage local network 1 and the spatial pyramid pooling structure in the backbone network, a CBAM module is added above the spatial pyramid. As shown in fig. 5, the attention mechanism is one way to implement adaptive network attention: CBAM applies channel attention and then spatial attention to the incoming feature layer, and adding independent attention layers to the improved YOLOv5 framework improves the detection of small objects.
A. Channel attention mechanism: average pooling and max pooling operations first aggregate the spatial information of the feature map, generating two different spatial context descriptors, F^c_avg and F^c_max, the average-pooled and max-pooled features respectively. Both descriptors are then forwarded to a shared network, a multi-layer perceptron with one hidden layer, to generate the channel attention map M_c ∈ R^{C×1×1}. To reduce parameter overhead, the hidden activation size is set to R^{C/r×1×1}, where r is the reduction ratio; after the shared network is applied to each descriptor, the output feature vectors are merged by element-wise summation. Briefly, channel attention is computed as:

$$M_c(F) = \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))) = \sigma(W_1(W_0(F^c_{avg})) + W_1(W_0(F^c_{max})))$$

where σ denotes the sigmoid function, W_0 ∈ R^{C/r×C}, W_1 ∈ R^{C×C/r}, and the MLP weights W_0 and W_1 are shared for both inputs.
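A NumPy sketch of the channel-attention computation above; the ReLU hidden layer in the shared MLP is an assumption (the text only specifies a one-hidden-layer perceptron), and the random weights stand in for the learned W_0 and W_1:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(feat, w0, w1):
    """Channel attention M_c: average- and max-pool the C x H x W map
    over the spatial dimensions, push both C-vectors through the shared
    MLP (w0: C -> C/r, w1: C/r -> C), sum element-wise, then sigmoid."""
    avg = feat.mean(axis=(1, 2))                  # F_avg, shape (C,)
    mx = feat.max(axis=(1, 2))                    # F_max, shape (C,)
    mlp = lambda v: w1 @ np.maximum(w0 @ v, 0.0)  # shared MLP, ReLU hidden layer
    return sigmoid(mlp(avg) + mlp(mx))            # per-channel weights in (0, 1)

C, r = 16, 4
w0 = np.random.randn(C // r, C) * 0.1  # stand-ins for learned weights
w1 = np.random.randn(C, C // r) * 0.1
att = channel_attention(np.random.rand(C, 8, 8), w0, w1)
print(att.shape)  # (16,)
```

The result is one weight per channel; multiplying it back onto the feature map (broadcast over H and W) completes the channel-attention step.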
B. Spatial attention mechanism: the channel information of the feature map is aggregated by two pooling operations to generate two 2D maps, F^s_avg and F^s_max, the average-pooled and max-pooled features across channels respectively. These are concatenated and convolved by a standard convolution layer to generate the 2D spatial attention map. Briefly, spatial attention is computed as:

$$M_s(F) = \sigma(f^{7\times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])) = \sigma(f^{7\times 7}([F^s_{avg}; F^s_{max}]))$$

where σ denotes the sigmoid function and f^{7×7} denotes a convolution operation with a 7×7 filter.
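The spatial branch admits an equally small NumPy sketch; the naive loop convolution favors clarity over speed, and the random 7×7 filter stands in for the learned one:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(feat, kernel):
    """Spatial attention M_s: pool the C x H x W map across channels to
    a 2 x H x W descriptor (mean and max), convolve with one 7x7 filter
    of shape (2, 7, 7) using 'same' padding, then apply a sigmoid."""
    desc = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    p = k // 2
    dp = np.pad(desc, ((0, 0), (p, p), (p, p)))
    h, w = feat.shape[1:]
    out = np.zeros((h, w))
    for i in range(k):  # naive convolution: slide the kernel over the map
        for j in range(k):
            out += (kernel[:, i, j, None, None] * dp[:, i:i + h, j:j + w]).sum(axis=0)
    return sigmoid(out)  # (H, W) attention map with values in (0, 1)

feat = np.random.rand(16, 8, 8)
kernel = np.random.randn(2, 7, 7) * 0.1  # stand-in for the learned filter
m = spatial_attention(feat, kernel)
print(m.shape)  # (8, 8)
```

Here the output is one weight per spatial location, the complement of the channel branch; CBAM applies the two in sequence.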
2. Establishing a database
The Chinese Power Line Insulator Dataset (CPLID) is acquired, and data enhancement is applied to this transmission line insulator database to obtain a sample set.
The existing CPLID dataset is divided into two parts, comprising 600 images of normal insulators captured by unmanned aerial vehicles and 248 images of defective insulators. Because defective insulator images are scarce and insufficient for training, a data enhancement method is adopted: defective insulators are segmented from a small portion of the original images using the algorithm in TVSeg, the segmentation results being mask images; affine transformations enhance the original images and their masks, yielding a large number of original-mask image pairs which are used to train a U-Net; the trained U-Net then segments the remaining images; the insulators are composited into different contexts; and a complete data set is obtained.
3. Data preprocessing
The original 848 images are expanded to more than 4200 using the Albumentations tool, by rotating, cropping, erasing pixels, adjusting brightness, saturation and value, and similar operations. Images are annotated with the open-source tool labelImg; the annotation records the defect type and the upper-left and lower-right coordinates of the defect target, and the annotation files are in xml format. After the dataset images and labels are standardized, they are divided into training, validation and test sets. Sample labels in the dataset are of two types: a target area showing a normal insulator is labelled insulator, and a target area showing a defective insulator is labelled defect.
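Reading the labelImg annotations back takes only the standard library; the xml below is an illustrative Pascal VOC-style fragment in the layout labelImg produces, not an actual file from the dataset:

```python
import xml.etree.ElementTree as ET

# Illustrative labelImg-style annotation with the two label types used
# in the dataset (insulator / defect); coordinates are made up.
XML = """<annotation>
  <object>
    <name>insulator</name>
    <bndbox><xmin>48</xmin><ymin>60</ymin><xmax>210</xmax><ymax>300</ymax></bndbox>
  </object>
  <object>
    <name>defect</name>
    <bndbox><xmin>90</xmin><ymin>120</ymin><xmax>130</xmax><ymax>160</ymax></bndbox>
  </object>
</annotation>"""

def parse_annotation(xml_text):
    """Return (class name, upper-left, lower-right) per labelled object."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        boxes.append((name,
                      (int(bb.findtext("xmin")), int(bb.findtext("ymin"))),
                      (int(bb.findtext("xmax")), int(bb.findtext("ymax")))))
    return boxes

print(parse_annotation(XML))
```

Each tuple carries exactly the information the text describes: the defect type plus the upper-left and lower-right corner coordinates.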
4. Model training and parameter updating
The following training parameters are used: learning rate 0.001, batch_size 16, a training/validation split of 0.9 and 0.1 of the dataset, and the Adam optimizer. The training strategy uses pre-trained parameters from the ImageNet and COCO datasets as initial weights, freezes the weights of the backbone part and trains the rest for 50 epochs, then unfreezes the weights and trains for another 50 epochs.
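The two-phase schedule can be written down as plain data, a sketch of the strategy above rather than runnable training code; all names are illustrative:

```python
# Two-phase transfer-learning schedule sketched as plain data,
# mirroring the strategy described above (names are illustrative).
SCHEDULE = [
    {"phase": "frozen backbone", "epochs": 50, "trainable": "neck + head",
     "init": "ImageNet / COCO pre-trained weights"},
    {"phase": "unfrozen", "epochs": 50, "trainable": "entire network"},
]
HYPERPARAMS = {"lr": 1e-3, "batch_size": 16, "optimizer": "Adam",
               "train_val_split": (0.9, 0.1)}

total = sum(p["epochs"] for p in SCHEDULE)
print(total)  # 100, consistent with the epoch setting stated earlier
```

Freezing the backbone first lets the randomly initialized head adapt without disturbing the pre-trained features; unfreezing then fine-tunes the whole network.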
The trained target detection network is saved; the transmission line image to be detected is input into it, and the corresponding target detection result is output, comprising the positions of target areas in the image and the category of each target area, the category being either normal or a damage defect.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (7)
1. The intelligent defect detection method for the power transmission line based on the improved YOLOv network is characterized by comprising the following steps of:
(1) Improving the YOLOv network to construct a target detection network, wherein the target detection network comprises a backbone network, a path fusion network and an output module, and the backbone network comprises a Focus module, a CBL convolution block, an improved cross-stage local network 1, a CBAM module and a spatial pyramid pooling structure module; the path fusion network uses the feature pyramid FPN+PAN structure, in which the FPN layer transmits and fuses high-level and low-level feature information by up-sampling, and the PAN layer splices the low-level features with the high-level features so that the high-resolution low-level features are passed to the upper layers; the output module comprises a first output classifier, a second output classifier and a third output classifier; the first output classifier receives fusion features of size 19×19 output by the backbone network, the second output classifier receives fusion features of size 38×38, and the third output classifier receives fusion features of size 76×76, so that different backbone layers perform feature aggregation on different detection layers;
(2) Constructing a data set training target detection network by using a power transmission line database, and storing the trained target detection network;
(3) Inputting the transmission line image to be detected into a trained target detection network, and outputting a corresponding target detection result, wherein the target detection result comprises the position of a target area in the transmission line image to be detected and the category corresponding to each target area, and the category of the target area is a normal defect or a damage defect.
2. The intelligent defect detection method for the power transmission line based on the improved YOLOv network according to claim 1, wherein the CBL convolution block consists of a convolution layer, a batch normalization layer and an activation layer, and the convolution mode is ghost convolution.
3. The intelligent defect detection method for the power transmission line based on the improved YOLOv network according to claim 1, wherein the improved content of the improved cross-stage local network 1 is to convert a residual module in the cross-stage local network into a ghost network.
4. The intelligent defect detection method for the power transmission line based on the improved YOLOv network according to claim 1, wherein the CBAM module performs the processing of the channel attention mechanism and the processing of the spatial attention mechanism on the input feature layer respectively.
5. The intelligent defect detection method for the power transmission line based on the improved YOLOv network according to claim 1, wherein the spatial pyramid pooling structure module processes the input feature map of size 19×19 with four max pooling layers of kernel sizes 13×13, 9×9, 5×5 and 1×1 respectively, and splices the outputs of the four max pooling layers to obtain the pooled feature map output.
6. The intelligent defect detection method for the power transmission line based on the improved YOLOv network according to claim 1, wherein the path fusion network adopts a top-down and bottom-up bidirectional fusion network.
7. The intelligent defect detection method for a power transmission line based on the improved YOLOv network according to any one of claims 1 to 6, wherein the target detection network sets epoch to 100, batch_size to 16, and image size to 640×640 during training.
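The spatial pyramid pooling of claim 5 keeps the 19×19 spatial size under all four kernels, which implies stride-1 max pooling with "same" padding before the outputs are concatenated along the channel axis. The following is a minimal single-channel sketch of that behaviour on a toy feature map; it is an illustration of the pooling arithmetic, not the patent's implementation.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling with 'same' padding: output keeps the input size."""
    h, w = len(fmap), len(fmap[0])
    pad = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            # Window of size k x k centred at (i, j), clipped at the borders.
            vals = [fmap[ii][jj]
                    for ii in range(max(0, i - pad), min(h, i - pad + k))
                    for jj in range(max(0, j - pad), min(w, j - pad + k))]
            row.append(max(vals))
        out.append(row)
    return out

def spp(fmap, kernels=(13, 9, 5, 1)):
    """Pool with each kernel size and return the maps to be channel-concatenated."""
    return [max_pool_same(fmap, k) for k in kernels]

# Toy 19x19 single-channel feature map.
feature = [[(i * 19 + j) % 7 for j in range(19)] for i in range(19)]
pooled = spp(feature)
```

Note that the 1×1 branch is the identity, so the concatenation always retains the original features alongside the three coarser pooling scales.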
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310809176.0A CN116843649B (en) | 2023-07-04 | 2023-07-04 | Intelligent defect detection method for power transmission line based on improved YOLOv network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116843649A CN116843649A (en) | 2023-10-03 |
CN116843649B true CN116843649B (en) | 2024-05-17 |
Family
ID=88162954
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310809176.0A Active CN116843649B (en) | 2023-07-04 | 2023-07-04 | Intelligent defect detection method for power transmission line based on improved YOLOv network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116843649B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114283117A (en) * | 2021-11-24 | 2022-04-05 | 广西大学 | Insulator defect detection method based on improved YOLOv3 convolutional neural network |
WO2022120665A1 (en) * | 2020-12-09 | 2022-06-16 | 电子科技大学 | Capacitance defect intelligent detection method based on deep learning |
CN115705637A (en) * | 2021-08-11 | 2023-02-17 | 中国科学院沈阳计算技术研究所有限公司 | Improved YOLOv5 model-based spinning cake defect detection method |
US11631238B1 (en) * | 2022-04-13 | 2023-04-18 | Iangxi Electric Power Research Institute Of State Grid | Method for recognizing distribution network equipment based on raspberry pi multi-scale feature fusion |
Also Published As
Publication number | Publication date |
---|---|
CN116843649A (en) | 2023-10-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112884064B (en) | Target detection and identification method based on neural network | |
WO2023056889A1 (en) | Model training and scene recognition method and apparatus, device, and medium | |
CN112329658B (en) | Detection algorithm improvement method for YOLOV3 network | |
CN111145174B (en) | 3D target detection method for point cloud screening based on image semantic features | |
CN112084866A (en) | Target detection method based on improved YOLO v4 algorithm | |
CN113807464B (en) | Unmanned aerial vehicle aerial image target detection method based on improved YOLO V5 | |
CN112395951B (en) | Complex scene-oriented domain-adaptive traffic target detection and identification method | |
CN114743119B (en) | High-speed rail contact net hanger nut defect detection method based on unmanned aerial vehicle | |
WO2023019875A1 (en) | Vehicle loss detection method and apparatus, and electronic device and storage medium | |
CN115205264A (en) | High-resolution remote sensing ship detection method based on improved YOLOv4 | |
CN110163069A (en) | Method for detecting lane lines for assisting driving | |
CN111368830A (en) | License plate detection and identification method based on multi-video frame information and nuclear phase light filtering algorithm | |
CN110781882A (en) | License plate positioning and identifying method based on YOLO model | |
CN111898432A (en) | Pedestrian detection system and method based on improved YOLOv3 algorithm | |
CN114708566A (en) | Improved YOLOv 4-based automatic driving target detection method | |
CN111368775A (en) | Complex scene dense target detection method based on local context sensing | |
CN113128476A (en) | Low-power consumption real-time helmet detection method based on computer vision target detection | |
CN118038381A (en) | YOLOv 5-based anti-shielding vehicle detection method | |
CN116843649B (en) | Intelligent defect detection method for power transmission line based on improved YOLOv network | |
CN116682090A (en) | Vehicle target detection method based on improved YOLOv3 algorithm | |
CN116311154A (en) | Vehicle detection and identification method based on YOLOv5 model optimization | |
CN116503723A (en) | Dense multi-scale target detection method in low-visibility environment | |
CN116524432A (en) | Application of small target detection algorithm in traffic monitoring | |
Liu et al. | Abnormal behavior analysis strategy of bus drivers based on deep learning | |
CN111950586B (en) | Target detection method for introducing bidirectional attention |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||