CN113378672A - Multi-target detection method for defects of power transmission line based on improved YOLOv3

Publication number
CN113378672A
Authority
CN
China
Prior art keywords
image
data set
target
target data
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110600438.3A
Other languages
Chinese (zh)
Inventor
韩恒
陈万培
张涛
高绅
杨钦榕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangzhou University
Original Assignee
Yangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangzhou University
Priority to CN202110600438.3A
Publication of CN113378672A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2431 Multiple classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/40 Image enhancement or restoration using histogram techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-target detection method for defects of a power transmission line based on improved YOLOv3, which comprises the following steps: step one, screening the data set images: the original images are screened and target images meeting the requirements are selected; step two, carrying out image augmentation on the images obtained in step one to obtain a target data set; step three, after data augmentation is completed, carrying out image preprocessing on some of the photos in the target data set, using a piecewise linear gray-level transformation method, a histogram equalization method, a homomorphic filtering method and a smooth denoising method; step four, sorting and labeling the target data set preprocessed in step three to obtain a labeled target data set; step five, improving YOLOv3 by combining a feature attention mechanism with fusion; and step six, training the improved algorithm on the target data set to detect pictures.

Description

Multi-target detection method for defects of power transmission line based on improved YOLOv3
Technical Field
The invention relates to the technical field of target detection and identification, in particular to a multi-target detection method for defects of a power transmission line based on improved YOLOv3.
Background
Transmission lines are divided into overhead transmission lines and cable lines. An overhead transmission line consists of towers, conductors, line fittings, insulators, stay wires, grounding devices and the like, and is widely distributed across varied terrain such as fields, urban areas, deserts and lakes. Because the lines operate outdoors for long periods and are exposed to extreme weather such as gales, rainstorms and intense sunlight, components such as conductors, fittings and insulators are prone to defects such as corrosion, damage and broken strands; in addition, non-standard installation of components also poses hidden dangers to the safe operation of the transmission line.
With the development and application of defect identification schemes that combine transmission lines with unmanned aerial vehicles (UAVs), the volume of picture data acquired through UAV inspection is growing exponentially, and the drawbacks of traditional manual inspection are gradually becoming apparent. Using computers to perform intelligent defect identification on inspection pictures further raises the requirements on the professional quality of personnel. At present, the proportion of UAVs in power inspection operations keeps rising, and as UAV inspection becomes more intelligent and automated, the future direction of power inspection should be full coverage of UAV inspection operation scenarios.
Power computer vision (Power CV) is a sub-field of power artificial intelligence. It solves vision problems in each link of the power system by using machine learning, pattern recognition, digital image processing and similar technologies combined with domain knowledge of the power profession, and it involves every link of 'transmission and transformation' across the whole power system. Various camera monitoring devices are installed on the lines, and UAVs are used for inspection work; photographing the inspected lines produces a large volume of videos and images, which can only be analyzed and processed well in combination with relevant knowledge of the power system. As for automatic identification of defects in massive image sets, images of transmission lines have pronounced multi-scale structural characteristics: on the one hand, images taken by a UAV at close range have complex backgrounds, and the influence of lighting can cause a high misjudgment rate; on the other hand, when the UAV shoots from different angles, many occlusions can occur, and separating local contour structures is a difficult task.
Helicopter inspection initially used a super-red method, applying least-squares fitting and geometric features to images taken manually from a helicopter in the air in order to identify rusted parts. However, this method has limited recognition accuracy and slow detection speed. Later, a helicopter was used to carry a thermal infrared imager that captured real-time infrared video sequences, and defective regions in the images were determined by methods such as the Hough transform, the Otsu adaptive threshold algorithm and SIFT feature matching. With the continuous advance of science and technology, the semi-manual mode of helicopter inspection can no longer meet the development needs of the smart grid.
In recent years, relying on the new generation of artificial intelligence technology represented by deep learning, inspection-image defect identification algorithms have been continuously innovated and are gradually being applied in intelligent UAV inspection projects for overhead transmission lines. As CNN-based object detection algorithms such as R-CNN, Faster R-CNN and YOLO have matured and hardware computing power has further improved, these algorithms have also shown unique advantages in the field of power computer vision. An improved Faster R-CNN algorithm has been proposed in which a self-built equipment sample library is used for model training and target detection is performed on power inspection images; it improves the detection accuracy and speed of the model, but its identification accuracy for small targets is not high and real-time performance cannot be guaranteed.
In comparison, one-stage target detection algorithms based on convolutional neural networks (CNNs) in deep learning, represented by the YOLO algorithm, are markedly faster than two-stage, RoI (region of interest)-based target detection algorithms such as Faster R-CNN while maintaining high identification accuracy, and can meet the real-time requirements of the system; they are therefore better suited to application in industrial settings.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a multi-target detection method for defects of a power transmission line based on improved YOLOv3. The original FPN feature fusion mode is improved by adopting an attention mechanism-fusion (Attention-Fusion) mode, and neural network training techniques such as a cosine learning rate and synchronous normalization are used in training; without changing the neural network architecture and without introducing extra inference cost, the performance of YOLOv3 is markedly improved. A self-made data set is then trained and learned with the improved algorithm so as to realize multi-target identification of defects of the power transmission line.
The purpose of the invention is realized as follows: a multi-target detection method for defects of power transmission lines based on improved YOLOv3 comprises the following steps:
step one: screening the data set images: the obtained original images are purposefully screened, where each image contains at least one of six types of objects (ground wires, vibration dampers, bird nests, signboards, ground wire clamps and spacers), and target images meeting the requirements are preliminarily selected;
step two: carrying out image augmentation on the images obtained in step one: the screened images are processed by data augmentation, which includes translation, rotation, flipping, scaling-and-cropping, and combined rotation-translation transformation, with the translation distance and rotation angle selected randomly during processing, to obtain a target data set;
step three: after data augmentation is completed, image preprocessing is carried out on some of the photos in the target data set, using a piecewise linear gray-level transformation method, a histogram equalization method, a homomorphic filtering method and a smooth denoising method;
step four: sorting and labeling the target data set preprocessed in step three: the picture names are modified in batches and the target data set is labeled in batches;
step five: improving YOLOv3 by combining a feature attention mechanism with fusion to obtain an improved algorithm;
step six: training the improved algorithm with the labeled target data set, and completing detection on the pictures to be detected.
Preferably, the algorithm improvement in the step five specifically includes:
Attention-Fusion: for any given transformation, the input feature maps X1 and X2 are each subjected to a 1 × 1 convolution, yielding T1 and T2, where
Figure BDA0003092757490000045
denotes the space in which the feature maps lie, H denotes the height of the feature map, W denotes its width, and C1 and C2 denote the numbers of channels;
the T1 and T2 features are passed to a maximum-average pooling operation that compresses them over the H × W spatial dimensions; each feature thereby becomes, in a sense, a vector with a global receptive field, and the output dimension matches the number of input feature channels, as in the following two formulas:
Figure BDA0003092757490000041
Figure BDA0003092757490000042
wherein
Figure BDA0003092757490000043
A fully connected layer operation is then performed, in which a convolution with kernel size 1 × 1 and stride 1 replaces the fully connected operation in the traditional sense; S1 and S2 are obtained after this operation;
S1 and S2 are added to obtain P, which re-aggregates the original features along the channel dimension, as shown in the formula:
P = S1 + S2
wherein
Figure BDA0003092757490000044
P is passed through a Sigmoid function, and the output weights are regarded as the importance of each fused feature channel; each channel weight is then applied to the X1 and X2 features by matrix operation, so that the original features are recalibrated and fused along the channel dimension, yielding a new feature Y; the process is as follows:
Y=(X1+X2)*Sigmoid(P)。
preferably, in step six, a cosine learning rate and a synchronous normalization technique are used during training;
the cosine learning rate processing specifically comprises:
when a gradient descent algorithm is used to optimize the objective function, a cosine function is used to decrease the learning rate accordingly; the learning rate varies with the number of iterations as shown in the following two formulas:
Figure BDA0003092757490000051
Figure BDA0003092757490000052
where η_min and η_max denote the range of the learning rate, T_cur denotes the number of epochs executed so far, and T_max denotes the total number of epochs; the following modification is made in the training process:
Figure BDA0003092757490000053
in actual training, it suffices to reset the optimizer's TotalIteration and reinitialize T_cur when the corresponding epoch is reached.
Compared with the prior art, the invention has the advantages that:
1. image augmentation increases the number of samples in the data set, which reduces network overfitting and improves the robustness and detection accuracy of the detection algorithm;
2. image preprocessing is carried out on some of the sample pictures: the piecewise linear gray-level transformation highlights the gray-level interval of the region of interest; histogram equalization corrects overexposure or underexposure; homomorphic filtering removes uneven illumination; and smooth denoising removes image noise caused by external factors. Combined, the four methods make the images clearer and their features more distinct, which increases their value for use; the features of small targets against the background are markedly enhanced and become clearer, which improves the detection accuracy for small targets;
3. the cosine learning rate and synchronous normalization training techniques are used: decreasing the learning rate with a cosine function brings training closer to the global minimum of the loss, and synchronous normalization solves the problem of the BN layer failing during multi-GPU training; after these improvements, the feature extraction capability of the network is markedly improved, the detection results are enhanced, and the network training time is reduced;
4. the Attention-Fusion mode replaces the Concat fusion mode to improve the original feature fusion: relationships among feature channels are established using the idea of the attention mechanism, which further improves the nonlinear capability of the network, highlights key information, suppresses irrelevant information and reduces information redundancy, thereby strengthening the feature expression of the fused feature map and solving the problem caused by sample overlap.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a development environment configuration of the present invention.
Fig. 2 is a network part parameter set of the present invention.
FIG. 3 is a flow chart of the multi-target detection method for power transmission line defects.
FIG. 4 is an exemplary graph of a data set.
Fig. 5 is a diagram illustrating an example of the manner in which image data is augmented.
Fig. 6 is a diagram of the result of piecewise linear transformation in image pre-processing.
Fig. 7 is a diagram showing the result of histogram equalization in image preprocessing.
Fig. 8 is a diagram showing the result of the smoothing processing in the image preprocessing.
Fig. 9 is a diagram of an example of a data set category.
FIG. 10 is a LabelImg operating interface diagram.
FIG. 11 is a diagram of data set annotation results.
Fig. 12 is a structural diagram of the improved YOLOv3.
FIG. 13 is a diagram of an attention mechanism fusion architecture for algorithm improvement.
Fig. 14 is an identification case of different algorithms.
FIG. 15 shows the detection results of multiple types of defect targets in the power transmission line.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The deep learning training platform used by the invention is configured as shown in fig. 1, and the network part parameter setting is shown in fig. 2.
FIG. 3 is a flow chart of the multi-target detection method for defects of power transmission lines, which comprises the following steps:
Step one: Data set image screening
The power transmission line data set under study contains six classes of samples: spacers, vibration dampers, bird nests, signboards, ground wires and ground wire clamps. These six types of components are important parts of the power transmission line and an extremely important link in power inspection. All images are screened, and pictures containing these six types of components are selected as the data set, as shown in fig. 4.
Step two: Image augmentation
Using the MATLAB software platform, the sample pictures are processed in five ways: translation, rotation, flipping, scaling-and-cropping, and combined rotation-translation transformation. The translation distance and rotation angle are selected randomly during processing, which increases sample diversity and yields a large sample data set; the image data augmentation scheme is shown in fig. 5.
In actual operation, the data augmentation methods used take the image center as the reference point by default. From a mathematical point of view, rotation about this point can be divided into the following steps:
1. moving the rotation point to the origin;
2. rotating around the origin;
3. and then the rotation point is moved back to the original position.
Assume the original coordinates of the image are [x0, y0, 1]^T and the coordinates after translation are [x, y, 1]^T; the coordinate relationship before and after translation is then as follows:
Figure BDA0003092757490000081
Image translation means shifting all pixels in the x and y directions; the matrix corresponding to the translation is:
Figure BDA0003092757490000082
where d_x and d_y denote the distances moved in the horizontal and vertical directions, respectively.
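The translation described above can be sketched in code. This is a minimal illustration that assumes the standard 3 × 3 homogeneous translation matrix (the patent's own matrix is only available as a figure); the function names `translation_matrix` and `apply` are hypothetical.

```python
def translation_matrix(dx, dy):
    """Assumed standard homogeneous matrix that shifts a point by (dx, dy)."""
    return [[1.0, 0.0, dx],
            [0.0, 1.0, dy],
            [0.0, 0.0, 1.0]]


def apply(matrix, point):
    """Multiply a 3x3 matrix by the homogeneous column vector [x0, y0, 1]^T."""
    vec = (point[0], point[1], 1.0)
    x, y, _ = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    return (x, y)


# Shift the pixel at (10, 10) by d_x = 5 and d_y = -2.
print(apply(translation_matrix(5, -2), (10, 10)))  # (15.0, 8.0)
```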
Image rotation rotates the image by an arbitrary angle about a specified center point (the image center by default); its matrix is expressed as:
Figure BDA0003092757490000083
where θ is the angle of rotation (in degrees, not radians).
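The three-step rotation about a center point (move the rotation point to the origin, rotate, move it back) can be sketched as a product of homogeneous matrices. The sketch assumes the common counter-clockwise rotation matrix; the sign convention in the patent's figure may differ, and all function names here are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(dx, dy):
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


def rotation(theta_deg):
    """Rotation about the origin; theta is given in degrees and converted here."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_about(center, theta_deg):
    """Step 1: move center to origin; step 2: rotate; step 3: move back."""
    cx, cy = center
    return matmul(translation(cx, cy),
                  matmul(rotation(theta_deg), translation(-cx, -cy)))


def apply(matrix, point):
    vec = (point[0], point[1], 1.0)
    x, y, _ = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    return (x, y)
```

With this convention, the center itself stays fixed, and a point one pixel to the right of the center at (2, 3) moves to approximately (2, 4) after a 90-degree rotation.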
The image flipping includes horizontal flipping and vertical flipping, the mathematical matrix for the horizontal flipping is represented as:
Figure BDA0003092757490000084
the vertically flipped mathematical matrix is represented as:
Figure BDA0003092757490000091
In deep learning tasks, a common way to crop an image is to first scale the original image by a certain factor (1.1 in the present system) and then crop the scaled image; the scaling matrix is expressed as:
Figure BDA0003092757490000092
In deep learning tasks, data augmentation generally combines several augmentation modes, and matrix arithmetic shows that different combination orders give different results. To explain this more intuitively, let the translation transformation matrix be H_shift and the rotation transformation matrix be H_rotate. The invention mainly uses combined translation-rotation augmentation, for which there are two different combined transformations.
First, if translation is performed before rotation, the resulting transformation matrix can be expressed as:
Figure BDA0003092757490000093
Second, if rotation is performed before translation, the resulting transformation matrix can be expressed as:
Figure BDA0003092757490000094
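The effect of the combination order can be checked numerically. The sketch below assumes column-vector convention, so that "translate, then rotate" corresponds to the product H_rotate · H_shift and "rotate, then translate" to H_shift · H_rotate; the helper names are hypothetical and the matrices are the standard homogeneous forms rather than the patent's figures.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def h_shift(dx, dy):
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


def h_rotate(theta_deg):
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(matrix, point):
    vec = (point[0], point[1], 1.0)
    x, y, _ = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    return (x, y)


shift_then_rotate = matmul(h_rotate(90), h_shift(1, 0))   # translate first
rotate_then_shift = matmul(h_shift(1, 0), h_rotate(90))   # rotate first

# The same point (1, 0) lands in two different places:
# roughly (0, 2) when translated first, roughly (1, 1) when rotated first.
a = apply(shift_then_rotate, (1, 0))
b = apply(rotate_then_shift, (1, 0))
```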
Step three: Image pre-processing
1) Using the piecewise linear gray-level transformation method
The processing results are shown in fig. 6: the transformation highlights the gray-level interval of the region of interest and relatively suppresses the gray-level intervals that are not of interest.
The mathematical expression of the piecewise linear transformation is:
Figure BDA0003092757490000101
where the gray-level interval [a, b] is linearly stretched, while the gray-level intervals [0, a] and [b, f_max] are compressed.
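A three-segment mapping of this kind can be sketched as follows. The output breakpoints `c` and `d` (the gray levels that a and b are mapped to) and the output maximum `g_max` are assumptions introduced for illustration, since the patent's formula is only available as a figure.

```python
def piecewise_linear(f, a, b, c, d, f_max=255, g_max=255):
    """Map gray level f so that [a, b] is stretched to [c, d].

    Assumes 0 < a < b < f_max; c, d and g_max are illustrative parameters.
    """
    if f < a:                                   # compress [0, a] into [0, c]
        return c / a * f
    if f < b:                                   # stretch [a, b] into [c, d]
        return (d - c) / (b - a) * (f - a) + c
    return (g_max - d) / (f_max - b) * (f - b) + d  # compress [b, f_max]


# With a=80, b=160 mapped to c=40, d=215, the middle interval gains contrast:
print(piecewise_linear(120, 80, 160, 40, 215))  # 127.5
```

Choosing d - c larger than b - a is what stretches the interval of interest; the two outer segments then necessarily have slopes below 1.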
2) Using the histogram equalization method
As shown in fig. 7, histogram equalization makes the gray levels of the image uniformly distributed, or widens the spacing between gray levels, thereby increasing contrast and making the picture clearer.
Let the gray level of the original image at (x, y) be f, with value range [0, L-1], where f = 0 is black and f = L-1 is white, and let the gray level after equalization be j; the transformation can then be described as:
j(x,y)=T[f(x,y)],0≤f≤L-1
where the transformation T must satisfy two conditions: T(r) is strictly increasing over the gray-level interval [0, L-1]; and when 0 ≤ f ≤ L-1, 0 ≤ T(r) ≤ L-1, where L ≤ 256.
The Cumulative Distribution Function (CDF) satisfies exactly the above two conditions, and its mathematical expression is:
Figure BDA0003092757490000102
where ω is a dummy variable of integration;
the probability density function p_s(s) of the transformed random variable s is then found:
Figure BDA0003092757490000103
Figure BDA0003092757490000104
which further gives
Figure BDA0003092757490000105
The image equalization transformation T(r) depends on p_r(r), but p_s(s) always follows a uniform distribution, regardless of the form of p_r(r). Since the image pixel distribution is discrete, the discrete form of the cumulative distribution function is:
Figure BDA0003092757490000111
where 0 ≤ k ≤ L-1, MN is the total number of image pixels, and n_k is the number of pixels with gray level r_k. The equalized gray value s_k of each pixel can therefore be computed directly from the histogram of the original image.
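The discrete mapping can be sketched directly from a histogram. This assumes the usual textbook form s_k = (L-1) · Σ_{j≤k} n_j / MN, rounded to the nearest gray level; the function name `equalize` and the small 8-level example are illustrative only.

```python
def equalize(image, levels=256):
    """Histogram-equalize a gray image given as a list of rows of ints."""
    pixels = [p for row in image for p in row]
    total = len(pixels)                      # MN, the total number of pixels

    hist = [0] * levels                      # n_k for each gray level r_k
    for p in pixels:
        hist[p] += 1

    mapping, cumulative = [], 0
    for n_k in hist:
        cumulative += n_k                    # running sum of n_j for j <= k
        mapping.append(round((levels - 1) * cumulative / total))

    return [[mapping[p] for p in row] for row in image]


# A low-contrast image using only levels 2..4 out of 0..7:
print(equalize([[2, 2, 3], [3, 3, 4]], levels=8))  # [[2, 2, 6], [6, 6, 7]]
```

The three occupied gray levels, originally adjacent, are spread over a wider range after equalization.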
3) Using the homomorphic filtering method
During shooting, uneven illumination of components gives one class of images a large gray-level dynamic range: black and white form a strong contrast and details cannot be seen clearly, a problem that ordinary piecewise linear gray-level transformation cannot solve. Homomorphic filtering can eliminate the adverse effect of uneven illumination and enhance image details.
4) Processing images using smooth denoising
The complexity of the field environment and the noise introduced during image acquisition can seriously degrade image quality, so an appropriate method is needed to eliminate their influence. Investigation showed that most of the noise is random and affects each image independently, so low-pass filtering is used to smooth and denoise the images; the effect is shown in fig. 8.
Let the pixel to be processed be f(x, y) and the processed image be g(x, y); the smoothing process can then be described as follows:
Figure BDA0003092757490000112
where T ≥ 0 and Q is the number of pixels in the neighborhood S.
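One plausible reading of this formula is threshold-based neighborhood averaging: a pixel is replaced by the mean of its neighborhood only when it differs from that mean by more than T. The sketch below implements that reading under explicit assumptions: the role of T as a threshold, the 3 × 3 neighborhood, and the exclusion of the center pixel from S are all guesses, since the patent's formula is only available as a figure.

```python
def smooth(image, threshold=0):
    """Threshold neighborhood averaging on a gray image (list of rows)."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            # Neighborhood S: up to 8 surrounding pixels (assumed 3x3 window).
            neigh = [image[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))
                     if (i, j) != (x, y)]
            mean = sum(neigh) / len(neigh)   # (1/Q) * sum over S
            if abs(image[y][x] - mean) > threshold:
                out[y][x] = mean             # treat the pixel as noise
    return out


# A single bright outlier is removed; with threshold 50 its neighbors are kept.
noisy = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
print(smooth(noisy, threshold=50)[1][1])  # 10.0
```

A larger threshold preserves more detail but removes less noise, so the value would need tuning on real inspection images.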
Step four: data set sorting and labeling
1) A Python script is written to rename the pictures in batches, using six-digit names (000000-999999).
2) The data set is divided into the two broad categories of target detection and fault/foreign-object identification, and comprises six classes: ground wires, vibration dampers, bird nests, signboards, ground wire clamps and spacers, as shown in fig. 9. The training set, test set and validation set are divided randomly according to a fixed proportion.
3) The data set is labeled in batches using the LabelImg annotation tool, as shown in FIGS. 10 and 11, generating XML files;
4) The data set and the XML files are organized and packaged into a data set folder.
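The renaming and splitting in items 1) and 2) can be sketched as follows. The patent does not give its script or its split proportion, so the function names and the 8:1:1 ratio below are hypothetical choices for illustration.

```python
import os
import random


def six_digit_names(filenames):
    """Map each original file name to a zero-padded six-digit name."""
    mapping = {}
    for index, name in enumerate(sorted(filenames)):
        _, ext = os.path.splitext(name)      # keep the original extension
        mapping[name] = f"{index:06d}{ext}"
    return mapping


def split_dataset(names, ratios=(0.8, 0.1, 0.1), seed=0):
    """Randomly split names into training, test and validation lists.

    The ratios are an assumed example; the remainder goes to validation.
    """
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_test = int(len(shuffled) * ratios[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])


print(six_digit_names(["nest.jpg", "damper.jpg"]))
# {'damper.jpg': '000000.jpg', 'nest.jpg': '000001.jpg'}
```

To apply the mapping on disk, each pair could be passed to `os.rename`; renaming through temporary names first avoids collisions when a folder already contains six-digit file names.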
Step five: algorithm improvement
The improved YOLOv3 structure is shown in FIG. 12.
Attention mechanism-fusion (Attention-Fusion) method:
Attention-Fusion alleviates the inconsistency problem by creating a mechanism that strengthens the connection between different feature maps; its structure is shown in fig. 13.
Unlike feature fusion methods that add element by element or concatenate channel by channel, the key idea of the invention is to use an attention mechanism to establish relationships between feature channels. It comprises two main steps: feature attention extraction and feature fusion.
By modeling the interdependence among the channels of the convolutional features of different feature maps, the method aims to improve the expressive capacity of the network: it learns and exploits global information across the different feature maps, selectively emphasizing key information and suppressing useless information.
In Attention-Fusion, for any given transformation, the input feature maps X1 and X2 are each subjected to a 1 × 1 convolution, yielding T1 and T2, where
Figure BDA0003092757490000121
The T1 and T2 features are passed to a maximum-average pooling operation that compresses them over the H × W spatial dimensions; each feature thereby becomes, in a sense, a vector with a global receptive field, and the output dimension matches the number of input feature channels, as in the following two formulas:
Figure BDA0003092757490000131
Figure BDA0003092757490000132
wherein
Figure BDA0003092757490000133
A fully connected layer operation follows, in which a convolution with kernel size 1 × 1 and stride 1 is again used in place of the fully connected operation in the traditional sense, to reduce information redundancy and computational load. S1 and S2 are obtained after this operation.
S1 and S2 are added to obtain P, which re-aggregates the original features along the channel dimension, as shown in the formula:
P = S1 + S2
wherein
Figure BDA0003092757490000134
P is passed through a Sigmoid function, and the output weights are regarded as the importance of each fused feature channel; each channel weight is then applied to the X1 and X2 features by matrix operation, so that the original features are recalibrated and fused along the channel dimension. The process is as follows:
Y=(X1+X2)*Sigmoid(P);
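The data flow of Attention-Fusion can be sketched in plain Python on feature maps stored as nested lists of shape [C][H][W]. Several points are assumptions rather than the patent's confirmed design: the pooling step is implemented as global average pooling only, X1 and X2 are taken to have the same number of channels so that they can be added, and all function and weight names are hypothetical.

```python
import math


def conv1x1(x, weight):
    """1x1 convolution: mix channels per pixel; weight is [C_out][C_in]."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(wc * x[c][i][j] for c, wc in enumerate(row))
              for j in range(w)] for i in range(h)] for row in weight]


def global_avg_pool(x):
    """Compress each channel over H x W into one value (assumed pooling)."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]


def fully_connected(v, weight):
    """Equivalent to a 1x1 convolution with stride 1 on a 1x1 map."""
    return [sum(wc * vc for wc, vc in zip(row, v)) for row in weight]


def attention_fusion(x1, x2, w_t1, w_t2, w_s1, w_s2):
    t1, t2 = conv1x1(x1, w_t1), conv1x1(x2, w_t2)
    s1 = fully_connected(global_avg_pool(t1), w_s1)
    s2 = fully_connected(global_avg_pool(t2), w_s2)
    p = [a + b for a, b in zip(s1, s2)]                 # P = S1 + S2
    gate = [1.0 / (1.0 + math.exp(-v)) for v in p]      # Sigmoid(P)
    h, w = len(x1[0]), len(x1[0][0])
    # Y = (X1 + X2) * Sigmoid(P), with one weight per channel.
    return [[[(x1[c][i][j] + x2[c][i][j]) * gate[c]
              for j in range(w)] for i in range(h)]
            for c in range(len(gate))]
```

In a real network the four weight matrices would be learned; a deep learning framework would express the same steps with convolution and pooling layers.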
Step six: Model training improvements
1) Cosine learning rate
When the objective function is optimized with a gradient descent (Gradient Descent) algorithm, a cosine function is used to decrease the learning rate accordingly. The learning rate varies with the number of iterations as shown in the following two formulas:
Figure BDA0003092757490000135
Figure BDA0003092757490000141
where η_min and η_max denote the range of the learning rate, T_cur denotes the number of epochs executed so far, and T_max denotes the total number of epochs.
For convenience of implementation, the invention makes the following modification:
Figure BDA0003092757490000142
Thus, in actual training, it suffices to reset the optimizer's TotalIteration and reinitialize T_cur when the corresponding epoch is reached.
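A cosine schedule with such a reset can be sketched as follows. The formula used is the widely known cosine annealing form η = η_min + ½(η_max − η_min)(1 + cos(π · T_cur / T_max)), which is assumed to match the patent's figures; the fixed restart period `cycle_len`, the default learning-rate bounds and the function names are hypothetical.

```python
import math


def cosine_lr(t_cur, t_max, eta_min, eta_max):
    """Learning rate after t_cur of t_max epochs in the current cycle."""
    return eta_min + 0.5 * (eta_max - eta_min) * (
        1.0 + math.cos(math.pi * t_cur / t_max))


def schedule(total_epochs, cycle_len, eta_min=1e-5, eta_max=1e-3):
    """Per-epoch learning rates, resetting T_cur at the end of each cycle."""
    rates, t_cur = [], 0
    for _ in range(total_epochs):
        rates.append(cosine_lr(t_cur, cycle_len, eta_min, eta_max))
        t_cur += 1
        if t_cur == cycle_len:
            t_cur = 0                        # reinitialize T_cur (restart)
    return rates


print(cosine_lr(0, 10, 0.0, 1.0))   # 1.0, the cycle starts at eta_max
print(cosine_lr(10, 10, 0.0, 1.0))  # 0.0, the cycle ends at eta_min
```

Each restart returns the learning rate to η_max, so `schedule(4, 2)` yields the same value at epochs 0 and 2.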
2) Synchronous normalization technique: the BN parameters are fused into the Conv layer, and the principle is as follows:
Figure BDA0003092757490000143
Figure BDA0003092757490000144
y_BN = W_merge x + b_merge
where W_merge is the weight after fusion, W is the weight before fusion, Var[x] is the variance of the input feature x, E[x] is the statistical mean of the input feature x over the data set, b_merge is the bias after fusion, b is the bias before fusion, γ, ε and β are learning parameters, and y_BN is the fused output.
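Folding BN parameters into a convolution can be illustrated on a single channel with scalar weights. The sketch assumes the commonly used fusion W_merge = γ·W / sqrt(Var[x] + ε) and b_merge = γ·(b − E[x]) / sqrt(Var[x] + ε) + β, which is believed to match the patent's figures but cannot be confirmed from the text; the function names are hypothetical.

```python
import math


def fuse_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Return (W_merge, b_merge) for one channel with scalar weight w."""
    scale = gamma / math.sqrt(var + eps)
    return w * scale, (b - mean) * scale + beta


def conv_then_bn(x, w, b, gamma, beta, mean, var, eps=1e-5):
    """Reference path: convolution followed by batch normalization."""
    y = w * x + b
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


# The fused layer reproduces the two-step result with a single multiply-add.
w_merge, b_merge = fuse_bn(2.0, 1.0, gamma=0.5, beta=0.1, mean=3.0, var=4.0)
fused = w_merge * 1.5 + b_merge
reference = conv_then_bn(1.5, 2.0, 1.0, gamma=0.5, beta=0.1, mean=3.0, var=4.0)
```

Because the fused form needs no separate normalization pass, it adds no cost at inference time.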
The designed algorithm was experimentally tested as follows.
To verify whether the improved algorithm is genuinely effective and achieves the expected purpose, the general-purpose target detection data set VOC2014 is used first, and the improved algorithm is verified under a consistent experimental environment.
FIG. 14 compares the recognition results of different algorithms. As can be seen from the figure, the improved algorithm provided by the invention is superior to other classical target detection algorithms in recognition accuracy, with a detected mAP of 81.6%.
Detection results for some pictures are shown in fig. 15. The algorithm identifies bird nests, spacers, loose strands of ground wires and faded pole number plates well, but vibration dampers are easily missed: the background is dark and the target color is close to the background, so the features are not distinct.
Moreover, because there is currently no exact standard for distinguishing a normal vibration damper from a slipped one, labeling this defect is questionable, so the slippage defect of vibration dampers is not given much attention.
The above description of the embodiments is only intended to facilitate the understanding of the method of the invention and its core idea. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (3)

1. A multi-target detection method for defects of a power transmission line based on improved YOLOv3 is characterized by comprising the following steps:
step one: screening the data set images, namely purposefully screening the obtained original images, wherein each image contains at least one of six types of targets (ground wire, shockproof hammer, bird nest, signboard, ground wire clamp and spacer), and preliminarily selecting the target images that meet the requirements;
step two: performing image augmentation on the images obtained in step one, namely processing the screened images by data augmentation, wherein the data augmentation comprises translation, rotation, flipping, scaling, cropping, and combined rotation and translation transformations, and the translation distance and rotation angle are selected randomly during processing, to obtain a target data set;
step three: after the data augmentation is finished, performing image preprocessing on part of the pictures of the target data set, wherein the images are processed using a piecewise linear gray-level transformation method, a histogram equalization method, a homomorphic filtering method and a smoothing denoising method;
step four: sorting and labeling the target data set preprocessed in step three, namely renaming the pictures in batches and labeling the target data set in batches;
step five: improving YOLOv3 by combining a feature attention mechanism with feature fusion to obtain an improved algorithm;
step six: training the improved algorithm with the previously labeled target data set, and then completing detection of the pictures to be detected.
2. The multi-target detection method for the defects of the power transmission lines based on the improved YOLOv3 as claimed in claim 1, wherein the algorithm improvement in the step five specifically comprises:
attention mechanism-fusion: for any given transformation, the input feature maps X1 and X2 are each subjected to a 1×1 convolution, yielding T1 and T2, where,
Figure FDA0003092757480000011
Figure FDA0003092757480000012
represent the spatial structure in scale space, where H is the height of the feature map, W is the width of the feature map, and C1 and C2 are the numbers of channels;
the T1 and T2 features are passed to a maximum average pooling operation, which compresses the features along the H×W spatial dimensions; the resulting features are vectors that in a sense have a global receptive field, and the output dimension matches the number of input feature channels, as in the following two formulas:
Figure FDA0003092757480000021
Figure FDA0003092757480000022
wherein
Figure FDA0003092757480000023
then, a fully connected layer operation is performed, in which the conventional fully connected operation is replaced by a convolution with a kernel size of 1×1 and a stride of 1, yielding S1 and S2;
S1 and S2 are added to obtain P, which is then used to re-aggregate the original features along the channel dimension, as shown in the formula:
P = S1 + S2
wherein
Figure FDA0003092757480000024
P is passed through a Sigmoid function, and the output weights are taken as the importance of each fused feature channel; the weights are applied channel by channel to the X1 and X2 features through a matrix operation, which recalibrates and fuses the original features along the channel dimension and yields a new feature Y; the process is as follows:
Y = (X1 + X2) * Sigmoid(P).
3. The multi-target detection method for the defects of the power transmission line based on the improved YOLOv3 as claimed in claim 1, wherein in step six, a cosine learning rate and a synchronous normalization technique are used during the training process;
the cosine learning rate processing specifically comprises:
when a gradient descent algorithm is used to optimize the objective function, a cosine function is used to decay the learning rate accordingly, and the learning rate changes with the number of iterations as shown in the following two formulas:
Figure FDA0003092757480000031
Figure FDA0003092757480000032
where η_min and η_max denote the range of the learning rate, T_cur is the number of epochs executed so far, and T_max is the total number of epochs; the following modification is made in the training process:
Figure FDA0003092757480000033
in actual training, it is only necessary to reset the optimizer and re-initialize T_cur when the corresponding epoch is reached.
CN202110600438.3A 2021-05-31 2021-05-31 Multi-target detection method for defects of power transmission line based on improved YOLOv3 Pending CN113378672A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110600438.3A CN113378672A (en) 2021-05-31 2021-05-31 Multi-target detection method for defects of power transmission line based on improved YOLOv3

Publications (1)

Publication Number Publication Date
CN113378672A true CN113378672A (en) 2021-09-10

Family

ID=77575009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110600438.3A Pending CN113378672A (en) 2021-05-31 2021-05-31 Multi-target detection method for defects of power transmission line based on improved YOLOv3

Country Status (1)

Country Link
CN (1) CN113378672A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160142747A1 (en) * 2014-11-17 2016-05-19 TCL Research America Inc. Method and system for inserting contents into video presentations
CN107330871A (en) * 2017-06-29 2017-11-07 西安工程大学 The image enchancing method of insulator automatic identification is run under bad weather condition
CN108257090A (en) * 2018-01-12 2018-07-06 北京航空航天大学 A kind of high-dynamics image joining method that camera is swept towards airborne row
CN110599445A (en) * 2019-07-24 2019-12-20 安徽南瑞继远电网技术有限公司 Target robust detection and defect identification method and device for power grid nut and pin
CN111681240A (en) * 2020-07-07 2020-09-18 福州大学 Bridge surface crack detection method based on YOLO v3 and attention mechanism
CN112464910A (en) * 2020-12-18 2021-03-09 杭州电子科技大学 Traffic sign identification method based on YOLO v4-tiny
CN112508014A (en) * 2020-12-04 2021-03-16 东南大学 Improved YOLOv3 target detection method based on attention mechanism

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113870265A (en) * 2021-12-03 2021-12-31 绵阳职业技术学院 Industrial part surface defect detection method
CN114913473A (en) * 2022-03-21 2022-08-16 中国科学院光电技术研究所 Lightweight single-body imaging contact network safety patrol instrument
CN114913473B (en) * 2022-03-21 2023-08-15 中国科学院光电技术研究所 Lightweight monomer type imaging contact net safety inspection instrument

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination