CN115171006A - Detection method for automatically identifying personnel entering electric power dangerous area based on deep learning

Info

Publication number: CN115171006A
Application number: CN202210675238.9A
Authority: CN (China)
Prior art keywords: convolution, network, kernel size, image, feature fusion
Prior art / filing date: 2022-06-15
Legal status: Granted; currently Active (the legal status listed by Google Patents is an assumption, not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN115171006B
Inventors: 朱佳龙, 姜明华, 汤光裕, 俞晨雨, 刘军, 余锋
Current and original assignee: Wuhan Textile University (the listed assignees may be inaccurate)
Application filed by Wuhan Textile University; priority to CN202210675238.9A
Publication of CN115171006A: 2022-10-11; grant and publication of CN115171006B: 2023-04-14

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/70: Arrangements using pattern recognition or machine learning
    • G06V10/74: Image or video pattern matching; proximity measures in feature spaces
    • G06V10/75: Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; coarse-fine approaches, e.g. multi-scale approaches; context analysis; selection of dictionaries
    • G06V10/764: Classification, e.g. of video objects
    • G06V10/77: Processing image or video features in feature spaces; data integration or data reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V10/7715: Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; mappings, e.g. subspace methods
    • G06V10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806: Fusion of extracted features
    • G06V10/82: Recognition using neural networks
    • G06V20/00: Scenes; scene-specific elements
    • G06V20/40: Scenes; scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V2201/07: Target detection (indexing scheme)
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a detection method based on deep learning for automatically identifying personnel who enter an electric power danger area, comprising the following steps: capture and collect images of the power site; establish a pixel coordinate system with the upper-left corner of the image as the coordinate origin; delineate the danger area in the power-site image and determine its coordinates; input the real-time video stream of the power site into an operator site position detection network and determine the operator coordinates from the network output; and compare the obtained operator coordinates with the danger-area coordinates in real time to judge whether an operator is inside the danger area, raising an alarm if personnel have entered. The invention automates the identification and detection of personnel entering electric power danger areas, replaces manual supervision, and improves the reliability and precision of personnel hazard detection.

Description

Detection method for automatically identifying person entering electric power dangerous area based on deep learning
Technical Field
The invention belongs to the field of electric power operation safety control, and particularly relates to a detection method for automatically identifying, based on deep learning, personnel entering an electric power danger area.
Background
In recent years, the rapid development and continuous upgrading of artificial intelligence technology have provided important power and support for technologies such as target detection and action recognition. Artificial intelligence has already entered daily life and plays a large role, for example in face-recognition access control and mask-wearing recognition, and it can likewise be applied in the electric power industry.
In the process of power production, field operation carries many latent safety hazards. For example, many danger areas exist near high-voltage substations, where workers risk electric shock. Traditional power-site safety control relies on manual supervision, which has the defects that nothing is recorded and real-time supervision is impossible; during maintenance, field power operators easily enter danger areas by mistake, causing disasters that not only threaten life but can also damage power facilities, and in severe cases lead to operator casualties and direct social, economic and property losses.
Chinese patent CN113781741A, "Method, apparatus, device and medium for warning of power boundary-crossing behavior based on a gateway," determines an operator's track by infrared sensing and thereby identifies and judges entry into a danger area. This method, however, is prone to misjudgment: leaves, for example, may trigger the infrared sensor and cause a false alarm.
Disclosure of Invention
The technical problems addressed by the invention are that manual supervision is an unreliable way to identify whether power operators have entered a danger area, and that the existing method of automatically identifying and judging entry by infrared sensing has poor anti-interference capability and readily produces misjudgments and false alarms.
To solve these problems, the invention provides a detection method based on deep learning for automatically identifying whether a worker has entered an electric power danger area.
The technical scheme of the invention is a detection method for automatically identifying, based on deep learning, whether personnel have entered an electric power danger area. It uses an operator site position detection network to detect the video images of the power site in real time, identify personnel position coordinates and judge whether personnel have entered the danger area. The operator site position detection network comprises a deep residual network, a shallow feature fusion network, a deep feature fusion network and a multi-scale recognition network. The deep residual network comprises a plurality of residual bottleneck structures and extracts shallow and deep image features at multiple levels from the video image; the shallow feature fusion network fuses the shallow image features of different levels extracted by the deep residual network; the deep feature fusion network fuses the deep image features of different levels extracted by the deep residual network; and the multi-scale recognition network comprises multi-scale perception modules and detects large and small targets.
The detection method for automatically identifying personnel entering the electric power danger area based on deep learning comprises the following steps (a pipeline sketch follows the list):
Step 1: capture and collect power-site images.
Step 2: establish a pixel coordinate system with the upper-left corner of the image as the coordinate origin.
Step 3: delineate the danger area in the power-site image and determine its coordinates.
Step 4: input the real-time video stream of the power site into the operator site position detection network and determine the operator coordinates from the network output.
Step 5: compare the operator coordinates obtained in step 4 with the danger-area coordinates in real time to judge whether the operator is inside the danger area; if personnel have entered the danger area, raise an alarm.
Preferably, the real-time detection of the power-site video images by the operator site position detection network comprises the following three stages:
In the first stage, the deep residual network extracts shallow and deep image features at multiple levels from the video image. The deep residual network comprises five sequentially connected residual bottleneck structures; each residual bottleneck structure has a convolution layer with a 3 × 3 kernel and a ReLU activation function, used to adjust the size of the feature map fed to the next residual bottleneck structure.
Preferably, the residual bottleneck structure first applies batch normalization and ReLU activation to the input; then a 1 × 1 convolution raises the dimensionality, followed by batch normalization and ReLU activation; then a 3 × 3 convolution with ReLU activation; and finally a 1 × 1 convolution makes the channel count consistent with that of the original input image, after which the result is feature-fused with the input. A sketch of this block follows.
In the second stage, the shallow feature fusion network and the deep feature fusion network respectively fuse the shallow and deep image features of different levels.
The shallow feature fusion network fuses the image features extracted by the first, second and third residual bottleneck structures of the deep residual network, then processes the fused features in two convolution branches: the first branch applies a 1 × 3 convolution followed by a 3 × 1 convolution, and the second branch applies a 3 × 1 convolution followed by a 1 × 3 convolution; the outputs of the two branches are feature-fused and passed through an activation function.
The deep feature fusion network fuses the deep image features extracted by the fourth and fifth residual bottleneck structures of the deep residual network and splits the result into two branches: one undergoes global pooling, full connection and activation-function processing, while the other carries the features to be weighted; the feature maps output by the two branches are then feature-fused. A sketch of both fusion paths follows.
In the third stage, the multi-scale recognition network detects large and small targets from the feature maps output by the shallow and deep feature fusion networks respectively.
The multi-scale recognition network comprises four sequentially connected 3 × 3 convolution-plus-activation layers; the output of the second layer is sent to one multi-scale perception module for large-target detection, and the output of the fourth layer is sent to another multi-scale perception module for small-target detection.
Preferably, the multi-scale perception module comprises four branches: the first branch applies a 1 × 1 convolution; the second branch applies, in order, a 1 × 1 convolution, a 1 × 3 convolution, a 3 × 1 convolution and a 3 × 3 dilated (hole) convolution with dilation rate 2; the third branch applies the same sequence with dilation rate 3; the fourth branch applies a 1 × 1 convolution followed by a 3 × 3 dilated convolution with dilation rate 4; finally, adaptive feature fusion is performed on the outputs of the four branches (a sketch follows).
The adaptive feature fusion fuses the feature maps of different receptive-field sizes output by the four branches of the multi-scale perception module, assigning each branch its own weight and computing a weighted sum to obtain the final feature map Y:

Y = α·X_1 + β·X_2 + χ·X_3 + ε·X_4

α = e^(λ_α) / (e^(λ_α) + e^(λ_β) + e^(λ_χ) + e^(λ_ε))

α + β + χ + ε = 1

where X_i (i = 1, 2, 3, 4) denotes the feature map output by the i-th branch; α, β, χ and ε denote the weights of the first, second, third and fourth branches; and λ_α, λ_β, λ_χ, λ_ε denote the single-channel feature maps obtained from the input feature maps X_i by 1 × 1 convolutional dimensionality reduction. (The weight equation appears only as an image in the source; the softmax form above is reconstructed from the accompanying statements that each weight lies in [0, 1] and the four weights sum to 1.)
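A sketch of the adaptive fusion under stated assumptions: the single-channel map λ_i comes from a 1 × 1 convolution as in the formula, and collapsing each λ_i to one scalar per sample with a spatial mean before the softmax is this sketch's own simplification.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AdaptiveFusion(nn.Module):
        """Weights the four branch maps X_i and sums them to the final map Y."""
        def __init__(self, c):
            super().__init__()
            self.reduce = nn.ModuleList(nn.Conv2d(c, 1, 1) for _ in range(4))

        def forward(self, xs):                    # xs: four maps of shape [N,C,H,W]
            lam = torch.stack([r(x).mean(dim=(1, 2, 3))
                               for r, x in zip(self.reduce, xs)], dim=1)
            w = F.softmax(lam, dim=1)             # alpha + beta + chi + epsilon = 1
            return sum(w[:, i].view(-1, 1, 1, 1) * xs[i] for i in range(4))

Keeping λ_i as full maps and applying the softmax per pixel would also satisfy the formulas; the scalar form is chosen here for brevity.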
In step 3, the danger area in the power-site image is delineated, and the coordinate data L_D of the image block corresponding to the danger area is determined manually in the pixel coordinate system:

L_D = (u_D, v_D, w_D, h_D)

where u_D is the value of the danger-area range on the u-axis, v_D its value on the v-axis, w_D the width of the danger-area range and h_D its height.
In step 4, the operator coordinates are determined, yielding the operator's coordinate data L_P in the image:

L_P = (u_P, v_P, w_P, h_P)

where u_P is the value of the operator's coordinates on the u-axis, v_P the value on the v-axis, w_P the width of the operator's bounding box and h_P its height.
Step 5 judges whether the operator is inside the danger area by a danger-area judgment method: the manually delineated danger area L_D = (u_D, v_D, w_D, h_D) is compared in real time with the operator position L_P = (u_P, v_P, w_P, h_P) obtained from the personnel site position detection network. The comparison evaluates four inequalities, numbered (1) to (4); they are rendered only as an image in the source, and the overlap test below is a reconstruction consistent with the coincidence-degree comparison described among the beneficial effects:

u_P < u_D + w_D (1)
u_P + w_P > u_D (2)
v_P < v_D + h_D (3)
v_P + h_P > v_D (4)

When inequalities (1) to (4) all hold, the operator is judged to be inside the danger area and a real-time alarm is raised to warn the operator.
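As a worked example with hypothetical numbers: for a danger area L_D = (200, 150, 300, 200) and a detected operator L_P = (350, 250, 60, 120), inequality (1) gives 350 < 500, (2) gives 410 > 200, (3) gives 250 < 350 and (4) gives 370 > 150; all four hold, so the alarm fires. An operator at L_P = (600, 250, 60, 120) violates inequality (1), since 600 < 500 is false, and no alarm is raised.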
Compared with the prior art, the invention has the following beneficial effects:
1) The invention builds an on-site operator position detection network from a deep residual network, feature fusion networks and a multi-scale recognition network, detects the position coordinates of operators in power-site images in real time and, by comparing those coordinates with the danger area in the image, automatically identifies and detects personnel entering the electric power danger area, replacing manual supervision and improving the reliability of personnel hazard detection.
2) The personnel site position detection network obtains accurate operator coordinates in the image, with an operator recognition accuracy above 94.3%.
3) The operator's position information in the image is compared in real time with the coordinate data of the delineated power-site danger area to judge whether an operator has entered; an alarm is raised as soon as entry occurs. The method has good real-time performance, effectively safeguards the personal safety of power-site operators and avoids disruption to the normal operation of power facilities.
4) Detecting entry by computing the degree of coincidence between the operator's image block and the power-site danger area yields highly accurate detection results with strong anti-interference capability, effectively preventing false alarms.
5) The shallow and deep feature fusion networks respectively fuse the shallow and deep image features extracted by the residual bottleneck structures at different levels of the deep residual network before sending them to the multi-scale recognition network for target detection, improving detection precision.
6) The multi-scale perception modules of the personnel site position detection network fuse feature maps of different receptive-field sizes output by several branches to detect large and small targets, further improving the precision of target detection and recognition.
7) The residual bottleneck structure uses batch normalization so that each feature layer follows a consistent distribution, obtains feature maps of different sizes and resolutions for feature fusion, and uses a residual connection to fuse input and output feature values so that initial feature information is preserved, preventing degradation and vanishing gradients in the deep residual network.
Drawings
The invention is further illustrated by the following figures and examples.
Fig. 1 is a schematic flow chart of a detection method for automatically identifying a person entering an electrical power danger area according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a personnel site location detection network according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a residual bottleneck structure of a deep residual network according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a multi-scale sensing module according to an embodiment of the invention.
Fig. 5 is a schematic diagram of comparing the coordinates of the operator with the coordinates of the danger area in real time according to an embodiment of the present invention.
Fig. 6 is a schematic diagram illustrating a detection result of whether the operator is located in the danger area according to the embodiment of the present invention.
Detailed Description
The operator site position detection network of the embodiment includes a deep residual network, a shallow feature fusion network, a deep feature fusion network and a multi-scale recognition network, as shown in fig. 2.
As shown in fig. 1, the detection method for automatically identifying, based on deep learning, that a person has entered an electric power danger area includes the following steps:
Step 1: fix the camera position and capture power-site images.
Step 2: establish a rectangular coordinate system with the upper-left corner of the image as the coordinate origin, u as the horizontal coordinate and v as the vertical coordinate. This coordinate system partitions the pixels of the site image obtained in step 1 and fixes a planar reference frame.
Step 3: determine the coordinate information of the danger area according to the danger-area division method.
The danger-area division method determines four coordinate values of the danger area manually, based on the site image captured in step 1 and the coordinate system of step 2; they can be expressed as L_D = (u_D, v_D, w_D, h_D), where L_D denotes the coordinates of the danger area in the coordinate system, u_D is its value on the u-axis, v_D its value on the v-axis, w_D the width of the danger area and h_D its height. An illustrative way of carrying out this manual delineation follows.
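As an illustration (not part of the patent), OpenCV's interactive ROI selector returns exactly such a (u, v, w, h) tuple measured from the image's upper-left corner; the file name below is hypothetical.

    import cv2

    # Hypothetical helper: a supervisor drags a rectangle over one site image.
    # cv2.selectROI returns (x, y, w, h) in pixel coordinates with the origin
    # at the upper-left corner, matching L_D = (u_D, v_D, w_D, h_D).
    frame = cv2.imread("power_site.jpg")          # assumed file name
    u_d, v_d, w_d, h_d = cv2.selectROI("define danger area", frame)
    cv2.destroyAllWindows()
    print("L_D =", (u_d, v_d, w_d, h_d))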
Step 4: input the video stream of the real-time power-site operation scene into the operator site position detection network for detection, and determine the operator's position information in pixel coordinates, L_P = (u_P, v_P, w_P, h_P).
The detection of the operator by the operator site position detection network proceeds in three stages:
In the first stage, the deep residual network, composed of five serial residual bottleneck structures (the specific network structure is shown in fig. 3), extracts the features; each residual bottleneck structure has a 3 × 3 convolution layer and a ReLU activation function for adjusting the size of the feature map fed to the next residual bottleneck structure. The input of each residual bottleneck structure is first processed by batch normalization and ReLU activation; a 1 × 1 convolution then raises the channel dimension and is followed by batch normalization; next come a 3 × 3 convolution and ReLU activation; and finally a 1 × 1 convolution makes the channel count consistent with that of the original input image, whose features are then fused with the output.
In the second stage there are the shallow feature fusion network and the deep feature fusion network. The shallow feature fusion network feature-fuses the low-level residual bottleneck structures (the first, second and third) of the deep residual network, and the resulting feature map is sent into two parallel convolutions that perform asymmetric convolution operations with 1 × 3 and 3 × 1 kernels respectively; the two parallel outputs are feature-spliced, passed through a dimension-raising operation with a 1 × 1 convolution kernel, and sent into an activation function, finally outputting the shallow-fusion feature map. The deep feature fusion network obtains a feature map by feature-fusing the high-level residual bottleneck structures (the fourth and fifth) of the deep residual network, applies global pooling, then two fully connected layers, and finally activation-function processing; the result is used to weight the initial input features of the network, and the final feature map is output.
In the third stage, the multi-scale recognition network contains four convolution layers; two feature maps of different sizes among them are selected, passed through multi-scale perception modules, and fused into feature maps of different receptive fields to obtain the output feature maps used for classification and localization.
Within this network structure there are a shallow feature fusion network and a deep feature fusion network. The shallow feature fusion network feature-fuses the low-level feature maps, and the fused map passes through two parallel asymmetric convolution layers, which effectively enlarges the receptive field of the network while reducing the parameter and computation cost. The outputs of the two asymmetric convolution layers are then added along the channel direction for feature fusion, and finally mapped into the range [0, 1] by a Sigmoid function. The specific expression is as follows:
S = σ_1(conv_2(conv_1(Y, W_1), W_1) + conv_1(conv_2(Y, W_2), W_2))

where W_1 and W_2 are convolution-kernel parameters; conv_1 and conv_2 denote the asymmetric convolution layers; Y is the feature map output by fusing the low-level features of the deep residual network; σ_1 is the activation-function operation; and S is the feature map obtained by fusing the asymmetric-convolution features.
For the deep feature fusion network, the high-level feature maps are feature-fused and the fused map is fed into the network; it is compressed by global pooling into a D-dimensional feature vector, processed by two fully connected layers fc ∈ δ^(C×H×W), mapped into [0, 1] by a Sigmoid function, and finally multiplied with the input feature map along the channel dimension for weighting. The specific expressions are as follows.
C = F(V, W_2) = σ_2(fc(δ(fc(V, W_2)), W_2))

Q = V ⊗ C

where W_2 denotes the weight parameters to be updated in the network; V is the feature map output by the fusion of the high-level features of the deep residual network; σ_2 denotes the Sigmoid activation operation; fc denotes a fully connected layer; C is the number of channels of the feature map, H its height and W its width; δ denotes the ReLU activation function; and Q is the feature map output after weighting. (The second equation appears only as an image in the source; channel-wise multiplication of V by the weights C is reconstructed from the weighting step described above.)
As shown in fig. 4, the adaptive feature fusion of the multi-scale recognition network takes the feature maps X_i (i = 1, 2, 3, 4) of different receptive-field sizes output by the four branches of the multi-scale perception module, multiplies each by its corresponding weight α, β, χ or ε, and sums them to obtain the final feature map Y:

Y = α·X_1 + β·X_2 + χ·X_3 + ε·X_4

α = e^(λ_α) / (e^(λ_α) + e^(λ_β) + e^(λ_χ) + e^(λ_ε))

The formula above keeps α within [0, 1]; computing β, χ and ε in the same way yields α + β + χ + ε = 1, where λ_α, λ_β, λ_χ and λ_ε are the single-channel feature maps obtained from the input feature maps X_i by 1 × 1 convolutional dimensionality reduction.
The operator's final position coordinates in the image, L_P = (u_P, v_P, w_P, h_P), are thus obtained from the danger-area recognition network.
Step 5: compare the danger area with the operator's coordinate information in real time according to the danger-area judgment method, judge whether the operator has entered the danger area, and raise an alarm if so.
The danger-area judgment method compares the manually delineated danger area L_D = (u_D, v_D, w_D, h_D) in real time with the operator position L_P = (u_P, v_P, w_P, h_P) obtained from the danger-area recognition network, evaluating inequalities (1) to (4) given above. When all four inequalities hold, the operator is judged to be inside the danger area and a real-time alarm is raised to warn the operator, as shown in fig. 5.
Fig. 6 shows a power-site image of an embodiment of the invention in which an operator is identified and localized by the operator site position detection network; the rectangular frame is the detected position coordinate of the power operator. Combined with the danger area marked in the image by background staff, this achieves effective identification and judgment of whether an operator has currently entered the danger area. The implementation results show that the personnel site position detection network obtains accurate operator coordinates in the image, with an operator recognition accuracy above 94.3%.
The invention thus supervises field power operations in real time, shows whether an operator's working position has entered a danger area, and greatly safeguards the safe conduct of power operations.
It will be apparent to those skilled in the art that various changes and modifications can be made in the present invention without departing from the spirit and scope of the invention. To the extent that such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, it is intended that the present application also encompass such modifications and variations.

Claims (8)

1. A detection method for automatically identifying, based on deep learning, whether personnel enter an electric power danger area, characterized in that, after the coordinates of the electric power danger area in a power-site image are determined, an operator site position detection network is used to detect the power-site video images in real time, identify personnel position coordinates, and judge whether personnel have entered the electric power danger area;
the operator site position detection network comprises a deep residual network, a shallow feature fusion network, a deep feature fusion network and a multi-scale recognition network; the deep residual network comprises a plurality of residual bottleneck structures and extracts shallow and deep image features at multiple levels from the video image; the shallow feature fusion network fuses the shallow image features of different levels extracted by the deep residual network; the deep feature fusion network fuses the deep image features of different levels extracted by the deep residual network; the multi-scale recognition network comprises multi-scale perception modules and detects large and small targets;
the detection method comprises the following steps:
step 1: capturing and collecting power-site images;
step 2: establishing a pixel coordinate system with the upper-left corner of the image as the coordinate origin;
step 3: delineating the danger area in the power-site image and determining its coordinates;
step 4: inputting the real-time video stream of the power site into the operator site position detection network and determining the operator coordinates from the network output;
step 5: comparing the operator coordinates obtained in step 4 with the danger-area coordinates in real time to judge whether the operator is inside the danger area, and raising an alarm if personnel have entered the danger area.
2. The detection method for automatically identifying personnel entering the electric power danger area according to claim 1, wherein the real-time detection of the power-site video images by the operator site position detection network comprises the following three stages:
in the first stage, the deep residual network extracts shallow and deep image features at multiple levels from the video image; the deep residual network comprises five sequentially connected residual bottleneck structures, each having a convolution layer with a 3 × 3 kernel and a ReLU activation function used to adjust the size of the feature map fed to the next residual bottleneck structure;
in the second stage, the shallow feature fusion network and the deep feature fusion network respectively fuse the shallow and deep image features of different levels;
the shallow feature fusion network fuses the image features extracted by the first, second and third residual bottleneck structures of the deep residual network and then processes the fused features in two convolution branches: the first branch applies a 1 × 3 convolution followed by a 3 × 1 convolution, and the second branch applies a 3 × 1 convolution followed by a 1 × 3 convolution; the outputs of the two branches are feature-fused and passed through an activation function;
the deep feature fusion network fuses the deep image features extracted by the fourth and fifth residual bottleneck structures of the deep residual network and splits the result into two branches: one undergoes global pooling, full connection and activation-function processing, while the other carries the features to be weighted; the feature maps output by the two branches are then feature-fused;
in the third stage, the multi-scale recognition network detects large and small targets from the feature maps output by the shallow and deep feature fusion networks respectively;
the multi-scale recognition network comprises four sequentially connected 3 × 3 convolution-plus-activation layers; the output of the second layer is sent to one multi-scale perception module for large-target detection, and the output of the fourth layer is sent to another multi-scale perception module for small-target detection.
3. The detection method for personnel entering the electric power danger area according to claim 2, wherein the residual bottleneck structure first applies batch normalization and ReLU activation to the input; then a 1 × 1 convolution raises the dimensionality, followed by batch normalization and ReLU activation; then a 3 × 3 convolution with ReLU activation is applied; and finally a 1 × 1 convolution makes the channel count consistent with that of the original input image, after which the result is feature-fused with the input image.
4. The detection method for automatically identifying personnel entering the electric power danger area according to claim 2, wherein the multi-scale perception module comprises four branches: the first branch applies a 1 × 1 convolution; the second branch applies, in order, a 1 × 1 convolution, a 1 × 3 convolution, a 3 × 1 convolution and a 3 × 3 dilated (hole) convolution with dilation rate 2; the third branch applies the same sequence with dilation rate 3; the fourth branch applies a 1 × 1 convolution followed by a 3 × 3 dilated convolution with dilation rate 4; finally, adaptive feature fusion is performed on the outputs of the four branches.
5. The detection method for personnel entering the electric power danger area according to claim 4, wherein the adaptive feature fusion fuses the feature maps of different receptive-field sizes output by the four branches of the multi-scale perception module, assigning each branch its own weight and computing a weighted sum to obtain the final feature map Y:

Y = α·X_1 + β·X_2 + χ·X_3 + ε·X_4

α = e^(λ_α) / (e^(λ_α) + e^(λ_β) + e^(λ_χ) + e^(λ_ε)) (the weight equation is rendered as an image in the source; see the reconstruction in the description)

α + β + χ + ε = 1

where X_i (i = 1, 2, 3, 4) denotes the feature map output by the i-th branch; α, β, χ and ε denote the weights of the first, second, third and fourth branches; and λ_α, λ_β, λ_χ, λ_ε denote the single-channel feature maps obtained from the input feature maps X_i by 1 × 1 convolutional dimensionality reduction.
6. The method as claimed in claim 5, wherein in step 3 the danger area in the power-site image is delineated, and the coordinate data L_D of the image block corresponding to the danger area is determined manually in the pixel coordinate system:

L_D = (u_D, v_D, w_D, h_D)

where u_D is the value of the danger-area range on the u-axis, v_D its value on the v-axis, w_D the width of the danger-area range and h_D its height.
7. The method as claimed in claim 6, wherein step 4 determines the operator coordinates, obtaining the operator's coordinate data L_P in the image:

L_P = (u_P, v_P, w_P, h_P)

where u_P is the value of the operator's coordinates on the u-axis, v_P the value on the v-axis, w_P the width of the operator's bounding box and h_P its height.
8. The detection method for automatically identifying personnel entering the electric power danger area according to claim 7, wherein step 5 judges whether the operator is inside the danger area by a danger-area judgment method;
the danger-area judgment method compares the manually delineated danger area L_D = (u_D, v_D, w_D, h_D) in real time with the operator position L_P = (u_P, v_P, w_P, h_P) obtained from the personnel site position detection network by evaluating inequalities (1) to (4), which are rendered only as an image in the source (see the reconstruction in the description);
when inequalities (1) to (4) all hold, the operator is judged to be inside the danger area and a real-time alarm is raised to warn the operator.
CN202210675238.9A 2022-06-15 2022-06-15 Detection method for automatically identifying person entering electric power dangerous area based on deep learning Active CN115171006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210675238.9A CN115171006B (en) 2022-06-15 2022-06-15 Detection method for automatically identifying person entering electric power dangerous area based on deep learning


Publications (2)

Publication Number / Publication Date
CN115171006A / 2022-10-11
CN115171006B / 2023-04-14

Family

ID=83486067

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210675238.9A Active CN115171006B (en) 2022-06-15 2022-06-15 Detection method for automatically identifying person entering electric power dangerous area based on deep learning

Country Status (1)

Country Link
CN (1) CN115171006B (en)



Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200160559A1 (en) * 2018-11-16 2020-05-21 Uatc, Llc Multi-Task Multi-Sensor Fusion for Three-Dimensional Object Detection
US20200293891A1 (en) * 2019-04-24 2020-09-17 Jiangnan University Real-time target detection method deployed on platform with limited computing resources
CN110084228A * 2019-06-25 2019-08-02 江苏德劭信息科技有限公司 An automatic identification method for hazardous behavior based on two-stream convolutional neural networks
WO2021120157A1 (en) * 2019-12-20 2021-06-24 Intel Corporation Light weight multi-branch and multi-scale person re-identification
WO2022067668A1 (en) * 2020-09-30 2022-04-07 中国科学院深圳先进技术研究院 Fire detection method and system based on video image target detection, and terminal and storage medium
CN112347916A (en) * 2020-11-05 2021-02-09 安徽继远软件有限公司 Power field operation safety monitoring method and device based on video image analysis
CN113052188A (en) * 2021-03-26 2021-06-29 大连理工大学人工智能大连研究院 Method, system, equipment and storage medium for detecting remote sensing image target
CN113936299A (en) * 2021-10-18 2022-01-14 微特技术有限公司 Method for detecting dangerous area in construction site
CN114067268A (en) * 2021-11-17 2022-02-18 国网福建省电力有限公司营销服务中心 Method and device for detecting safety helmet and identifying identity of electric power operation site
CN114155487A (en) * 2021-11-29 2022-03-08 国网宁夏电力有限公司信息通信公司 Power operator detection method based on multi-group convolution fusion
CN114299405A (en) * 2021-12-28 2022-04-08 重庆大学 Unmanned aerial vehicle image real-time target detection method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LIU W et al.: "SSD: Single Shot MultiBox Detector" *
刘鑫: "Applied research on a deep-learning-based danger-zone personnel detection ***" *
孔英会; 王维维; 张珂; 戚银城: "Power-scene object detection method based on an improved Mask R-CNN model" *
高钰敏; 许开瑜: "Deep-learning-based monitoring of factory danger areas" *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117557201A (en) * 2024-01-12 2024-02-13 国网山东省电力公司菏泽供电公司 Intelligent warehouse safety management system and method based on artificial intelligence
CN117557201B (en) * 2024-01-12 2024-04-12 国网山东省电力公司菏泽供电公司 Intelligent warehouse safety management system and method based on artificial intelligence

Also Published As

Publication number Publication date
CN115171006B (en) 2023-04-14

Similar Documents

Publication Publication Date Title
CN111898514B (en) Multi-target visual supervision method based on target detection and action recognition
CN112199993B (en) Method for identifying transformer substation insulator infrared image detection model in any direction based on artificial intelligence
CN111241959B (en) Method for detecting personnel not wearing safety helmet through construction site video stream
CN112132090A (en) Smoke and fire automatic detection and early warning method based on YOLOV3
CN112861635B (en) Fire disaster and smoke real-time detection method based on deep learning
CN112287816A (en) Dangerous working area accident automatic detection and alarm method based on deep learning
CN111222478A (en) Construction site safety protection detection method and system
CN109711322A (en) A kind of people's vehicle separation method based on RFCN
CN113283344A (en) Mining conveying belt deviation detection method based on semantic segmentation network
CN112528974B (en) Distance measuring method and device, electronic equipment and readable storage medium
CN112183472A (en) Method for detecting whether test field personnel wear work clothes or not based on improved RetinaNet
CN112163572A (en) Method and device for identifying object
CN111414807A Tidal water identification and crisis early warning method based on YOLO technology
CN104463869A (en) Video flame image composite recognition method
CN113962282A (en) Improved YOLOv5L + Deepsort-based real-time detection system and method for ship engine room fire
CN113963301A (en) Space-time feature fused video fire and smoke detection method and system
CN115171006B (en) Detection method for automatically identifying person entering electric power dangerous area based on deep learning
CN115512387A (en) Construction site safety helmet wearing detection method based on improved YOLOV5 model
CN114373162B (en) Dangerous area personnel intrusion detection method and system for transformer substation video monitoring
CN116563776A (en) Method, system, medium and equipment for warning illegal behaviors based on artificial intelligence
CN117523437A (en) Real-time risk identification method for substation near-electricity operation site
CN112488213A (en) Fire picture classification method based on multi-scale feature learning network
CN111898440A (en) Mountain fire detection method based on three-dimensional convolutional neural network
KR101674266B1 (en) System for assessment of safety level at construction site based on computer vision
CN110751014A (en) Flame detection system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant