CN116740547A - Digital twinning-based substation target detection method, system, equipment and medium - Google Patents

Digital twinning-based substation target detection method, system, equipment and medium

Info

Publication number
CN116740547A
Authority
CN
China
Prior art keywords
target detection
substation
model
module
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310787855.2A
Other languages
Chinese (zh)
Inventor
李文琢
于若颜
常乃超
张炜
张金虎
张海燕
赵娜
刘筱萍
王化鹏
李亚蕾
李昂
纪欣
崔旭
姜佳宁
李劲松
沈艳
赵铭洋
南祎
刘洋
彭聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
Nanjing University of Aeronautics and Astronautics
China Electric Power Research Institute Co Ltd CEPRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics, China Electric Power Research Institute Co Ltd CEPRI filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202310787855.2A priority Critical patent/CN116740547A/en
Publication of CN116740547A publication Critical patent/CN116740547A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S 10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S 10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

A digital twin-based method, system, device and medium for substation target detection comprise: collecting substation image data and constructing a training data set, a validation data set and a test data set; training a pre-established target detection network model with the training data set, wherein the target detection network model is obtained by integrating a MobileNetV2 model, a YOLOv5 model and an OpenPose model; validating the trained target detection network model with the validation data set, and saving the target detection network model with the best performance index; and inputting the test data set into the target detection network model with the best performance index and outputting the substation target recognition result. The method can accurately identify the real-time state of the basic environmental parameters of the substation digital twin system (namely environment, equipment and operators), offers good accuracy, robustness and real-time performance, and realizes dynamic synchronization between physical equipment in the actual substation scene and its virtual representation.

Description

Digital twinning-based substation target detection method, system, equipment and medium
Technical Field
The application belongs to the technical field of intelligent substations, and particularly relates to a digital twin-based substation target detection method, system, equipment and medium.
Background
Against the background of the digital information age, advancing the construction of substation digital twin systems brings substation operation, management and services into the virtual domain: through modeling, simulation, deduction and control in virtual space, virtual control is realized, the substation's capabilities for self-perception, self-decision and self-evolution are strengthened, and the digital and intelligent transformation of substations is promoted. This is an inevitable stage and a necessary path in building an energy Internet enterprise.
At present, daily substation operation and maintenance involve a large volume of field work and frequent personnel scheduling. To ensure production safety, the actual environment, the equipment and the operators must all be treated as essential elements when constructing the substation digital twin model. A substation is characterized by a wide field, a complex environment and numerous devices, while operator behavior is highly autonomous and uncertain. Under the influence of illumination conditions, shooting distance, camera angle and similar factors, and with target pixels that are very small and feature information that is sparse, existing algorithms struggle to identify targets accurately and quickly. The construction of an intelligent substation digital twin system therefore requires a high-precision, high-speed and highly robust detection algorithm for multiple types of small targets in large-scale complex scenes.
Disclosure of Invention
The application aims to solve the above problems in the prior art and provides a digital twin-based substation target detection method, system, equipment and medium that can detect static objects and recognize the body posture of operators in actual substation scenes with wide fields and numerous devices, with good accuracy, robustness and real-time performance.
In order to achieve the above purpose, the present application has the following technical scheme:
in a first aspect, a method for detecting a target of a substation based on digital twinning is provided, including:
substation image data are collected and input into a target detection network model, wherein the target detection network model is obtained by integrating a MobileNetV2 model, a YOLOv5 model and an OpenPose model;
and outputting a substation target identification result by the target detection network model.
Preferably, a training data set, a verification data set and a test data set are constructed from the collected substation image data;
training a pre-established target detection network model by utilizing the training data set;
verifying the trained target detection network model by using the verification data set, and storing the target detection network model with the optimal performance index;
inputting the test data set into a target detection network model with optimal performance indexes, and outputting a substation target identification result;
the substation image data is used for shooting the substation environment in multiple directions, multiple angles and multiple scenes through the camera equipment, adjusting the size of the image to 320 multiplied by 320, and finishing data annotation.
Preferably, the target detection network model comprises a feature extraction module, a feature fusion module, a target detection module and a gesture recognition module;
the feature extraction module uses a depth separable convolutional network of the mobilenet v2 model to perform feature extraction by replacing CSPDarknet53 in the YOLOv5 model; the feature fusion module integrates and fuses the feature images extracted by the shallow convolution of the MobileNet v2 network and the feature images extracted by the deep convolution; the target detection module is a YOLOv5 model, inputs a feature map extracted by a MobileNet v2 network, and outputs a substation static object identification result; the gesture recognition module adopts an Openphase model, replaces a VGG-19 network by the feature fusion module to perform feature extraction, and outputs a substation operator gesture recognition result.
Preferably, the feature extraction module has three basic convolution modules: the first is an expansion convolution module, which uses a 1×1 convolution to expand the number of channels of the input data; the second is a depthwise convolution module, which filters the input from the previous module using a 3×3 convolution without a pooling layer; the third is a projection convolution module, which projects high-dimensional data into low-dimensional data using a 1×1 convolution;
the first module and the second module use a linear activation function instead of a ReLU activation function.
Preferably, the feature fusion module uses dilated convolution to reduce the size of the shallow feature map, with the output size given by:

S_out = (S_in + 2α − r(k − 1) − 1) / l + 1

where α is the number of padding pixels, r is the dilation rate, l is the stride, k is the size of the convolution kernel, S_out is the size of the output feature map, and S_in is the size of the input feature map;

and employs the standard deconvolution of the YOLOv5 model to increase the size of the deep feature map while compressing the number of channels of the feature map according to:

C_d = (1 / (H × W)) Σ_{h=1}^{H} Σ_{w=1}^{W} E_dhw

where C_d is the output of the d-th channel of the feature map, E_dhw is the pixel in row h and column w of the d-th channel, and H × W is the size of the picture;

the deep feature map and the shallow feature map are fused by re-distributing weights and spliced into a new feature map f_new.
Preferably, in the target detection module, the final output of the MobileNetV2 network is a 10×10 feature map, from which the YOLOv5 model performs detection and outputs Pre_i1; the 10×10 feature map is then upsampled and fused with the 20×20 feature map output by the preceding convolution layer, and the YOLOv5 model performs detection on the fused 20×20 feature map and outputs Pre_i2; similarly, the 20×20 feature map is upsampled and fused with the 30×30 feature map output by the preceding convolution layer, and the YOLOv5 model performs detection on the fused 30×30 feature map and outputs Pre_i3; finally, the YOLOv5 model combines Pre_i1, Pre_i2 and Pre_i3 and outputs the recognition result for static objects in the substation digital twin scene.
Preferably, the OpenPose model of the gesture recognition module comprises two parallel convolution networks: one network, Branch1, outputs Part Confidence Maps and is used to locate human body key points; the other, Branch2, outputs Part Affinity Fields and is used to connect body key points into limbs; the whole OpenPose network comprises multiple stages, the output of each stage is compared with the label to compute an L2 loss function, and the loss function of the whole network is the sum of the loss functions computed at each stage.
In a second aspect, a digital twinning-based substation target detection system is provided, including:
the substation image data acquisition module is used for acquiring substation image data and inputting them into the target detection network model, wherein the target detection network model is obtained by integrating the MobileNetV2, YOLOv5 and OpenPose models;
and the identification result output module is used for outputting a substation target identification result by the target detection network model.
Preferably, the substation image data acquisition module acquires substation image data and constructs a training data set, a verification data set and a test data set;
training a pre-established target detection network model by utilizing the training data set;
verifying the trained target detection network model by using the verification data set, and storing the target detection network model with the optimal performance index;
inputting the test data set into a target detection network model with optimal performance indexes, and outputting a substation target identification result;
the substation image data is used for shooting the substation environment in multiple directions, multiple angles and multiple scenes through the camera equipment, adjusting the size of the image to 320 multiplied by 320, and finishing data annotation.
Preferably, the target detection network model comprises a feature extraction module, a feature fusion module, a target detection module and a gesture recognition module;
the feature extraction module uses a depth separable convolutional network of the mobilenet v2 model to perform feature extraction by replacing CSPDarknet53 in the YOLOv5 model; the feature fusion module integrates and fuses the feature images extracted by the shallow convolution of the MobileNet v2 network and the feature images extracted by the deep convolution; the target detection module is a YOLOv5 model, inputs a feature map extracted by a MobileNet v2 network, and outputs a substation static object identification result; the gesture recognition module adopts an Openphase model, replaces a VGG-19 network by the feature fusion module to perform feature extraction, and outputs a substation operator gesture recognition result.
In a third aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the digital twinning-based substation target detection method when executing the computer program.
In a fourth aspect, a computer readable storage medium is provided, where a computer program is stored, where the computer program, when executed by a processor, implements the digital twinning-based substation target detection method.
Compared with the prior art, the application has at least the following beneficial effects:
the digital twinning-based substation target detection method can finish detection of static objects and human body gesture recognition of operators in a substation actual scene with wide field and numerous devices, and has good accuracy, robustness and real-time performance. In order to improve the accuracy of simulation and prediction results of a digital twin system of an intelligent substation, the target detection network model is obtained based on integration of a MobileNet v2 model, a YOLOv5 model and an Openpost model, and mainly solves the problem of two real-time target recognition in practical application; secondly, remote human body gesture recognition, which is an important dynamic environment parameter in digital twinning, has high autonomy and uncertainty of the operator's behavior, and the whole body characteristics of the operator are not generally available due to the different angles and distances of the cameras. Compared with the traditional algorithm based on the key points of the human bones, the target detection network model can objectively depict the gesture and the behavior characteristics of an operator.
It will be appreciated that the advantages of the second to fourth aspects may be found in the relevant description of the first aspect and are not repeated here.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments or in the description of the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of a substation target detection method based on digital twinning in an embodiment of the application;
FIG. 2 is a schematic diagram of a target detection network model according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.
Referring to fig. 1, the method for detecting a target of a transformer substation based on digital twin provided by the embodiment of the application is used for accurately identifying real-time states of basic environmental parameters (i.e. environment, equipment and operators) of a digital twin system of the transformer substation, and aims to realize dynamic synchronization between physical equipment and virtual representation thereof, and specifically comprises the following steps:
s1, collecting substation image data and constructing a training data set, a verification data set and a test data set;
s2, training a pre-established target detection network model by using a training data set, wherein the target detection network model is obtained by integrating a MobileNet v2 model, a YOLOv5 model and an Openpost model;
s3, verifying the trained target detection network model by using a verification data set, and storing the target detection network model with the optimal performance index;
and S4, inputting the test data set into a target detection network model with optimal performance indexes, and outputting a substation target identification result.
In one possible implementation, step S1 photographs the substation environment with camera equipment, and the photographs should cover multiple directions, multiple angles and multiple scenes. Considering the influence of factors such as camera resolution and illumination in actual detection, blurred images may be introduced artificially to strengthen the robustness of the algorithm. The captured images are resized to 320×320 and divided in a 7:2:1 ratio into three data sets for training, validation and testing, as sketched below.
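The following Python sketch illustrates this data preparation step only under stated assumptions: the directory layout, the JPEG extension and the helper name are illustrative, not part of the application.

```python
# Illustrative sketch: resize substation images to 320x320 and split them
# 7:2:1 into training, validation and test sets, as described above.
import random
from pathlib import Path

from PIL import Image

def prepare_dataset(src_dir: str, dst_dir: str, seed: int = 0) -> dict:
    paths = sorted(Path(src_dir).glob("*.jpg"))  # assumed JPEG inputs
    random.Random(seed).shuffle(paths)
    n = len(paths)
    splits = {
        "train": paths[: int(0.7 * n)],            # 70% training
        "val": paths[int(0.7 * n): int(0.9 * n)],  # 20% validation
        "test": paths[int(0.9 * n):],              # 10% testing
    }
    for name, split in splits.items():
        out = Path(dst_dir) / name
        out.mkdir(parents=True, exist_ok=True)
        for p in split:
            Image.open(p).resize((320, 320)).save(out / p.name)
    return {k: len(v) for k, v in splits.items()}
```

Annotation files would be copied alongside the images of each split; that bookkeeping is omitted here.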
Referring to fig. 2, in one possible implementation, the target detection network model is integrated from the MobileNetV2, YOLOv5 and OpenPose models and is used to identify the real-time state of the intelligent substation environment, equipment and operators. The established target detection network model comprises a feature extraction module, a feature fusion module, a target detection module and a gesture recognition module.
The feature extraction module performs feature extraction by using the depthwise separable convolution network of the MobileNetV2 model in place of CSPDarknet53 in the YOLOv5 model; the feature fusion module integrates and fuses the feature maps extracted by the shallow convolutions of the MobileNetV2 network with those extracted by the deep convolutions; the target detection module is the YOLOv5 model, which takes as input the feature maps extracted by the MobileNetV2 network and outputs the recognition result for static substation objects; the gesture recognition module adopts an OpenPose model in which the feature fusion module replaces the VGG-19 network for feature extraction, and outputs the posture recognition result for substation operators.
Furthermore, because the feature extraction module uses the depthwise separable convolution network of the MobileNetV2 model in place of CSPDarknet53 in YOLOv5 for feature extraction, it can provide rich semantic information and effectively improve the real-time performance of small target detection. The input of this module is the annotated data from the data sets obtained in step S1, and its output is a feature map containing rich semantic information. Specifically, the feature extraction module includes three convolution modules. The first is an expansion convolution module, which uses a 1×1 convolution to expand the number of channels of the input data. The second is a depthwise convolution module, which filters the input from the previous module using a 3×3 convolution without a pooling layer. The third is a projection convolution module, which projects high-dimensional data into low-dimensional data using a 1×1 convolution. In addition, a linear activation function is used in the first and second modules in place of the original ReLU activation function to mitigate information loss and corruption. A sketch of such a block appears below.
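The following PyTorch sketch shows one way to assemble the three convolution modules just described; the channel sizes, the expansion factor and the use of batch normalization are illustrative assumptions, and the linear (identity) activations on the first two stages follow the text above rather than the stock MobileNetV2 block.

```python
# Illustrative sketch of the three-part feature extraction block:
# 1x1 expansion -> 3x3 depthwise filtering (no pooling) -> 1x1 projection.
import torch
import torch.nn as nn

class ExpandDepthwiseProject(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, expand: int = 6, stride: int = 1):
        super().__init__()
        mid = in_ch * expand
        # Expansion convolution: 1x1 conv widens the channel dimension;
        # linear activation (no ReLU) per the description above.
        self.expand = nn.Sequential(
            nn.Conv2d(in_ch, mid, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid),
        )
        # Depthwise convolution: 3x3 conv filters each channel independently,
        # again with a linear activation and no pooling layer.
        self.depthwise = nn.Sequential(
            nn.Conv2d(mid, mid, kernel_size=3, stride=stride, padding=1,
                      groups=mid, bias=False),
            nn.BatchNorm2d(mid),
        )
        # Projection convolution: 1x1 conv maps back to a low-dimensional space.
        self.project = nn.Sequential(
            nn.Conv2d(mid, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.project(self.depthwise(self.expand(x)))
```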
The feature fusion module integrates and fuses the feature maps extracted by the shallow convolutions of the MobileNetV2 network with those extracted by the deep convolutions, fusing detail features from the shallow layers with semantic features from the deep layers; this saves computing resources and alleviates the gradient vanishing and performance degradation that occur in overly deep convolution. Specifically, the shallow feature map must be reduced in size while its receptive field is enlarged; this step is done with dilated convolution to avoid information loss, as shown in computational expression (1). Further, the standard deconvolution of the YOLOv5 model is employed to increase the size of the deep feature map while compressing the number of channels of the feature map in accordance with computational expression (2). Finally, the deep feature map and the shallow feature map are fused by re-distributing weights and spliced into a new feature map f_new. A sketch of the fusion appears after the expressions.

S_out = (S_in + 2α − r(k − 1) − 1) / l + 1 (1)

where α is the number of padding pixels, r is the dilation rate, l is the stride, k is the size of the convolution kernel, S_out is the size of the output feature map, and S_in is the size of the input feature map.

C_d = (1 / (H × W)) Σ_{h=1}^{H} Σ_{w=1}^{W} E_dhw (2)

where C_d is the output of the d-th channel of the feature map, E_dhw is the pixel in row h and column w of the d-th channel, and H × W is the size of the picture.
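The following PyTorch sketch, under assumed channel counts and layer parameters, shows such a fusion: a stride-2 dilated convolution shrinks the shallow map while enlarging its receptive field, a transposed convolution enlarges the deep map while compressing its channels, and learnable weights re-distribute the two branches before splicing f_new.

```python
# Illustrative sketch of the feature fusion module described above.
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    def __init__(self, shallow_ch: int = 32, deep_ch: int = 96, out_ch: int = 64):
        super().__init__()
        # Dilated conv; by expression (1) with S_in=40, alpha=2, r=2, k=3, l=2:
        # S_out = (40 + 4 - 2*(3-1) - 1)/2 + 1 = 20, so 40x40 -> 20x20.
        self.shrink = nn.Conv2d(shallow_ch, out_ch, kernel_size=3,
                                stride=2, padding=2, dilation=2)
        # Transposed conv enlarges the deep map (10x10 -> 20x20) while
        # compressing its channel count.
        self.grow = nn.ConvTranspose2d(deep_ch, out_ch, kernel_size=2, stride=2)
        # Learnable weights for re-distributing the two branches.
        self.weights = nn.Parameter(torch.ones(2))

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        a, b = torch.softmax(self.weights, dim=0)
        # Splice the re-weighted maps into the new feature map f_new.
        return torch.cat([a * self.shrink(shallow), b * self.grow(deep)], dim=1)

f_new = FeatureFusion()(torch.randn(1, 32, 40, 40), torch.randn(1, 96, 10, 10))
assert f_new.shape == (1, 128, 20, 20)
```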
The target detection module is YOLOv5; it takes as input the feature maps extracted by the MobileNetV2 network and outputs the target detection result. Specifically, the final output of the MobileNetV2 network is a 10×10 feature map, from which YOLOv5 performs detection and outputs Pre_i1. The 10×10 feature map is upsampled and fused with the 20×20 feature map output by the preceding convolution layer, and YOLOv5 performs detection on the fused 20×20 feature map and outputs Pre_i2. Similarly, the 20×20 feature map is upsampled and fused with the 30×30 feature map output by the preceding convolution layer, and YOLOv5 performs detection on the fused 30×30 feature map and outputs Pre_i3. Finally, YOLOv5 combines Pre_i1, Pre_i2 and Pre_i3 and outputs the detection result for static small objects in the complex substation digital twin scene. In summary, the network module obtained by integrating MobileNetV2 and YOLOv5, denoted YOLOv5-Mv2, is used to detect and recognize static small targets in the substation digital twin system. The shape of this multi-scale path is sketched below.
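In the following sketch the 1×1 detection heads stand in for the actual YOLOv5 heads, and all channel counts are illustrative assumptions; only the upsample-fuse-detect structure follows the description above.

```python
# Illustrative sketch of the three-scale detection path: detect on the 10x10
# map, upsample and fuse with the 20x20 map, then with the 30x30 map, and
# run a detection head on each fused map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ThreeScaleDetection(nn.Module):
    def __init__(self, ch: int = 64, out_ch: int = 255):
        super().__init__()
        self.head1 = nn.Conv2d(ch, out_ch, 1)      # Pre_i1 from the 10x10 map
        self.head2 = nn.Conv2d(2 * ch, out_ch, 1)  # Pre_i2 from the fused 20x20 map
        self.head3 = nn.Conv2d(3 * ch, out_ch, 1)  # Pre_i3 from the fused 30x30 map

    def forward(self, f10, f20, f30):
        pre1 = self.head1(f10)
        fused20 = torch.cat([F.interpolate(f10, size=f20.shape[-2:]), f20], dim=1)
        pre2 = self.head2(fused20)
        fused30 = torch.cat([F.interpolate(fused20, size=f30.shape[-2:]), f30], dim=1)
        pre3 = self.head3(fused30)
        return pre1, pre2, pre3  # combined downstream into the final result

m = ThreeScaleDetection()
outs = m(torch.randn(1, 64, 10, 10), torch.randn(1, 64, 20, 20),
         torch.randn(1, 64, 30, 30))
assert [o.shape[-1] for o in outs] == [10, 20, 30]
```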
The gesture recognition module is an OpenPose network. To strengthen the nonlinear fitting capability of the network and further improve the accuracy of remote human posture recognition, the feature fusion module is adopted in place of VGG-19 for feature extraction. The OpenPose network can be regarded as a parallel convolution network model: one convolution network, Branch1, outputs Part Confidence Maps and is used to locate human body key points; the other, Branch2, outputs Part Affinity Fields and is used to connect body key points into limbs. The whole network comprises multiple stages; the output of each stage is combined with the label to compute an L2 loss function, so that the network converges toward the label at every stage, which accelerates training and improves recognition precision, and the loss function of the whole network is the sum of the loss functions computed at each stage. This staged loss is sketched below.
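A minimal sketch of that staged L2 loss, assuming per-stage predictions and their targets are given as tensors of matching shape (names and shapes are illustrative):

```python
# Illustrative sketch: sum an L2 loss over all stages and both branches.
import torch
import torch.nn.functional as F

def staged_l2_loss(stage_outputs, pcm_target, paf_target):
    """stage_outputs: list of (pcm_pred, paf_pred) pairs, one per stage.
    pcm_target: label Part Confidence Maps; paf_target: label Part Affinity Fields.
    """
    total = torch.zeros(())
    for pcm_pred, paf_pred in stage_outputs:
        total = total + F.mse_loss(pcm_pred, pcm_target, reduction="sum")  # Branch1
        total = total + F.mse_loss(paf_pred, paf_target, reduction="sum")  # Branch2
    return total  # whole-network loss: sum over stages
```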
To improve the accuracy of the simulation and prediction results of the intelligent substation digital twin system, the application integrates and improves MobileNetV2, YOLOv5 and OpenPose to provide a small target detection model for the intelligent substation digital twin. First, the depthwise separable convolution network of MobileNetV2 is integrated into YOLOv5 to replace the original CSPDarknet53 for feature extraction, providing YOLOv5 with rich semantic information, effectively improving computational efficiency and meeting the real-time requirement of the digital twin. The resulting YOLOv5-Mv2 module is used for static small object detection in the intelligent substation digital twin. Meanwhile, the feature fusion module is designed to further fuse shallow detail information with deep semantic information; using the fused feature map as the input to the OpenPose network saves computing resources and alleviates the gradient vanishing and performance degradation of overly deep convolution. The improved OpenPose network can suppress unnecessary background noise and focus on learning accurate human skeleton features, improving the detection accuracy of remote human posture recognition in the digital twin.
The method can accurately identify the real-time state of basic environmental parameters (namely environment, equipment and operators) of the digital twin system of the transformer substation, has good accuracy, robustness and real-time performance, realizes dynamic synchronization between physical equipment and virtual representation thereof in the actual scene of the transformer substation, and is beneficial to modeling, monitoring and optimizing the operation process of the intelligent transformer substation.
Another embodiment of the present application further provides a digital twinning-based substation target detection system, including:
the substation image data acquisition module is used for acquiring substation image data and inputting them into the target detection network model, wherein the target detection network model is obtained by integrating the MobileNetV2, YOLOv5 and OpenPose models;
and the identification result output module is used for outputting a substation target identification result by the target detection network model.
In one possible implementation, the substation image data acquisition module acquires substation image data and constructs a training data set, a validation data set, and a test data set; training a pre-established target detection network model by using a training data set; verifying the trained target detection network model by using the verification data set, and storing the target detection network model with the optimal performance index; inputting the test data set into a target detection network model with optimal performance indexes, and outputting a substation target identification result;
the substation image data is used for shooting the substation environment in multiple directions, multiple angles and multiple scenes through the camera equipment, the size of the image is adjusted to 320 multiplied by 320, and the data annotation is completed.
In one possible implementation, the target detection network model includes a feature extraction module, a feature fusion module, a target detection module and a gesture recognition module, wherein the feature extraction module uses the depthwise separable convolution network of the MobileNetV2 model in place of CSPDarknet53 in the YOLOv5 model for feature extraction; the feature fusion module integrates and fuses the feature maps extracted by the shallow convolutions of the MobileNetV2 network with those extracted by the deep convolutions; the target detection module is the YOLOv5 model, which takes as input the feature maps extracted by the MobileNetV2 network and outputs the recognition result for static substation objects; the gesture recognition module adopts an OpenPose model in which the feature fusion module replaces the VGG-19 network for feature extraction, and outputs the posture recognition result for substation operators.
Further, the feature extraction module has three basic convolution modules: the first is an expansion convolution module, which uses a 1×1 convolution to expand the number of channels of the input data; the second is a depthwise convolution module, which filters the input from the previous module using a 3×3 convolution without a pooling layer; the third is a projection convolution module, which projects high-dimensional data into low-dimensional data using a 1×1 convolution; the first and second modules use a linear activation function instead of a ReLU activation function.
Further, the feature fusion module uses dilated convolution to reduce the size of the shallow feature map, with the output size given by:

S_out = (S_in + 2α − r(k − 1) − 1) / l + 1

where α is the number of padding pixels, r is the dilation rate, l is the stride, k is the size of the convolution kernel, S_out is the size of the output feature map, and S_in is the size of the input feature map;

and employs the standard deconvolution of the YOLOv5 model to increase the size of the deep feature map while compressing the number of channels of the feature map according to:

C_d = (1 / (H × W)) Σ_{h=1}^{H} Σ_{w=1}^{W} E_dhw

where C_d is the output of the d-th channel of the feature map, E_dhw is the pixel in row h and column w of the d-th channel, and H × W is the size of the picture;

the deep feature map and the shallow feature map are fused by re-distributing weights and spliced into a new feature map f_new.
Furthermore, in the target detection module, the final output of the MobileNetV2 network is a 10×10 feature map, from which the YOLOv5 model performs detection and outputs Pre_i1; the 10×10 feature map is then upsampled and fused with the 20×20 feature map output by the preceding convolution layer, and the YOLOv5 model performs detection on the fused 20×20 feature map and outputs Pre_i2; similarly, the 20×20 feature map is upsampled and fused with the 30×30 feature map output by the preceding convolution layer, and the YOLOv5 model performs detection on the fused 30×30 feature map and outputs Pre_i3; finally, the YOLOv5 model combines Pre_i1, Pre_i2 and Pre_i3 and outputs the recognition result for static objects in the substation digital twin scene.
Further, the OpenPose model of the gesture recognition module comprises two parallel convolution networks: one network, Branch1, outputs Part Confidence Maps and is used to locate human body key points; the other, Branch2, outputs Part Affinity Fields and is used to connect body key points into limbs. The whole OpenPose network comprises multiple stages, the output of each stage is compared with the label to compute an L2 loss function, and the loss function of the whole network is the sum of the loss functions computed at each stage.
Another embodiment of the present application further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the substation target detection method based on digital twinning when executing the computer program.
Another embodiment of the present application also proposes a computer readable storage medium storing a computer program which, when executed by a processor, implements the digital twinning-based substation target detection method.
The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory, a random access memory, an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained on the computer readable medium may be added to or removed from as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer readable media do not include electrical carrier signals and telecommunications signals. For convenience of description, only the parts relevant to the embodiments of the present application are shown; for specific technical details that are not disclosed, refer to the method parts of the embodiments of the present application. The computer readable storage medium is non-transitory, can reside in storage devices formed by various electronic equipment, and can implement the execution procedures described in the methods of the embodiments of the present application.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flowchart and/or block of the flowchart illustrations and/or block diagrams, and combinations of flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present application and not for limiting the same, and although the present application has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the application without departing from the spirit and scope of the application, which is intended to be covered by the claims.

Claims (12)

1. A digital twinning-based substation target detection method, characterized by comprising:
substation image data are collected and input into a target detection network model, wherein the target detection network model is obtained by integrating a MobileNetV2 model, a YOLOv5 model and an OpenPose model;
and outputting a substation target identification result by the target detection network model.
2. The digital twinning-based substation target detection method according to claim 1, wherein a training dataset, a verification dataset and a test dataset are constructed from the acquired substation image data;
training a pre-established target detection network model by utilizing the training data set;
verifying the trained target detection network model by using the verification data set, and storing the target detection network model with the optimal performance index;
inputting the test data set into a target detection network model with optimal performance indexes, and outputting a substation target identification result;
the substation image data is used for shooting the substation environment in multiple directions, multiple angles and multiple scenes through the camera equipment, adjusting the size of the image to 320 multiplied by 320, and finishing data annotation.
3. The digital twinning-based substation target detection method according to claim 1, wherein the target detection network model comprises a feature extraction module, a feature fusion module, a target detection module and a gesture recognition module;
the feature extraction module uses a depth separable convolutional network of the mobilenet v2 model to perform feature extraction by replacing CSPDarknet53 in the YOLOv5 model; the feature fusion module integrates and fuses the feature images extracted by the shallow convolution of the MobileNet v2 network and the feature images extracted by the deep convolution; the target detection module is a YOLOv5 model, inputs a feature map extracted by a MobileNet v2 network, and outputs a substation static object identification result; the gesture recognition module adopts an Openphase model, replaces a VGG-19 network by the feature fusion module to perform feature extraction, and outputs a substation operator gesture recognition result.
4. The digital twinning-based substation target detection method according to claim 3, wherein the feature extraction module has three basic convolution modules: the first is an expansion convolution module, which uses a 1×1 convolution to expand the number of channels of the input data; the second is a depthwise convolution module, which filters the input from the previous module using a 3×3 convolution without a pooling layer; the third is a projection convolution module, which projects high-dimensional data into low-dimensional data using a 1×1 convolution;
the first module and the second module use a linear activation function instead of a ReLU activation function.
5. The digital twinning-based substation target detection method according to claim 3, wherein the feature fusion module uses dilated convolution to reduce the size of the shallow feature map, with the output size given by:

S_out = (S_in + 2α − r(k − 1) − 1) / l + 1

where α is the number of padding pixels, r is the dilation rate, l is the stride, k is the size of the convolution kernel, S_out is the size of the output feature map, and S_in is the size of the input feature map;

and employs the standard deconvolution of the YOLOv5 model to increase the size of the deep feature map while compressing the number of channels of the feature map according to:

C_d = (1 / (H × W)) Σ_{h=1}^{H} Σ_{w=1}^{W} E_dhw

where C_d is the output of the d-th channel of the feature map, E_dhw is the pixel in row h and column w of the d-th channel, and H × W is the size of the picture;

the deep feature map and the shallow feature map are fused by re-distributing weights and spliced into a new feature map f_new.
6. The digital twinning-based substation target detection method according to claim 3, wherein in the target detection module, the final output of the MobileNetV2 network is a 10×10 feature map, from which the YOLOv5 model performs detection and outputs Pre_i1; the 10×10 feature map is then upsampled and fused with the 20×20 feature map output by the preceding convolution layer, and the YOLOv5 model performs detection on the fused 20×20 feature map and outputs Pre_i2; similarly, the 20×20 feature map is upsampled and fused with the 30×30 feature map output by the preceding convolution layer, and the YOLOv5 model performs detection on the fused 30×30 feature map and outputs Pre_i3; finally, the YOLOv5 model combines Pre_i1, Pre_i2 and Pre_i3 and outputs the recognition result for static objects in the substation digital twin scene.
7. The digital twinning-based substation target detection method according to claim 3, wherein the OpenPose model of the gesture recognition module comprises two parallel convolution networks: one network, Branch1, outputs Part Confidence Maps and is used to locate human body key points; the other, Branch2, outputs Part Affinity Fields and is used to connect body key points into limbs; the whole OpenPose network comprises multiple stages, the output of each stage is compared with the label to compute an L2 loss function, and the loss function of the whole network is the sum of the loss functions computed at each stage.
8. A digital twinning-based substation target detection system, comprising:
the substation image data acquisition module is used for acquiring substation image data and inputting them into the target detection network model, wherein the target detection network model is obtained by integrating the MobileNetV2, YOLOv5 and OpenPose models;
and the identification result output module is used for outputting a substation target identification result by the target detection network model.
9. The digital twinning-based substation target detection system of claim 8, wherein: the substation image data acquisition module acquires substation image data and constructs a training data set, a verification data set and a test data set;
training a pre-established target detection network model by utilizing the training data set;
verifying the trained target detection network model by using the verification data set, and storing the target detection network model with the optimal performance index;
inputting the test data set into a target detection network model with optimal performance indexes, and outputting a substation target identification result;
the substation image data is used for shooting the substation environment in multiple directions, multiple angles and multiple scenes through the camera equipment, adjusting the size of the image to 320 multiplied by 320, and finishing data annotation.
10. The digital twinning-based substation target detection system of claim 8, wherein: the target detection network model comprises a feature extraction module, a feature fusion module, a target detection module and a gesture recognition module;
the feature extraction module uses a depth separable convolutional network of the mobilenet v2 model to perform feature extraction by replacing CSPDarknet53 in the YOLOv5 model; the feature fusion module integrates and fuses the feature images extracted by the shallow convolution of the MobileNet v2 network and the feature images extracted by the deep convolution; the target detection module is a YOLOv5 model, inputs a feature map extracted by a MobileNet v2 network, and outputs a substation static object identification result; the gesture recognition module adopts an Openphase model, replaces a VGG-19 network by the feature fusion module to perform feature extraction, and outputs a substation operator gesture recognition result.
11. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized by: the processor, when executing the computer program, implements the digital twinning-based substation target detection method according to any one of claims 1 to 7.
12. A computer-readable storage medium storing a computer program, characterized in that: the computer program, when executed by a processor, implements a digital twinning-based substation target detection method according to any one of claims 1 to 7.
CN202310787855.2A 2023-06-29 2023-06-29 Digital twinning-based substation target detection method, system, equipment and medium Pending CN116740547A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310787855.2A CN116740547A (en) 2023-06-29 2023-06-29 Digital twinning-based substation target detection method, system, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310787855.2A CN116740547A (en) 2023-06-29 2023-06-29 Digital twinning-based substation target detection method, system, equipment and medium

Publications (1)

Publication Number Publication Date
CN116740547A true CN116740547A (en) 2023-09-12

Family

ID=87909569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310787855.2A Pending CN116740547A (en) 2023-06-29 2023-06-29 Digital twinning-based substation target detection method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN116740547A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117853491A (en) * 2024-03-08 2024-04-09 山东省计算中心(国家超级计算济南中心) Few-sample industrial product abnormality detection method and system based on multi-scene task
CN117853491B (en) * 2024-03-08 2024-05-24 山东省计算中心(国家超级计算济南中心) Few-sample industrial product abnormality detection method and system based on multi-scene task


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination