CN111460988B - Illegal behavior recognition method and device - Google Patents

Illegal behavior recognition method and device

Info

Publication number
CN111460988B
CN111460988B (application CN202010241982.9A)
Authority
CN
China
Prior art keywords
image
feature
optical flow
neural network
convolution neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010241982.9A
Other languages
Chinese (zh)
Other versions
CN111460988A (en)
Inventor
郝翔宇
韩若冰
李相颖
戴婧姝
张通
焦书来
回博轩
马兴望
卢纯镇
周明
王曼曼
谷雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Hebei Electric Power Co Ltd
Cangzhou Power Supply Co of State Grid Hebei Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Hebei Electric Power Co Ltd
Cangzhou Power Supply Co of State Grid Hebei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Hebei Electric Power Co Ltd, Cangzhou Power Supply Co of State Grid Hebei Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202010241982.9A priority Critical patent/CN111460988B/en
Publication of CN111460988A publication Critical patent/CN111460988A/en
Application granted granted Critical
Publication of CN111460988B publication Critical patent/CN111460988B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a method and a device for identifying illegal behaviors, belonging to the field of image recognition. The method for identifying illegal behaviors comprises the following steps: acquiring an image sequence acquired in real time based on a camera; respectively inputting each frame of image of the image sequence into a spatial channel depth convolution neural network to extract a first feature of each frame of image of the image sequence; calculating optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm; recording the optical flow information in an image format to obtain an optical flow image; inputting the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image; fusing the first feature and the second feature into a third feature; identifying whether there is an illegal behavior in the image sequence based on the third feature; and if an illegal behavior exists, outputting alarm information. Because the two-channel depth convolution neural networks identify the illegal behaviors automatically, the method improves the working efficiency of identifying the illegal behaviors of personnel.

Description

Illegal behavior recognition method and device
Technical Field
The application belongs to the field of image recognition, and particularly relates to a method and a device for recognizing illegal behaviors.
Background
As an important link in power transmission, the transformer substation plays an irreplaceable role in the power grid. The number of substations grows substantially year by year, and the amount of equipment in them, together with the demands placed on it, keeps increasing, so the work performed inside substations has increased greatly and more operators are present on site. This makes safety control of the operation site more difficult and leads to frequent violations by on-site operators; such illegal operations create potential safety hazards for the equipment and the power grid and, in serious cases, cause personal safety accidents. It is therefore particularly important to monitor the behavior of field operators in real time.
At present, the detection of personnel violations in a transformer substation still relies on inspectors watching the work site or the videos shot by cameras in the substation. Manual monitoring, however, cannot cover the whole process without blind spots; its detection accuracy is low, its real-time performance is poor, and on-site violations cannot be discovered and stopped in time.
Disclosure of Invention
The application aims to provide a method and a device for identifying illegal behaviors, so as to improve the working efficiency of identifying the illegal behaviors of personnel.
To achieve the above object, a first aspect of the present application provides a method for identifying illegal actions, including:
acquiring an image sequence acquired in real time based on a camera;
respectively inputting each frame of image of the image sequence into a spatial channel depth convolution neural network to extract a first feature of each frame of image in the image sequence, wherein the spatial channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a first training set, and the first training set comprises: still images in which violations are recorded and the violation types corresponding to the still images;
calculating optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm;
recording the optical flow information in an image format to obtain an optical flow image;
inputting the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image, wherein the time channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a second training set, and the second training set comprises: dynamic images in which violations are recorded and the violation types corresponding to the dynamic images;
fusing the first feature and the second feature into a third feature;
identifying whether an offence exists in the image sequence based on the third feature;
and if the illegal behaviors exist, outputting alarm information.
In a first possible implementation manner of the first aspect of the present application, the recording of the optical flow information in an image format to obtain an optical flow image includes:
respectively storing the components of the optical flow information in the horizontal direction and the vertical direction and the magnitude of the optical flow into three channels of an RGB image, so as to obtain a color optical flow image.
In a second possible implementation manner, according to the first aspect of the present application or the first possible implementation manner of the first aspect of the present application, the fusing the first feature and the second feature into a third feature specifically includes:
and fusing the first characteristic and the second characteristic into a third characteristic based on a weighted average fusion algorithm.
In a third possible implementation manner, according to a second possible implementation manner of the first aspect of the present application, before the fusing the first feature and the second feature into the third feature, the method further includes:
normalizing the first feature and the second feature;
the fusing the first feature and the second feature into a third feature based on the weighted average fusion algorithm specifically comprises:
and fusing the normalized first feature and the second feature into a third feature based on a weighted average fusion algorithm.
In a fourth possible implementation manner according to the first aspect of the present application or the first possible implementation manner of the first aspect of the present application, the outputting the alarm information includes:
and outputting an alarm signal and an offence image, wherein the alarm signal is used for indicating that the offence exists in the image sequence, and the offence image is at least one image recorded with corresponding offence in the image sequence.
The second aspect of the present application provides an offence identification device, comprising:
the acquisition module is used for acquiring an image sequence acquired in real time based on the camera;
the first feature extraction module is configured to input each frame of image of the image sequence into a spatial channel depth convolution neural network, so as to extract a first feature of each frame of image in the image sequence, where the spatial channel depth convolution neural network is a depth convolution neural network that is obtained by training in advance based on a first training set, and the first training set includes: still images in which violations are recorded and the violation types corresponding to the still images;
the calculation module is used for calculating optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm;
the conversion module is used for recording the optical flow information in an image format to obtain an optical flow image;
the second feature extraction module is configured to input the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image, where the time channel depth convolution neural network is a depth convolution neural network that is obtained by training in advance based on a second training set, and the second training set includes: dynamic images in which violations are recorded and the violation types corresponding to the dynamic images;
the fusion module is used for fusing the first feature and the second feature into a third feature;
the identification module is used for identifying whether illegal behaviors exist in the image sequence or not based on the third characteristics;
and the output module is used for outputting alarm information if the illegal action exists.
In a first possible implementation manner, according to the second aspect of the present application, the above fusion module is specifically configured to:
and fusing the first characteristic and the second characteristic into a third characteristic based on a weighted average fusion algorithm.
In a second possible implementation manner, the device for identifying illegal actions further includes:
the processing module is used for carrying out normalization processing on the first characteristic and the second characteristic;
the fusion module is specifically used for: and fusing the normalized first feature and the second feature into a third feature based on a weighted average fusion algorithm.
A third aspect of the present application provides an offence identification device, including: a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the first aspect or any one of the possible implementations of the first aspect when the computer program is executed.
A fourth aspect of the application provides a computer readable storage medium storing a computer program which when executed by a processor performs the steps of the first aspect or any one of the possible implementations of the first aspect.
From the above, the application first acquires an image sequence acquired in real time based on a camera; extracts a first feature of each frame of image in the image sequence through a spatial channel depth convolution neural network; calculates optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm; records the optical flow information in an image format to obtain an optical flow image; extracts a second feature of the optical flow image through a time channel depth convolution neural network; then fuses the first feature and the second feature into a third feature and identifies whether illegal behaviors exist in the image sequence based on the third feature; if illegal behaviors exist, alarm information is output. Tedious processes such as image preprocessing in manual feature extraction are thereby omitted, and the working efficiency of identifying the illegal behaviors of personnel is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for identifying illegal behaviors according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of an apparatus for identifying illegal behaviors according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an apparatus for identifying illegal behaviors according to another embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, but the present application may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present application is not limited to the specific embodiments disclosed below.
Example 1
The embodiment of the application provides a method for identifying illegal behaviors, which is shown in fig. 1 and comprises the following steps:
step 11: acquiring an image sequence acquired in real time based on a camera;
step 12: respectively inputting each frame of image of the image sequence into a spatial channel depth convolution neural network to extract a first characteristic of each frame of image of the image sequence;
the spatial channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a first training set, wherein the first training set comprises: still images in which violations are recorded and the violation types corresponding to the still images;
specifically, the spatial channel depth convolutional neural network extracts a static feature (i.e., a first feature) from each frame of image that is stationary in the image sequence.
Optionally, the first training set may be a sample video image sequence in which violations are recorded; specifically, the sample video image sequence may include still images recording violation types such as a person not wearing a safety helmet, a person taking off a safety helmet, and a person crossing a fence.
Optionally, when training the deep convolutional neural network on the first training set, the still images recording violations taken from the sample video image sequence, together with the corresponding violation types, may be used as the training input, so as to obtain the spatial channel depth convolution neural network.
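For illustration only, the two-channel structure used in this embodiment can be sketched with an off-the-shelf backbone as follows. This is a minimal sketch under stated assumptions, not the patent's actual network: the patent does not name a backbone, and the class name TwoStreamNet, the use of ResNet-18 and the value of num_classes are illustrative assumptions.

```python
# Minimal sketch (not the patented network): the spatial channel takes a
# single RGB frame, the time channel takes the 3-channel optical flow image
# produced in step 14. ResNet-18 and all identifiers are assumptions.
import torch
import torch.nn as nn
from torchvision import models

class TwoStreamNet(nn.Module):
    def __init__(self, num_classes: int = 4):  # e.g. 3 violation types + "no violation"
        super().__init__()
        self.spatial = models.resnet18(num_classes=num_classes)   # first features
        self.temporal = models.resnet18(num_classes=num_classes)  # second features

    def forward(self, frame: torch.Tensor, flow_img: torch.Tensor):
        # Each channel produces its own feature vector; fusing them into the
        # third feature happens separately (see step 16 below).
        return self.spatial(frame), self.temporal(flow_img)
```

Each stream would be trained on its own training set (still images for the spatial channel, optical flow images for the time channel), mirroring the two training sets described above.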
Step 13: calculating optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm;
optionally, the optical flow estimation algorithm may be a high-precision optical flow estimation algorithm based on variational theory, which can accurately estimate the optical flow field and has good robustness. Specifically, the high-precision variational optical flow estimation algorithm is derived from a gray value constancy assumption, a gradient constancy assumption and a smoothness assumption.
Further, the gray value constancy assumption is specifically: when the position of a pixel changes between frames, its gray value remains unchanged, which can be expressed by the following formula:
I(x,y,t)=I(x+u,y+v,t+1)
wherein I(x,y,t) denotes the gray value of each frame of image in the time sequence, and (u,v) is the displacement of a pixel from time t to time t+1.
the gradient constancy assumption is specifically: since the gray value constancy assumption is sensitive to slight changes of illumination, the gray value must be allowed to change appropriately in order to determine the true displacement vector; a quantity is therefore needed that remains unchanged when the gray value of the matched pixel changes, namely the gradient of the gray value image, with the specific formula:
∇I(x,y,t)=∇I(x+u,y+v,t+1)
wherein ∇=(∂x,∂y)^T denotes the spatial gradient.
The smoothness assumption is specifically: pixel displacement estimation based on the above two assumptions only considers each pixel itself, without considering the adjacent pixels around it; this causes estimation errors once the gradient becomes uninformative somewhere or the aperture problem occurs, so a smoothness assumption on the optical flow field must be introduced. Since the optimal displacement field creates discontinuities at object boundaries, the smoothness assumption can be relaxed so that the optical flow field is only required to be piecewise smooth.
Based on the above three assumptions, a corresponding energy equation can be derived. Specifically, let x=(x,y,t)^T and w=(u,v,1)^T; combining the gray value constancy and gradient constancy assumptions gives the data term:
E_data(u,v)=∫ψ(|I(x+w)−I(x)|²+γ|∇I(x+w)−∇I(x)|²)dx
wherein γ is the weight balancing the gray value constancy assumption against the gradient constancy assumption, and ψ is a concave function added to the energy equation to further enhance its robustness:
ψ(s²)=√(s²+ε²)
wherein ε is a small positive number; as long as ε is small enough, ψ is guaranteed to remain convex in s, so the minimization of the energy equation can proceed smoothly; furthermore, ψ does not introduce additional parameter variables, ε can be set to a constant, and ε=0.001 can be used in the calculation. Adding the smoothness term E_smooth(u,v)=∫ψ(|∇u|²+|∇v|²)dx yields the total energy E(u,v)=E_data+α·E_smooth, wherein α weights the smoothness term.
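As a concrete, hedged illustration of step 13, the snippet below computes a dense flow field for every pair of adjacent frames. OpenCV's Farneback estimator is used purely as a readily available stand-in for the variational high-accuracy algorithm derived above, and the function name pairwise_flow is an assumption.

```python
# Sketch of step 13: dense optical flow for every two adjacent frames.
# cv2.calcOpticalFlowFarneback stands in for the variational method above.
import cv2

def pairwise_flow(frames):
    """frames: list of BGR images; yields one (H, W, 2) flow per adjacent pair."""
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for frame in frames[1:]:
        curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(
            prev, curr, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        yield flow  # flow[..., 0] = horizontal u, flow[..., 1] = vertical v
        prev = curr
```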
Step 14: recording the optical flow information in an image format to obtain an optical flow image;
optionally, after the optical flow information is obtained, components of the optical flow information in the horizontal direction and the vertical direction and magnitudes of the optical flow are respectively stored in three channels of an RGB image, so as to obtain a color optical flow image.
Optionally, since image pixel values lie in [0,255] while the results of the optical flow estimation algorithm are real numbers close to 0 that can be positive or negative, the optical flow information may be transformed into corresponding pixel values before the optical flow image is formed. The specific transformation formula is:
F=min[max[(flow×scale+128),0],255]
wherein F is the transformed pixel value; flow is the raw optical flow estimation result (the component of the optical flow information in the horizontal direction, the component in the vertical direction, or the amplitude of the optical flow); and scale is the magnification adjustment multiple. Preliminary experiments show that with scale set to 16, the optical flow results are amplified to a range spanning about 255; to prevent the values of individual points from going out of range after amplification, a saturating nonlinear transformation (the clipping above) is applied to the upper and lower bounds of the result.
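Putting the transform and the three-channel packing together, a minimal numpy sketch might read as follows; the function name flow_to_rgb is an assumption, while the clipping implements the transformation formula above.

```python
# Sketch of step 14: apply F = min[max[(flow*scale + 128), 0], 255] to the
# horizontal component u, the vertical component v and the flow magnitude,
# then store the three results in the channels of one color image.
import numpy as np

def flow_to_rgb(flow: np.ndarray, scale: float = 16.0) -> np.ndarray:
    """flow: (H, W, 2) array with u = flow[..., 0], v = flow[..., 1]."""
    u, v = flow[..., 0], flow[..., 1]
    mag = np.sqrt(u ** 2 + v ** 2)

    def transform(x):
        # Saturating nonlinear transform: clip to the valid pixel range.
        return np.clip(x * scale + 128.0, 0, 255).astype(np.uint8)

    return np.stack([transform(u), transform(v), transform(mag)], axis=-1)
```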
Alternatively, the components of the optical flow information in the horizontal direction and the vertical direction can be recorded through different single-channel gray images, so that two single-channel gray images are obtained as optical flow images.
Optionally, the obtained optical flow image may be directly input into the time channel depth convolutional neural network for feature extraction, or may be stored and then input into the time channel depth convolutional neural network for feature extraction, which is not limited herein.
Step 15: inputting the optical flow image into a time channel depth convolution neural network to extract a second characteristic of the optical flow image;
the time channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a second training set, wherein the second training set comprises: dynamic images in which violations are recorded and the violation types corresponding to the dynamic images;
specifically, the time-channel depth convolutional neural network extracts a dynamic feature (i.e., a second feature) between every two adjacent frames of images in the image sequence based on the optical flow image.
Optionally, the second training set may be a sample video image sequence in which violations are recorded; specifically, dynamic images (i.e., optical flow images) recording violation types such as a person not wearing a safety helmet, a person taking off a safety helmet, and a person crossing a fence may be obtained from the sample video image sequence.
Optionally, when training the deep convolutional neural network on the second training set, the dynamic images obtained from the sample video image sequence, together with the corresponding violation types, may be used as the training input, so as to obtain the time channel depth convolution neural network.
Step 16: fusing the first feature and the second feature into a third feature;
optionally, the first feature and the second feature are fused into a third feature based on a weighted average fusion algorithm, so as to reduce interference of useless frames or invalid frames in the image sequence.
Optionally, before the fusing of the first feature and the second feature into the third feature, the method further includes: normalizing the first feature and the second feature;
the fusing the first feature and the second feature into a third feature based on the weighted average fusion algorithm specifically comprises: and fusing the normalized first feature and the second feature into a third feature based on a weighted average fusion algorithm.
Specifically, the calculation formula for fusing the normalized first feature and second feature based on the weighted average fusion algorithm is:
x_final = c_spatial × x_spatial + c_temporal × x_temporal
wherein x_spatial and x_temporal are the normalized first feature and second feature, and c_spatial and c_temporal are the weighting coefficients of the spatial channel depth convolution neural network and the time channel depth convolution neural network respectively. Optionally, the weighting coefficients c_spatial and c_temporal may be 1/2 and 1/2, or 1/3 and 2/3 respectively, which is not limited here. Because the recognition accuracy of the time channel depth convolution neural network is generally higher than that of the spatial channel depth convolution neural network in behavior analysis research, appropriately increasing the weight of the second feature extracted by the time channel depth convolution neural network helps improve the final recognition accuracy.
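A minimal numeric sketch of this fusion step follows; the L2 normalization is an assumption (the patent does not fix the normalization), and the weights 1/3 and 2/3 follow the suggestion above.

```python
# Sketch of step 16: normalize both features, then weighted-average them:
# x_final = c_spatial * x_spatial + c_temporal * x_temporal
import numpy as np

def fuse(x_spatial, x_temporal, c_spatial=1/3, c_temporal=2/3):
    # L2 normalization puts the two channels on a comparable scale (assumption).
    xs = x_spatial / (np.linalg.norm(x_spatial) + 1e-12)
    xt = x_temporal / (np.linalg.norm(x_temporal) + 1e-12)
    return c_spatial * xs + c_temporal * xt  # the fused third feature
```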
Step 17: identifying whether an offence exists in the image sequence based on the third feature;
the first features and second features extracted respectively by the two-channel (namely, the time channel and the space channel) deep convolution neural networks are fused into the third feature, and whether illegal behaviors exist in the image sequence is recognized based on the third feature, which greatly improves the accuracy of the recognition result.
Step 18: and if the illegal behaviors exist, outputting alarm information.
Optionally, the outputting the alarm information includes: and outputting an alarm signal and an offence image, wherein the alarm signal is used for indicating that the offence exists in the image sequence, and the offence image is at least one image recorded with corresponding offence in the image sequence.
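Gluing steps 11 to 18 together for one adjacent frame pair, a hedged end-to-end sketch could reuse TwoStreamNet, pairwise_flow, flow_to_rgb and fuse from the sketches above; the class list CLASSES, the threshold and the alarm output are illustrative assumptions.

```python
# End-to-end sketch of steps 11-18 for one adjacent frame pair, reusing the
# sketches above. CLASSES, threshold and the alarm handling are assumptions.
import cv2
import numpy as np
import torch

CLASSES = ["no_violation", "no_helmet", "helmet_removed", "crossing_fence"]

def to_tensor(img: np.ndarray) -> torch.Tensor:
    return torch.from_numpy(img).permute(2, 0, 1).float().unsqueeze(0) / 255.0

@torch.no_grad()
def check_pair(model, frame_a, frame_b, threshold=0.5):
    flow = next(pairwise_flow([frame_a, frame_b]))                 # step 13
    flow_img = flow_to_rgb(flow)                                   # step 14
    x_s, x_t = model(to_tensor(frame_b), to_tensor(flow_img))      # steps 12, 15
    fused = fuse(x_s.squeeze(0).numpy(), x_t.squeeze(0).numpy())   # step 16
    label = CLASSES[int(fused.argmax())]                           # step 17
    if label != "no_violation" and float(fused.max()) > threshold:
        print("ALARM:", label)                                     # alarm signal
        cv2.imwrite("violation.jpg", frame_b)                      # offending image
```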
As can be seen from the above, the method for identifying illegal behaviors provided by the embodiment of the present application includes: acquiring an image sequence acquired in real time based on a camera; respectively inputting each frame of image of the image sequence into a spatial channel depth convolution neural network to extract a first feature of each frame of image of the image sequence; calculating optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm; recording the optical flow information in an image format to obtain an optical flow image; inputting the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image; fusing the first feature and the second feature into a third feature; identifying whether there is an illegal behavior in the image sequence based on the third feature; and outputting alarm information if an illegal behavior exists. Because the two-channel depth convolution neural networks identify the illegal behaviors automatically, the embodiment improves the working efficiency of identifying the illegal behaviors of personnel.
Example two
The embodiment of the application provides an illegal activity recognition device, and fig. 2 shows a schematic structural diagram of the illegal activity recognition device provided by the embodiment of the application.
Specifically, referring to fig. 2, the offence identification device 20 includes an acquisition module 21, a first feature extraction module 22, a calculation module 23, a conversion module 24, a second feature extraction module 25, a fusion module 26, an identification module 27, and an output module 28.
The acquisition module 21 is used for acquiring an image sequence acquired in real time based on a camera;
the first feature extraction module 22 is configured to input each frame of image of the image sequence into a spatial channel depth convolution neural network, so as to extract a first feature of each frame of image in the image sequence, where the spatial channel depth convolution neural network is a depth convolution neural network that is obtained by training in advance based on a first training set, and the first training set includes: still images in which violations are recorded and the violation types corresponding to the still images;
the calculating module 23 is configured to calculate optical flow information of each two adjacent frames of images in the image sequence based on an optical flow estimation algorithm;
the conversion module 24 is configured to record the optical flow information in an image format, so as to obtain an optical flow image;
the second feature extraction module 25 is configured to input the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image, where the time channel depth convolution neural network is a depth convolution neural network that is trained in advance based on a second training set, and the second training set includes: dynamic images in which violations are recorded and the violation types corresponding to the dynamic images;
the fusing module 26 is configured to fuse the first feature and the second feature into a third feature;
the identifying module 27 is configured to identify whether there is an offence in the image sequence based on the third feature;
the output module 28 is used for outputting alarm information if there is an illegal action.
Optionally, the calculation module 23 is specifically configured to: calculate the optical flow information of every two adjacent frames of images in the image sequence with the high-precision optical flow estimation algorithm based on variational theory.
Optionally, the conversion module 24 is specifically configured to: after the optical flow information is obtained, store the components of the optical flow information in the horizontal direction and the vertical direction and the magnitude of the optical flow respectively into three channels of an RGB image, so as to obtain a color optical flow image.
Optionally, the conversion module 24 may be further configured to: before obtaining an optical flow image, the optical flow information is converted into corresponding pixel values.
Optionally, the above-mentioned fusion module 26 is specifically configured to: and fusing the first characteristic and the second characteristic into a third characteristic based on a weighted average fusion algorithm.
Optionally, the above-mentioned illegal activity recognition device 20 further includes: a processing module (not shown in the figure) for normalizing the first feature and the second feature; the above-mentioned fusion module 26 is specifically for: and fusing the normalized first feature and the second feature into a third feature based on a weighted average fusion algorithm.
Optionally, the output module 28 is specifically configured to: and outputting an alarm signal and an offence image, wherein the alarm signal is used for indicating that the offence exists in the image sequence, and the offence image is at least one image recorded with corresponding offence in the image sequence.
From the above, the device 20 for identifying illegal behaviors provided by the embodiment of the application can acquire an image sequence acquired in real time based on a camera; respectively input each frame of image of the image sequence into a spatial channel depth convolution neural network to extract a first feature of each frame of image of the image sequence; calculate optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm; record the optical flow information in an image format to obtain an optical flow image; input the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image; fuse the first feature and the second feature into a third feature; identify whether there is an illegal behavior in the image sequence based on the third feature; and output alarm information if an illegal behavior exists. Because the two-channel depth convolution neural networks identify the illegal behaviors automatically, the embodiment improves the working efficiency of identifying the illegal behaviors of personnel.
Example III
The embodiment of the present application further provides an apparatus for identifying an offence, referring to fig. 3, where the apparatus for identifying an offence includes a memory 31, a processor 32, and a computer program stored in the memory 31 and executable on the processor 32, where the memory 31 is used to store a software program and a module, and the processor 32 executes various functional applications and data processing by running the software program and the module stored in the memory 31. The memory 31 and the processor 32 are connected by a bus 33. Specifically, the processor 32 realizes the following steps by running the above-mentioned computer program stored in the memory 31:
acquiring an image sequence acquired in real time based on a camera;
respectively inputting each frame of image of the image sequence into a spatial channel depth convolution neural network to extract a first feature of each frame of image in the image sequence, wherein the spatial channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a first training set, and the first training set comprises: still images in which violations are recorded and the violation types corresponding to the still images;
calculating optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm;
recording the optical flow information in an image format to obtain an optical flow image;
inputting the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image, wherein the time channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a second training set, and the second training set comprises: dynamic images in which violations are recorded and the violation types corresponding to the dynamic images;
fusing the first feature and the second feature into a third feature;
identifying whether an offence exists in the image sequence based on the third feature;
and if the illegal behaviors exist, outputting alarm information.
Assuming that the above is a first possible embodiment, in a second possible embodiment provided by way of the first possible embodiment, the recording the optical flow information in the image format to obtain the optical flow image includes:
and respectively storing the components of the optical flow information in the horizontal direction and the vertical direction and the magnitude of the optical flow into three channels of an RGB image to obtain a color optical flow image.
In a third possible implementation manner provided by the first possible implementation manner or the second possible implementation manner, the fusing the first feature and the second feature into a third feature is specifically:
and fusing the first characteristic and the second characteristic into a third characteristic based on a weighted average fusion algorithm.
In a fourth possible implementation manner provided by the third possible implementation manner, before the fusing the first feature and the second feature into the third feature, the method further includes:
normalizing the first feature and the second feature;
the fusing the first feature and the second feature into a third feature based on the weighted average fusion algorithm specifically comprises:
and fusing the normalized first feature and the second feature into a third feature based on a weighted average fusion algorithm.
In a fifth possible embodiment provided by the first possible embodiment or the second possible embodiment as a basis, the outputting the alarm information includes:
and outputting an alarm signal and an offence image, wherein the alarm signal is used for indicating that the offence exists in the image sequence, and the offence image is at least one image recorded with corresponding offence in the image sequence.
It should be appreciated that in embodiments of the present application, the processor 32 may be a central processing unit (Central Processing Unit, CPU); the processor 32 may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Memory 31 may include read-only memory, flash memory, and random access memory, and provides instructions and data to the processor. Some or all of the memory 31 may also include nonvolatile random access memory.
From the above, the device for identifying illegal behaviors provided by the embodiment of the application can acquire an image sequence acquired in real time based on a camera; respectively input each frame of image of the image sequence into a spatial channel depth convolution neural network to extract a first feature of each frame of image of the image sequence; calculate optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm; record the optical flow information in an image format to obtain an optical flow image; input the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image; fuse the first feature and the second feature into a third feature; identify whether there is an illegal behavior in the image sequence based on the third feature; and output alarm information if an illegal behavior exists. Because the two-channel depth convolution neural networks identify the illegal behaviors automatically, the embodiment improves the working efficiency of identifying the illegal behaviors of personnel.
It should be appreciated that the above-described integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the present application may implement all or part of the flow of the methods of the above embodiments by instructing related hardware through a computer program; the computer program may be stored in a computer-readable storage medium, and when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, and the computer program code can be in a source code form, an object code form, an executable file or some intermediate form and the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. The content of the computer readable storage medium can be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
It should be noted that, the method and the details thereof provided in the foregoing embodiments may be combined into the apparatus and the device provided in the embodiments, and are referred to each other and are not described in detail.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/device embodiments described above are merely illustrative, e.g., the division of modules or elements described above is merely a logical functional division, and may be implemented in other ways, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (5)

1. A method for identifying violations, comprising:
acquiring an image sequence acquired in real time based on a camera in a transformer substation;
respectively inputting each frame of image of the image sequence into a spatial channel depth convolution neural network to extract a first feature of each frame of image in the image sequence, wherein the spatial channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a first training set, and the first training set comprises: still images in which violations are recorded and the violation types corresponding to the still images, the violation types including personnel not wearing safety helmets, personnel taking off safety helmets, and personnel crossing the fence;
calculating optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm;
recording the optical flow information in an image format to obtain an optical flow image;
inputting the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image, wherein the time channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a second training set, and the second training set comprises: dynamic images in which violations are recorded and the violation types corresponding to the dynamic images;
fusing the first feature and the second feature into a third feature based on a weighted average fusion algorithm;
identifying whether there is an offence in the sequence of images based on the third feature;
if the illegal behaviors exist, outputting alarm information;
wherein, the recording the optical flow information in the image format to obtain an optical flow image includes:
the components of the optical flow information in the horizontal direction and the vertical direction and the amplitude of the optical flow are respectively stored in three channels of an RGB image, so that a color optical flow image is obtained;
before obtaining the optical flow image, further comprising:
according to a transformation formula, transforming the components of the optical flow information in the horizontal direction and the vertical direction and the amplitude of the optical flow into corresponding pixel values; the transformation formula is as follows:
F=min[max[(flow×scale+128),0],255];
wherein F is the pixel value after transformation; flow is the component of the optical flow information in the horizontal direction, the component in the vertical direction, or the amplitude of the optical flow; scale is the magnification adjustment multiple; and image pixel values take values in [0,255];
wherein before the fusing of the first feature and the second feature into the third feature based on the weighted average fusion algorithm, the method further comprises:
normalizing the first feature and the second feature;
the fusing the first feature and the second feature into a third feature based on a weighted average fusion algorithm specifically comprises the following steps:
fusing the normalized first feature and the normalized second feature into a third feature based on the weighted average fusion algorithm, wherein the fusion applies a first formula;
the first formula is:
x_final = c_spatial × x_spatial + c_temporal × x_temporal
wherein x_final is the third feature, x_spatial is the first feature, x_temporal is the second feature, c_spatial is the weighting coefficient of the spatial channel depth convolution neural network, and c_temporal is the weighting coefficient of the time channel depth convolution neural network; the weighting coefficient of the spatial channel depth convolution neural network is smaller than the weighting coefficient of the time channel depth convolution neural network.
2. The method of claim 1, wherein outputting the alarm information comprises:
and outputting an alarm signal and an offence image, wherein the alarm signal is used for indicating that the offence exists in the image sequence, and the offence image is at least one image with corresponding offence recorded in the image sequence.
3. An offence identification device, comprising:
the acquisition module is used for acquiring an image sequence acquired in real time based on a camera in the transformer substation;
the first feature extraction module is configured to input each frame of image of the image sequence into a spatial channel depth convolution neural network respectively, so as to extract a first feature of each frame of image in the image sequence, wherein the spatial channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a first training set, and the first training set comprises: still images in which violations are recorded and the violation types corresponding to the still images, the violation types including personnel not wearing safety helmets, personnel taking off safety helmets, and personnel crossing the fence;
the calculation module is used for calculating optical flow information of every two adjacent frames of images in the image sequence based on an optical flow estimation algorithm;
the conversion module is used for recording the optical flow information in an image format to obtain an optical flow image;
the second feature extraction module is configured to input the optical flow image into a time channel depth convolution neural network to extract a second feature of the optical flow image, wherein the time channel depth convolution neural network is a depth convolution neural network obtained by training in advance on a second training set, and the second training set comprises: dynamic images in which violations are recorded and the violation types corresponding to the dynamic images;
the fusion module is used for fusing the first feature and the second feature into a third feature based on a weighted average fusion algorithm;
the identification module is used for identifying whether illegal behaviors exist in the image sequence or not based on the third characteristics;
the output module is used for outputting alarm information if the illegal behaviors exist;
the conversion module is specifically used for: storing the components of the optical flow information in the horizontal direction and the vertical direction and the amplitude of the optical flow respectively into three channels of an RGB image, so as to obtain a color optical flow image;
the conversion module is further configured to: before the optical flow image is obtained, converting the components of the optical flow information in the horizontal direction and the vertical direction and the amplitude of the optical flow into corresponding pixel values according to a conversion formula; the transformation formula is as follows:
F=min[max[(flow×scale+128),0],255];
wherein F is the pixel value after transformation; flow is the component of the optical flow information in the horizontal direction, the component in the vertical direction, or the amplitude of the optical flow; scale is the magnification adjustment multiple; and image pixel values take values in [0,255];
the processing module is used for carrying out normalization processing on the first characteristic and the second characteristic;
the fusion module is specifically used for: fusing the normalized first feature and the normalized second feature into a third feature based on the weighted average fusion algorithm, wherein the fusion applies a first formula;
the first formula is:
x_final = c_spatial × x_spatial + c_temporal × x_temporal
wherein x_final is the third feature, x_spatial is the first feature, x_temporal is the second feature, c_spatial is the weighting coefficient of the spatial channel depth convolution neural network, and c_temporal is the weighting coefficient of the time channel depth convolution neural network; the weighting coefficient of the spatial channel depth convolution neural network is smaller than the weighting coefficient of the time channel depth convolution neural network.
4. An offence identification device, comprising: memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 2 when the computer program is executed.
5. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 2.
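For orientation, here is a minimal sketch tying the claimed modules together end to end, reusing the flow_to_rgb_image and fuse_features helpers sketched above; the spatial_cnn, temporal_cnn, and classifier callables are illustrative stand-ins for the trained networks, not the patent's implementation:

    import numpy as np

    def recognize_violations(frames, spatial_cnn, temporal_cnn, classifier,
                             threshold=0.5):
        """frames: list of grayscale frames (H x W uint8 arrays) of one clip."""
        # Spatial channel: appearance features from the raw frames.
        x_spatial = spatial_cnn(np.stack(frames))
        # Temporal channel: motion features from the color optical flow
        # images built between consecutive frames (flow_to_rgb_image above).
        flows = [flow_to_rgb_image(a, b) for a, b in zip(frames, frames[1:])]
        x_temporal = temporal_cnn(np.stack(flows))
        # Normalize and fuse the two features, then classify the result
        # (fuse_features above implements the weighted-average fusion).
        x_final = fuse_features(x_spatial, x_temporal)
        score = classifier(x_final)  # probability a violation is present
        if score > threshold:
            print("ALARM: suspected violation (score=%.2f)" % score)
        return bool(score > threshold)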
CN202010241982.9A 2020-03-31 2020-03-31 Illegal behavior recognition method and device Active CN111460988B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010241982.9A CN111460988B (en) 2020-03-31 2020-03-31 Illegal behavior recognition method and device

Publications (2)

Publication Number Publication Date
CN111460988A CN111460988A (en) 2020-07-28
CN111460988B (en) 2023-08-22

Family

ID=71685087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010241982.9A Active CN111460988B (en) 2020-03-31 2020-03-31 Illegal behavior recognition method and device

Country Status (1)

Country Link
CN (1) CN111460988B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112417989A (en) * 2020-10-30 2021-02-26 四川天翼网络服务有限公司 Invigilator violation identification method and system
CN112347916B (en) * 2020-11-05 2023-11-17 安徽继远软件有限公司 Video image analysis-based power field operation safety monitoring method and device
CN112818868B (en) * 2021-02-03 2024-05-28 招联消费金融股份有限公司 Method and device for identifying illegal user based on behavior sequence characteristic data
CN113284350B (en) * 2021-05-21 2022-08-16 深圳市大道至简信息技术有限公司 Laser-based illegal parking detection method and system
CN113361444B (en) * 2021-06-22 2021-12-14 北京容联易通信息技术有限公司 Image processing method and device, electronic equipment and storage medium
CN114283485B (en) * 2022-03-04 2022-10-14 杭州格物智安科技有限公司 Safety helmet wearing detection method and device, storage medium and safety helmet
CN117422679B (en) * 2023-10-20 2024-05-31 浙江大学 Crack change monitoring method based on time sequence image and dense optical flow estimation

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102201165A (en) * 2010-03-25 2011-09-28 北京汉王智通科技有限公司 Monitoring system of vehicle traffic violation at crossing and method thereof
CN103116756A (en) * 2013-01-23 2013-05-22 北京工商大学 Face detecting and tracking method and device
DE102012109481A1 (en) * 2012-10-05 2014-04-10 Faro Technologies, Inc. Device for optically scanning and measuring an environment
CN107666594A (en) * 2017-09-18 2018-02-06 广东电网有限责任公司东莞供电局 Method for monitoring illegal operation in real time through video monitoring
CN107730904A (en) * 2017-06-13 2018-02-23 银江股份有限公司 Multitask vehicle driving in reverse vision detection system based on depth convolutional neural networks
CN108986473A (en) * 2017-05-31 2018-12-11 蔚来汽车有限公司 Vehicle mounted traffic unlawful practice identification and processing system and method
CN110135345A (en) * 2019-05-15 2019-08-16 武汉纵横智慧城市股份有限公司 Activity recognition method, apparatus, equipment and storage medium based on deep learning
CN110166741A (en) * 2019-04-15 2019-08-23 深圳壹账通智能科技有限公司 Environment control method, device, equipment and storage medium based on artificial intelligence

Similar Documents

Publication Publication Date Title
CN111460988B (en) Illegal behavior recognition method and device
CN107832721B (en) Method and apparatus for outputting information
CN110472566B (en) High-precision fuzzy face recognition method
CN113344475B (en) Transformer bushing defect identification method and system based on sequence modal decomposition
CN111666845B (en) Small sample deep learning multi-mode sign language recognition method based on key frame sampling
CN114998830A (en) Wearing detection method and system for safety helmet of transformer substation personnel
CN116579616B (en) Risk identification method based on deep learning
CN113705361A (en) Method and device for detecting model in living body and electronic equipment
CN114881867A (en) Image denoising method based on deep learning
CN116092119A (en) Human behavior recognition system based on multidimensional feature fusion and working method thereof
CN116993738A (en) Video quality evaluation method and system based on deep learning
CN117671396A (en) Intelligent monitoring and early warning system and method for construction progress
CN117274759A (en) Infrared and visible light image fusion system based on distillation-fusion-semantic joint driving
CN108664886A (en) A kind of fast face recognition method adapting to substation's disengaging monitoring demand
CN117576632B (en) Multi-mode AI large model-based power grid monitoring fire early warning system and method
CN107578372A (en) Image processing method, device, computer-readable recording medium and electronic equipment
US20220392225A1 (en) Concept for Detecting an Anomaly in Input Data
CN107770446A (en) Image processing method, device, computer-readable recording medium and electronic equipment
CN112651276A (en) Power transmission channel early warning system based on double-light fusion and early warning method thereof
CN112288779A (en) Target tracking method and device
CN112733714A (en) Automatic crowd counting image identification method based on VGG network
CN113506441B (en) Municipal bridge traffic early warning control method
CN118015737B (en) Intelligent door lock joint control system based on Internet of Things
CN117636908B (en) Digital mine production management and control system
CN110096951B (en) Video saliency detection method and device based on Boolean diagram theory and storage device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant