CN111127355A - Method for finely complementing defective light flow graph and application thereof - Google Patents

Method for finely complementing defective light flow graph and application thereof

Info

Publication number
CN111127355A
Authority
CN
China
Prior art keywords
image
smoke
area
optical flow
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911302248.2A
Other languages
Chinese (zh)
Inventor
张伟伟
郭鹏宇
李传昌
陈超
邱永锋
陈彦召
赵建波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hongrun Construction Waterproof Engineering Co Ltd
Shanghai University of Engineering Science
Foshan Viomi Electrical Technology Co Ltd
Original Assignee
Shanghai Hongrun Construction Waterproof Engineering Co Ltd
Shanghai University of Engineering Science
Foshan Viomi Electrical Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hongrun Construction Waterproof Engineering Co Ltd, Shanghai University of Engineering Science, Foshan Viomi Electrical Technology Co Ltd filed Critical Shanghai Hongrun Construction Waterproof Engineering Co Ltd
Priority to CN201911302248.2A priority Critical patent/CN111127355A/en
Publication of CN111127355A publication Critical patent/CN111127355A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/001 Texturing; Colouring; Generation of texture or colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/40 Analysis of texture


Abstract

The invention discloses a method for finely completing a defective optical flow map, and an application thereof. The method comprises the following steps: collect a smoke picture I in real time, process it to obtain an optical flow image, and locate the dynamic smoke occlusion area; extract an image area from the center of the minimum circumscribed square of the occlusion area and input it into a trained improved GAN network to obtain a substitute image for the occluded area, the improvement being that the adversarial loss in the loss function of the GAN network's context network module is replaced with the WGAN loss; process the substitute image of the dynamic occlusion area through a VGG-19 network to add texture information; and map the resulting image onto the optical flow image to obtain a complete global optical flow map. The method of the invention has a good completion effect and a high processing speed; the electronic device has a simple structure, low cost and a good application prospect.

Description

Method for finely complementing defective light flow graph and application thereof
Technical Field
The invention belongs to the technical field of image processing and deep learning, and relates to a method for finely complementing a defective optical flow graph and application thereof.
Background
With the continuous development of machine vision and image processing, visual detection has become more and more accurate, yet occlusion remains a difficult point for image processing technology to break through, being a common source of interference in vision. When a moving object in the field of view is detected by an optical flow method, many things, including other objects, may cause occlusion. Occlusion means that part of the object is blocked, so that part of its information is lost. A vision system obviously cannot detect what is not present in the image, which affects the robustness of moving object detection. Taking smoke concentration detection as an example, accurate evaluation of the smoke concentration depends to a great extent on estimating the whole smoke flow field, and the precision of the whole smoke flow field depends on predicting the smoke optical flow behind the occlusion. Unfortunately, because light intensity matching is unreliable, the optical flow is uncertain at the occlusion; to solve this problem, the global smoke optical flow must be obtained accurately.
Document 1 (C. Yang, X. Lu, Z. Lin, E. Shechtman, O. Wang and H. Li, High-Resolution Image Inpainting using Multi-Scale Neural Patch Synthesis, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), arXiv:1611.09969, 2017) describes a method of training a GAN network with a combination of L2 loss and adversarial loss and then using it to complete a defective optical flow map. The GAN is a generative adversarial model and also an unsupervised learning model, whose chief characteristic is providing an adversarial training mode for deep networks, which helps solve some problems not easily solved by common training modes. However, the loss function of the original GAN has the following defect: the better the discriminator D is trained, the more severely the gradient of the generator G vanishes, which limits the training of G; that is, the training stability is poor and collapse occurs easily.
Therefore, the development of a method for finely complementing the defective optical flow diagram with good complementing effect and high processing speed is of practical significance.
Disclosure of Invention
The invention aims to overcome the defects of poor completion effect, large data processing amount and low processing speed in the prior art, and provides a method with a good completion effect and a high processing speed for finely completing a defective optical flow map. It solves the problem of optical flow map defects caused by occlusion of a flowing object, thereby greatly improving the integrity of the optical flow map and providing a reliable foundation for subsequent processing; it is suitable for fine completion of the optical flow map of a flowing object.
In order to achieve the purpose, the invention provides the following technical scheme:
a method for finely completing a defective light flow graph comprises the following steps:
(1) collecting a smoke picture I in real time, processing the smoke picture I to obtain an optical flow image, and detecting and positioning a smoke dynamic shielding area;
(2) extracting an image area B from the center of the minimum circumscribed square of the dynamic occlusion area, and inputting the image area B into the trained improved GAN network to obtain a substitute image of the dynamic occlusion area;
the improved GAN network is a GAN network whose context network module's loss function is improved: the adversarial loss in the loss function of the context network module of the GAN network is replaced with the WGAN loss in the loss function of the context network module of the improved GAN network;
in the training process of the improved GAN network, an image area A of one frame of image in a historical data set is used as input, a picture of a corresponding dynamic shielding area of another frame of image in the historical data set is used as theoretical output, and parameters of the improved GAN network are continuously adjusted, wherein the training termination condition is that the similarity between the actual output and the theoretical output reaches more than 98%;
(3) processing the alternative image of the dynamic occlusion area through a VGG-19 network, and adding texture information;
(4) and (4) mapping the image obtained in the step (3) to the optical flow image obtained in the step (1) to obtain a complete global optical flow diagram.
According to the method for finely completing the defective optical flow map, Wasserstein GAN (WGAN) loss replaces the original adversarial loss, and a method combining L2 loss and WGAN loss is used to train the improved GAN network, which improves the stability of network training and generates the missing content better. Meanwhile, the method deletes the multi-scale scheme of the original network and directly takes an image area of a fixed size as input, avoiding the original scheme's repeated computation over networks of different scales; this greatly reduces the data processing amount and greatly improves the operating efficiency of the network. Finally, the texture information of the substitute image of the dynamic occlusion area is added using the VGG-19 network structure, so that the completed optical flow map is closer to the real texture; fine completion of the defective optical flow map is thus realized, and the method has a great application prospect.
As a preferred technical scheme:
the method for finely complementing the defective light flow graph comprises the following steps that a group of historical data sets comprises a plurality of adjacent frames of images without smoke and a plurality of adjacent frames of images containing smoke, which are shot at the same position and at the same shooting angle; and the image area A is obtained by processing an image positioning smoke dynamic occlusion area in the historical data set and extracting the image positioning smoke dynamic occlusion area from the center of the minimum circumscribed square of the dynamic occlusion area.
In the method for finely completing the defective optical flow map, the size of the image area B is 256 × 256 pixels, the same as that of the image area A; the sizes of the image areas A and B are not limited thereto, but should be the same.
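As an illustration, extracting a fixed-size image area centered on the minimum circumscribed square of the occlusion can be sketched as follows. This is a minimal sketch, not the patent's implementation; the function names and the clamping of the crop to the image borders are assumptions.

```python
import numpy as np

def bounding_square_center(mask):
    """Center (row, col) of the minimal bounding square of a binary mask."""
    rows, cols = np.nonzero(mask)
    r0, r1 = rows.min(), rows.max()
    c0, c1 = cols.min(), cols.max()
    return (r0 + r1) // 2, (c0 + c1) // 2

def extract_region(image, mask, size=256):
    """Crop a size x size patch centered on the occlusion's bounding square."""
    cr, cc = bounding_square_center(mask)
    h, w = image.shape[:2]
    # Clamp the crop origin so the patch stays inside the image.
    r = min(max(cr - size // 2, 0), h - size)
    c = min(max(cc - size // 2, 0), w - size)
    return image[r:r + size, c:c + size]
```

The same helper can serve for both training patches (image area A) and inference patches (image area B), since the patent requires only that their sizes match.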
The method for performing fine completion on the defective light flow graph comprises the following specific steps of (1):
(1.1) preprocessing a smoke picture I after the smoke picture I is collected in real time, wherein the preprocessing refers to the construction of smoke information data with labeled information;
(1.2) taking the preprocessed smoke picture I as input to the trained LiteFlowNet to obtain an optical flow image of the smoke picture I;
the training process of the LiteFlowNet takes a preprocessed smoke picture II as input and the optical flow image generated from the smoke picture II as theoretical output, continuously adjusting the parameters of the LiteFlowNet; the training termination condition is that the maximum iteration number P is reached;
(1.3) performing convolution on the optical flow image obtained in the step (1.2) by using a LoG filter to obtain a filtered optical flow image;
(1.4) searching a zero crossing position in the image obtained in the step (1.3), and further determining the position of an optical flow edge between different objects on the image, namely determining a dynamic occlusion area;
(1.5) introducing image semantic information to judge whether the shielded area obtained in the step (1.4) is smoke or not;
and (1.6) detecting the optical flow generated by dynamic occlusion, and mapping the optical flow image obtained in the step (1.2) back to the dynamic occlusion area.
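The steps above can be sketched as a single pipeline. The flow estimator (e.g. a LiteFlowNet wrapper), LoG filter, zero-crossing search and semantic smoke test are injected as callables, since the patent does not fix their implementations; treating step (1.6) as masking out the non-smoke occlusion flow is an assumed reading of the text.

```python
import numpy as np

def locate_dynamic_occlusion(frame_a, frame_b, flow_fn, log_filter_fn,
                             zero_cross_fn, is_smoke_fn):
    """Sketch of steps (1.2)-(1.6): estimate flow, LoG-filter it, find the
    occlusion edges, check semantics, and mask out non-smoke flow."""
    flow = flow_fn(frame_a, frame_b)         # (1.2) optical flow, H x W x 2
    filtered = log_filter_fn(flow)           # (1.3) LoG-filtered flow
    occlusion = zero_cross_fn(filtered)      # (1.4) boolean H x W mask
    if not is_smoke_fn(frame_a, occlusion):  # (1.5) semantic check
        # (1.6) remove the dynamic occlusion's flow, leaving pure smoke flow
        flow = np.where(occlusion[..., None], 0.0, flow)
    return flow, occlusion
```

Each callable can later be swapped for a real implementation (a trained LiteFlowNet, the LoG convolution of step (1.3), and so on) without changing the pipeline's shape.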
The smoke dynamic occlusion area detection and positioning method first constructs LiteFlowNet and trains it with a historical database (comprising preprocessed smoke pictures II and their corresponding optical flow images); secondly, the smoke picture I acquired in real time is preprocessed and input into the trained LiteFlowNet to obtain the optical flow image; the optical flow image is filtered and zero crossing positions are searched to determine the dynamic occlusion area; image semantic information is then introduced to judge the dynamic occlusion area; finally, the dynamic occlusion area is removed from the optical flow image, so that only pure smoke optical flow remains on the optical flow image. The method distinguishes the optical flow of the moving occluding object from that of the smoke according to the connectivity of the optical flow area and the similarity of semantic information, providing a new idea for solving the problem of dynamic object occlusion in smoke detection. It processes a small amount of data, has low hardware requirements, a high processing speed, a good processing effect and a great application prospect, and can be widely applied to the detection of anomalies such as factory fires, forest fires and automobile exhaust.
In the method for finely completing the defective optical flow map, the smoke picture II is acquired by a camera and comprises at least 30 videos and 1000 smoke image pairs; meanwhile, to avoid network overfitting caused by a shortage of training data, methods such as image translation, mirroring, bilinear interpolation and image scaling can be used to augment the training data, making the network more robust and obtaining a better optical flow estimate;
the LiteFlowNet, with its extremely small parameter count and extremely high precision, is the preferred network for the first-step optical flow detection; it has two sub-networks, NetC and NetE. NetC extracts two multi-scale high-dimensional pyramid features from any input picture pair, and NetE estimates the coarse-to-fine flow field of the smoke; the NetC sub-network is designed as a two-stream network, and the two streams share weights with each other.
In the method for finely completing the defective optical flow map, the Gaussian G(r) underlying the LoG operator has the following formula:
G(r) = exp(-r²/(2σ²)), with r² = i² + j²
The LoG filter ∇²G(r) is the second derivative of G(r), formulated as follows:
∇²G(r) = ((r² - 2σ²)/σ⁴) · exp(-r²/(2σ²))
The convolution formula is as follows:
g(i, j) = ∇²G(i, j) * I0(i, j)
wherein i and j represent the abscissa and ordinate of a pixel in the image respectively, I0(i, j) and g(i, j) represent the optical flow image obtained in step (1.2) and the filtered optical flow image respectively, and σ represents the scale (standard deviation) of the Gaussian;
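A minimal sketch of the LoG filtering step, assuming a common discrete normalization of the kernel (the patent fixes neither the kernel size nor σ, so both defaults here are assumptions):

```python
import numpy as np

def log_kernel(size=9, sigma=1.4):
    """Discrete Laplacian-of-Gaussian kernel ((r^2 - 2*sigma^2)/sigma^4) *
    exp(-r^2 / (2*sigma^2)), shifted to sum to zero so that flat regions
    give a zero response."""
    half = size // 2
    i, j = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    r2 = i ** 2 + j ** 2
    k = ((r2 - 2 * sigma ** 2) / sigma ** 4) * np.exp(-r2 / (2 * sigma ** 2))
    return k - k.mean()

def convolve2d(image, kernel):
    """Plain 'same'-size 2-D convolution with zero padding (no SciPy)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros(image.shape, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * flipped)
    return out
```

Applied to each channel of the optical flow image, the response changes sign across flow discontinuities, which is what the zero-crossing search of step (1.4) exploits.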
the process of searching the zero crossing position is specifically as follows:
i) take a part of the image obtained in the step (1.3) and magnify it to the pixel level;
ii) in the 3×3 area centered on a pixel point X, examine the signs of the four opposing pixel pairs: above/below, left/right, and the two diagonals;
iii) if, for one of the four pairs, the signs of the gray values filtered in the step (1.3) are opposite and the absolute value of the corresponding gray value difference is less than a threshold a, then the pixel point X is a zero crossing position.
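The three-step search can be sketched as follows. Marking X when any one of the four opposing pairs qualifies is an assumed reading of the text, and note that classical Marr-Hildreth detectors usually test the pair difference with a greater-than comparison; the less-than comparison here simply follows the text above.

```python
import numpy as np

def zero_crossings(g, a):
    """Mark pixel X as a zero crossing if, within its 3x3 neighbourhood,
    one of the four opposing pairs (vertical, horizontal, two diagonals)
    has opposite signs and an absolute difference below threshold a."""
    h, w = g.shape
    out = np.zeros((h, w), dtype=bool)
    pairs = [((-1, 0), (1, 0)), ((0, -1), (0, 1)),
             ((-1, -1), (1, 1)), ((-1, 1), (1, -1))]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            for (dr1, dc1), (dr2, dc2) in pairs:
                p, q = g[r + dr1, c + dc1], g[r + dr2, c + dc2]
                if p * q < 0 and abs(p - q) < a:
                    out[r, c] = True
                    break
    return out
```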
According to the method for finely complementing the defective light flow graph, after all zero crossing positions are found, all zero crossing position points are mapped back to the light flow image obtained in the step (1.2), and then the image with the edge of the shielding area can be obtained;
the dynamic occlusion regions are extracted with a Marr-Hildreth (MH) zero-crossing detector;
introducing image semantic information to judge whether the occluded area obtained in the step (1.4) is smoke means judging based on the smoke gray-scale characteristics: if the gray value of the occluded area is greater than 220 or less than 80, the area is not smoke; otherwise, it is smoke;
inputting two adjacent frames in the preprocessed smoke picture I by the LiteFlowNet, and outputting optical flow images corresponding to the two adjacent frames;
the P is 10000; there are generally two situations for the neural network to stop: 1. the set precision requirement is met; 2, the maximum iteration number is reached, the second method is adopted here, the iteration number is set to 10000 (general situation), namely when the iteration number reaches 10000 in the training process, the training is terminated, at this time, the activation function of the LiteFlowNet network loss function is converged, the training is completed, and of course, a person skilled in the art can select the maximum iteration number P and the condition for stopping the training according to the actual requirement;
the picture resolution of the smoke picture I and the smoke picture II is greater than or equal to 1920 × 1080 pixels.
In the method for finely completing the defective optical flow map, the calculation formula of the Loss function Loss_D of the context network module of the improved GAN network is as follows:
Loss_D = λ·L_l2(S, S_g, R) + (1 - λ)·L_D^WGAN(S, S_g, D_w);
wherein λ is a constant, specifically 0.9;
L_l2(S, S_g, R) and L_D^WGAN(S, S_g, D_w) in the above formula are derived as follows.
For each training image, the L2 loss is defined as follows:
L_l2(S, S_g, R) = ‖h(R(S), M) - h(S_g, M)‖₂²
where R(S) represents the response of the network to the input image S, and h(·, M) defines the operation of extracting the sub-image in the missing region, i.e. h(S_g, M) returns the content of S_g within the mask M. The Wasserstein distance, also called the earth mover's distance (EM distance), can be used as a distance measure between two probability distributions, and is defined as follows:
W(P_r, P_g) = inf over γ ∈ Π(P_r, P_g) of E_(x,y)∼γ[‖x - y‖]
However, this formula cannot be calculated directly; an equivalent, computable form of the Wasserstein distance can be obtained from the Kantorovich-Rubinstein duality:
W(P_r, P_g) = sup over ‖f‖_L ≤ 1 of ( E_x∼P_r[f(x)] - E_x∼P_g[f(x)] )
Now the discrimination network D_w can be constructed with parameters w, the last layer of the network not being a nonlinear activation layer. With w restricted to a certain range,
L_WGAN = E_x∼S_g[D_w(x)] - E_x∼S_r[D_w(x)]
takes its maximum value and approximates the distance between the filled image S_r and the ground truth S_g, where E denotes expectation and x ranges over the corresponding image samples. We use L_WGAN to define the loss of the discriminator; since the loss function should be minimized, we take the negative of L_WGAN to obtain:
L_D^WGAN(S, S_g, D_w) = E_x∼S_r[D_w(x)] - E_x∼S_g[D_w(x)]
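A numpy sketch of the combined objective with λ = 0.9 follows. In practice D_w is a convolutional critic with clipped weights w trained alongside the generator; here it is a stand-in callable, and the function names are assumptions for illustration.

```python
import numpy as np

def l2_loss(pred_patch, gt_patch):
    """L2 term: squared difference over the missing region's content."""
    return float(np.sum((pred_patch - gt_patch) ** 2))

def wgan_d_loss(d_w, real_patch, fake_patch):
    """Negated WGAN critic estimate: E[D_w(fake)] - E[D_w(real)].
    Minimizing it pushes D_w(real) up and D_w(fake) down."""
    return float(np.mean(d_w(fake_patch)) - np.mean(d_w(real_patch)))

def loss_d(d_w, pred_patch, gt_patch, lam=0.9):
    """Weighted sum of the L2 term and the WGAN term, lam = 0.9."""
    return lam * l2_loss(pred_patch, gt_patch) \
        + (1 - lam) * wgan_d_loss(d_w, gt_patch, pred_patch)
```

The expectations are approximated by plain means over patches; a training loop would alternate critic and generator updates, which is outside this sketch.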
In the method for finely completing the defective optical flow map, the Loss function Loss_G of the VGG-19 network is defined as follows:
Loss_G = arg min E_c(R(M), R(M_g)) + α·E_t(T(R(M)), T(M_g)) + β·γ(M);
where R and T represent the outputs of the content network and the texture network respectively, and α and β are the weights of E_t(T(R(M)), T(M_g)) and γ(M) respectively. E_c(R(M), R(M_g)) represents the overall content constraint, which penalizes the L2 difference between the optimization result and the previous content network prediction, also known as the pixel-level Euclidean distance; E_t(T(R(M)), T(M_g)) represents the global texture constraint. Empirically, α and β are set to 5e-6 to balance the magnitudes of the terms and achieve the best experimental results. γ(M) represents the total variation loss (TV loss), which is used to increase smoothness and therefore sums the squares of the image gradient:
E_c(R(M), R(M_g)) = ‖R(M) - R(M_g)‖₂²
E_t(T(R(M)), T(M_g)) = Σ_i ‖P_i(T(R(M))) - P_nn(i)(T(M_g))‖₂²
γ(M) = Σ_(c,d) [ (M_(c,d+1) - M_(c,d))² + (M_(c+1,d) - M_(c,d))² ]
wherein (c, d) are the coordinates of pixels in the missing region M, P_i(·) extracts the local neural patch at position i, and nn(i) indexes its nearest-neighbor patch in the texture features.
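The content and TV terms can be sketched directly in numpy. The texture term requires VGG-19 feature maps and a nearest-neighbor patch search, so in this sketch it is passed in precomputed (an assumption made purely for illustration):

```python
import numpy as np

def total_variation(m):
    """TV loss: sum of squared forward differences of the image in both
    directions, the squared-gradient smoothness term."""
    dv = m[1:, :] - m[:-1, :]
    dh = m[:, 1:] - m[:, :-1]
    return float(np.sum(dv ** 2) + np.sum(dh ** 2))

def content_loss(r_m, r_mg):
    """Content term: pixel-level squared Euclidean (L2) distance."""
    return float(np.sum((r_m - r_mg) ** 2))

def loss_g(r_m, r_mg, texture_term, m, alpha=5e-6, beta=5e-6):
    """Weighted sum: content + alpha * texture + beta * TV. The texture
    term (a neural-patch distance on deep features) is precomputed."""
    return content_loss(r_m, r_mg) + alpha * texture_term \
        + beta * total_variation(m)
```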
The VGG-19 network structure is designed to produce sharper high-frequency details. We have found that using relu3_1 and relu4_1 together to compute the texture term yields more accurate results than using either layer alone. Meanwhile, since the VGG-19 network is trained for semantic classification, the middle layers of the network are strongly invariant to texture distortion, which helps reconstruct the missing content more accurately. The network in this patent contains three kinds of losses, among which the computation of the perceptual loss is relatively complex. The results show that in practical operation the loss function (with the perceptual loss simplified) is more sensitive and accurate for the simple texture features of optical flow.
The invention also provides electronic equipment applying the method for finely completing the defective light flow graph, which comprises one or more processors, one or more memories, one or more programs and an image acquisition device;
the image acquisition apparatus is configured to acquire a smoke picture I in real time, the one or more programs being stored in the memory, and the one or more programs, when executed by the processor, causing the electronic device to perform a method of fine completion of a defective light flow map as described above.
Advantageous effects:
(1) according to the method for finely completing the defective optical flow map, Wasserstein GAN (WGAN) loss is used to replace the original adversarial loss, and a method combining L2 loss and WGAN loss is used to train the improved GAN network, improving the stability of network training and generating the missing content better;
(2) the method for finely complementing the defect optical flow graph has good complementing effect, high processing speed and great application prospect;
(3) the electronic equipment has the advantages of simple structure, low cost, fine completion of the defective light flow diagram and good application prospect.
Drawings
FIG. 1 is a schematic diagram of three methods described in example 1 and comparative examples 1 to 2;
FIG. 2 is a graph showing a comparison between the completion effects of example 1 and comparative examples 1 to 2;
FIG. 3 is a network structure of a context encoder of the present invention;
FIG. 4 is a flow chart of a method of the present invention;
fig. 5 is a schematic structural diagram of an electronic device according to the present invention.
Detailed Description
The following further describes the embodiments of the present invention with reference to the attached drawings.
Example 1
A method for performing fine completion on a defective light flow graph, the steps of which are shown in fig. 4, specifically are as follows:
(1) the method comprises the following steps of collecting a smoke picture I in real time, processing the smoke picture I to obtain an optical flow image, and detecting and positioning a smoke dynamic shielding area, wherein the method specifically comprises the following steps:
(1.1) preprocessing the smoke picture I after it is collected in real time, wherein preprocessing refers to constructing smoke information data with labeling information, i.e. labeling the smoke area in the smoke picture I; the picture resolution of the smoke picture I is greater than or equal to 1920 × 1080 pixels;
(1.2) inputting the trained LiteFlowNet to obtain an optical flow image of the smoke picture I by taking the preprocessed smoke picture I as input, wherein the input of the LiteFlowNet is two adjacent frames in the preprocessed smoke picture I, and the output of the LiteFlowNet is the optical flow image corresponding to the two adjacent frames;
LiteFlowNet has two sub-networks, NetC and NetE; NetC extracts two multi-scale high-dimensional pyramid features from any input picture pair, and NetE estimates the coarse-to-fine flow field of the smoke; the NetC sub-network is designed as a two-stream network, and the two streams share weights with each other;
the training process of the LiteFlowNet takes a preprocessed smoke picture II as input and the optical flow image generated from the smoke picture II as theoretical output, continuously adjusting the parameters of the LiteFlowNet; the training termination condition is that the maximum iteration number reaches 10000; the smoke picture II is acquired by a camera and comprises at least 30 videos and 1000 smoke image pairs, and the picture resolution of the smoke picture II is greater than or equal to 1920 × 1080 pixels;
(1.3) convolving the optical flow image obtained in the step (1.2) with a LoG filter to obtain a filtered optical flow image, wherein the processing formulas are as follows:
the Gaussian G(r) underlying the LoG operator has the following formula:
G(r) = exp(-r²/(2σ²)), with r² = i² + j²
The LoG filter ∇²G(r) is the second derivative of G(r), formulated as follows:
∇²G(r) = ((r² - 2σ²)/σ⁴) · exp(-r²/(2σ²))
The convolution formula is as follows:
g(i, j) = ∇²G(i, j) * I0(i, j)
wherein i and j represent the abscissa and ordinate of a pixel in the image respectively, I0(i, j) and g(i, j) represent the optical flow image obtained in step (1.2) and the filtered optical flow image respectively, and σ represents the scale (standard deviation) of the Gaussian;
(1.4) searching a zero crossing position in the image obtained in the step (1.3), and further determining the position of an optical flow edge between different objects on the image, namely determining a dynamic occlusion area;
the process of finding the zero crossing position is specifically as follows:
i) taking a part of the image acquired in the step (1.3), and amplifying the part to a pixel level;
ii) in the area of 3X3 with the pixel point X as the center, determining the positive and negative relations of the pixel points of four pairs of upper, lower, left, right and two diagonal lines;
iii) if the positive and negative relations of the gray values of the four pairs of pixel points after being filtered in the step (1.3) are opposite and the absolute value of the gray value difference corresponding to the four pairs of pixel points is less than a threshold value a, the pixel point X is the zero crossing position;
after all zero crossing positions are found, all zero crossing position points are mapped back to the optical flow image obtained in the step (1.2) to obtain an image with the edges of the dynamic occlusion area, the dynamic occlusion area being extracted with a Marr-Hildreth (MH) zero-crossing detector;
(1.5) introducing image semantic information to judge whether the occluded area obtained in the step (1.4) is smoke, i.e. judging based on the smoke gray-scale characteristics: specifically, if the gray value of the occluded area is greater than 220 or less than 80, the area is judged not to be smoke; otherwise, it is smoke;
(1.6) detecting an optical flow generated by dynamic shielding, and mapping the optical flow image obtained in the step (1.2) back to a dynamic shielding area;
(2) extracting an image area B (256 × 256 pixels) from the center of the minimum circumscribed square of the dynamic occlusion area, and inputting it into the trained improved GAN network to obtain a substitute image of the dynamic occlusion area;
the modified GAN network is a GAN network that modifies a loss function of a context network module, the competing loss in the loss function of the context network module of the GAN network being replaced with a WGAN loss in the loss function of the context network module of the modified GAN network;
the calculation formula of the Loss function Loss_D of the context network module of the improved GAN network is as follows:
Loss_D = λ·L_l2(S, S_g, R) + (1 - λ)·L_D^WGAN(S, S_g, D_w);
L_l2(S, S_g, R) = ‖h(R(S), M) - h(S_g, M)‖₂²
L_D^WGAN(S, S_g, D_w) = E_x∼S_r[D_w(x)] - E_x∼S_g[D_w(x)]
wherein λ is a constant, specifically 0.9; R(S) represents the response of the network to the input image S; h(S_g, M) returns the content of S_g within the mask M; E denotes expectation, x ranges over the corresponding image samples, and S_r is the filled image;
the training process of the improved GAN network continuously adjusts the parameters of the improved GAN network, taking an image area A (256 × 256 pixels) of one frame of image in a historical data set as input and the picture of the corresponding dynamic occlusion area of another frame in the historical data set as theoretical output; the termination condition of training is that the similarity between the actual output and the theoretical output reaches 98% or more; the image area A is obtained by locating the smoke dynamic occlusion area in an image of the historical data set and extracting from the center of the minimum circumscribed square of that occlusion area;
(3) the alternative image of the dynamic occlusion region is processed through the VGG-19 network, and texture information is added, and the network structure of the context encoder (modified GAN network → VGG-19 network) of the invention is shown in fig. 3;
the Loss function Loss_G of the VGG-19 network is defined as follows:
Loss_G = arg min E_c(R(M), R(M_g)) + α·E_t(T(R(M)), T(M_g)) + β·γ(M);
where R and T represent the outputs of the content network and the texture network respectively; α and β are the weights of E_t(T(R(M)), T(M_g)) and γ(M) respectively; E_c(R(M), R(M_g)) penalizes the L2 difference between the optimization result and the previous content network prediction; E_t(T(R(M)), T(M_g)) represents the global texture constraint; γ(M) represents the total variation loss;
E_c(R(M), R(M_g)) = ‖R(M) - R(M_g)‖₂²
E_t(T(R(M)), T(M_g)) = Σ_i ‖P_i(T(R(M))) - P_nn(i)(T(M_g))‖₂²
γ(M) = Σ_(c,d) [ (M_(c,d+1) - M_(c,d))² + (M_(c+1,d) - M_(c,d))² ]
wherein (c, d) are the coordinates of pixels in the deleted region M, P_i(·) extracts the local neural patch at position i, and nn(i) indexes its nearest-neighbor patch;
(4) and (4) mapping the image obtained in the step (3) to the optical flow image obtained in the step (1.2) to obtain a complete global optical flow diagram.
Comparative example 1
A method of completing a defective light flow map, which comprises the steps substantially the same as those of example 1, except that the step (2) comprises: continuously deriving the shielding edge, and enabling the shielding area to shrink and approach the center of the shielding area according to the continuous change of the light stream gradient.
Comparative example 2
A method of completing a defective optical flow graph, comprising steps substantially the same as in example 1, except that step (2) comprises: finding the minimum circumscribed square of the occluded area, performing continuous local convolution within the square area, and filling in the optical flow of the occluded area line by line according to the optical flow texture information.
The schematic diagrams of example 1 and comparative examples 1-2, together with the results of processing the same image, are shown in fig. 1 (continuous derivation corresponds to comparative example 1, local convolution to comparative example 2, and context encoding to example 1). A comparison of the completion results on the original image for example 1, comparative examples 1-2, and the plain GAN method (as described in the background art) is shown in fig. 2. As can be seen from figs. 1 and 2, compared with the prior art and comparative examples 1-2, the method of the present invention significantly improves the completion result; that is, the method for finely completing a defective optical flow graph has a good completion effect, fast processing speed, and a promising application outlook.
Example 2
An electronic device, as shown in fig. 5, includes one or more processors, one or more memories, one or more programs, and an image acquisition apparatus;
The image acquisition device is used to acquire a smoke picture I in real time; the one or more programs are stored in the memory and, when executed by the processor, cause the electronic device to execute the same method for finely completing a defective optical flow graph as in example 1.
Verification shows that the electronic device has a simple structure and low cost, can finely complete a defective optical flow graph, and has good application prospects.
Although specific embodiments of the present invention have been described above, it will be appreciated by those skilled in the art that these embodiments are merely illustrative and various changes or modifications may be made without departing from the principles and spirit of the invention.

Claims (10)

1. A method for finely completing a defective optical flow graph, characterized by comprising the following steps:
(1) collecting a smoke picture I in real time, processing the smoke picture I to obtain an optical flow image, and detecting and locating the smoke dynamic occlusion area;
(2) extracting an image area B from the center of the minimum circumscribed square of the dynamic occlusion area and inputting the image area B into the trained improved GAN network to obtain a substitute image of the dynamic occlusion area;
the improved GAN network is a GAN network in which the loss function of the context network module is improved: the adversarial loss in the loss function of the context network module of the GAN network is replaced with a WGAN loss in the improved GAN network;
the improved GAN network is trained by taking an image area A of one frame of image in a historical data set as input and the picture of the corresponding dynamic occlusion area of another frame of image in the historical data set as theoretical output, continuously adjusting the parameters of the network; training terminates when the similarity between the actual output and the theoretical output reaches 98% or more;
(3) processing the substitute image of the dynamic occlusion area through a VGG-19 network to add texture information;
(4) mapping the image obtained in step (3) onto the optical flow image obtained in step (1) to obtain a complete global optical flow graph.
2. The method for finely completing a defective optical flow graph according to claim 1, wherein a set of the historical data set comprises adjacent multi-frame images containing smoke; and the image area A is obtained by processing the images of the historical data set to locate the smoke dynamic occlusion area and extracting the area from the center of the minimum circumscribed square of the dynamic occlusion area.
3. The method for finely completing a defective optical flow graph according to claim 2, wherein image area B is of the same size as image area A, namely 256 × 256 pixels.
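The extraction of a fixed-size image area from the center of the minimum circumscribed square of the occlusion mask can be sketched as follows. This is an assumption-laden illustration: the function names and the clamping-to-image-bounds behavior are not from the patent.

```python
def min_square_center(mask):
    # Center of the minimum axis-aligned bounding box of the True pixels;
    # the minimum circumscribed square shares this center point.
    ys = [y for y, row in enumerate(mask) for v in row if v]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return (min(ys) + max(ys)) // 2, (min(xs) + max(xs)) // 2

def crop_centered(img, center, size):
    # Crop a size x size window centered at `center`, clamped so the
    # window stays inside the image bounds (assumed behavior).
    cy, cx = center
    h, w = len(img), len(img[0])
    top = max(0, min(cy - size // 2, h - size))
    left = max(0, min(cx - size // 2, w - size))
    return [row[left:left + size] for row in img[top:top + size]]
```

In the patent's setting `size` would be 256; a tiny size is used below only to keep the example checkable by hand.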
4. The method for finely completing a defective optical flow graph according to claim 1, wherein step (1) is specifically as follows:
(1.1) collecting the smoke picture I in real time and preprocessing it, wherein the preprocessing refers to constructing smoke information data with annotation information;
(1.2) inputting the preprocessed smoke picture I into the trained LiteFlowNet to obtain the optical flow image of the smoke picture I;
the LiteFlowNet is trained by taking a preprocessed smoke picture II as input and the optical flow image generated from the smoke picture II as theoretical output, continuously adjusting the parameters of the LiteFlowNet; training terminates when the maximum number of iterations P is reached;
(1.3) convolving the optical flow image obtained in step (1.2) with a LoG filter to obtain a filtered optical flow image;
(1.4) searching for zero crossing positions in the image obtained in step (1.3) to determine the positions of the optical flow edges between different objects in the image, i.e. the dynamic occlusion area;
(1.5) introducing image semantic information to judge whether the occluded area obtained in step (1.4) is smoke;
(1.6) detecting the optical flow generated by the dynamic occlusion and mapping the optical flow image obtained in step (1.2) back onto the dynamic occlusion area.
5. The method of claim 4, wherein the smoke picture II is captured by a camera and comprises at least 30 videos and 1000 smoke image pairs;
the LiteFlowNet has two sub-networks, NetC and NetE: NetC extracts two multi-scale high-dimensional pyramid features from any input picture pair, and NetE estimates the coarse-to-fine flow field of the smoke; the NetC sub-network is designed as a two-stream network whose two streams share weights.
6. The method of claim 4, wherein the Gaussian function G(r) has the following formula:
G(r) = e^(−r²/(2σ²));
the LoG filter ∇²G(r) is the second derivative (Laplacian) of G(r), formulated as follows:
∇²G(r) = ((r² − 2σ²)/σ⁴)·e^(−r²/(2σ²));
the convolution formula is as follows:
g(i, j) = ∇²G(r) ∗ I₀(i, j);
wherein i and j represent the abscissa and ordinate, respectively, of a pixel in the image, r² = i² + j², I₀(i, j) and g(i, j) represent the optical flow image obtained in step (1.2) and the filtered optical flow image, respectively, and σ represents the standard deviation (σ² the variance);
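LoG filtering of the kind described in this claim can be sketched in pure Python: sample the Laplacian of Gaussian on a small grid and convolve it with the image. The 5 × 5 kernel size and zero padding are illustrative choices, not values from the patent.

```python
import math

def log_kernel(size, sigma):
    # Sample the 2-D Laplacian of Gaussian
    # LoG(r) = ((r^2 - 2*sigma^2) / sigma^4) * exp(-r^2 / (2*sigma^2))
    # on a size x size grid centered at the origin.
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = float(x * x + y * y)
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4
                       * math.exp(-r2 / (2 * sigma ** 2)))
        kernel.append(row)
    return kernel

def convolve2d(img, kernel):
    # 'Same'-size convolution with zero padding outside the image.
    h, w = len(img), len(img[0])
    kh = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for dy in range(-kh, kh + 1):
                for dx in range(-kh, kh + 1):
                    y, x = i + dy, j + dx
                    if 0 <= y < h and 0 <= x < w:
                        s += img[y][x] * kernel[kh + dy][kh + dx]
            out[i][j] = s
    return out
```

Because the LoG kernel is radially symmetric, convolution and correlation coincide here.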
the process of searching the zero crossing position is specifically as follows:
i) taking a part of the image obtained in step (1.3) and magnifying it to the pixel level;
ii) in the 3 × 3 area centered on pixel point X, determining the sign relations of the four pixel pairs: up-down, left-right, and the two diagonals;
and iii) if, for the four pixel pairs, the signs of the gray values filtered in step (1.3) are opposite within each pair and the absolute gray value difference of the pairs is less than a threshold a, then pixel point X is a zero crossing position.
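Steps i)-iii) can be sketched as a per-pixel test on the four pairs of the 3 × 3 neighborhood. The sketch follows the claim as written (opposite signs and absolute difference below threshold a); note that the classical Marr-Hildreth detector usually requires the difference to exceed a threshold instead. Names are illustrative.

```python
def is_zero_crossing(img, y, x, a):
    # Four pixel pairs in the 3x3 neighborhood of X = (y, x):
    # vertical, horizontal, and the two diagonals.
    pairs = [((y - 1, x), (y + 1, x)),
             ((y, x - 1), (y, x + 1)),
             ((y - 1, x - 1), (y + 1, x + 1)),
             ((y - 1, x + 1), (y + 1, x - 1))]
    for (y1, x1), (y2, x2) in pairs:
        v1, v2 = img[y1][x1], img[y2][x2]
        # Opposite signs within the pair and |difference| below the
        # threshold a, as stated in the claim.
        if v1 * v2 < 0 and abs(v1 - v2) < a:
            return True
    return False
```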
7. The method for finely completing a defective optical flow graph according to claim 6, wherein after all zero crossing positions are found, all zero crossing points are mapped back onto the optical flow image obtained in step (1.2) to obtain an image bearing the edges of the occlusion area;
the dynamic occlusion areas are extracted with a Marr-Hildreth (MH) zero crossing detector;
introducing image semantic information to judge whether the occluded area obtained in step (1.4) is smoke refers to judging based on the smoke gray-scale characteristics: if the gray value of the occluded area is greater than 220 or less than 80, the area is not smoke; otherwise, it is smoke;
the LiteFlowNet takes two adjacent frames of the preprocessed smoke picture I as input and outputs the optical flow images corresponding to the two frames;
P is 10000;
the picture resolution of the smoke picture I and the smoke picture II is greater than or equal to 1920 × 1080 pixels.
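The gray-level smoke judgment of this claim (not smoke if the gray value is above 220 or below 80) reduces to a one-line test; the helper computing the region's mean gray value is an illustrative assumption, not part of the patent.

```python
def is_smoke_region(mean_gray):
    # Claim 7 heuristic: a region whose gray value is above 220 or
    # below 80 is judged not to be smoke; otherwise it is smoke.
    return 80 <= mean_gray <= 220

def region_mean_gray(img, mask):
    # Illustrative helper (not from the patent): mean gray value of the
    # pixels selected by a boolean mask.
    vals = [img[y][x] for y, row in enumerate(mask)
            for x, v in enumerate(row) if v]
    return sum(vals) / len(vals)
```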
8. The method according to claim 1, wherein the loss function Loss_D of the context network module of the improved GAN network is calculated as follows:
Loss_D = λL_l2(S, S_g, R) + (1 − λ)L_DWGAN(S, S_g, D_w);
L_l2(S, S_g, R) = ‖R(S) − h(S_g, M)‖²₂;
L_DWGAN(S, S_g, D_w) = E[D_w(h(S_g, M))] − E[D_w(R(S))];
wherein λ is a constant, specifically 0.9; R(S) denotes the response of the network to the input image S; h(S_g, M) denotes the content of S_g within the mask M; D_w is the WGAN critic with weights w, which returns the Wasserstein distance; and E denotes the expectation, taken over each filled pixel of the image.
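The weighted combination Loss_D = λ·L_l2 + (1 − λ)·L_DWGAN can be illustrated with a minimal Python sketch. The exact forms of the two terms appear in the patent only as images, so a plain squared-L2 reconstruction term and the standard Wasserstein critic objective (mean critic score on real minus fake) are assumed here.

```python
def l2_loss(pred, target):
    # Assumed reconstruction term: squared L2 distance between the
    # generator output R(S) and the ground-truth content h(Sg, M).
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def wgan_critic_loss(critic_real, critic_fake):
    # Standard WGAN critic objective: mean critic score on real samples
    # minus mean critic score on generated samples.
    return (sum(critic_real) / len(critic_real)
            - sum(critic_fake) / len(critic_fake))

def loss_d(pred, target, critic_real, critic_fake, lam=0.9):
    # Loss_D = lambda * L_l2 + (1 - lambda) * L_DWGAN, lambda = 0.9 per claim 8.
    return (lam * l2_loss(pred, target)
            + (1 - lam) * wgan_critic_loss(critic_real, critic_fake))
```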
9. The method of claim 1, wherein the Loss function Loss _ G of the VGG-19 network is defined as follows:
Loss_G = arg min E_c(R(M), R(M_g)) + αE_t(T(R(M)), T(M_g)) + βγ(M);
where R and T represent the outputs of the content network and the texture network, respectively; α and β are the weights of E_t(T(R(M)), T(M_g)) and γ(M), respectively; E_c(R(M), R(M_g)) penalizes the L2 difference between the content network prediction and the ground truth; E_t(T(R(M)), T(M_g)) represents the global texture constraint; and γ(M) represents the total variation loss:
E_c(R(M), R(M_g)) = ‖R(M) − R(M_g)‖²₂;
E_t(T(R(M)), T(M_g)) = (1/|Ω|)Σ_{i∈Ω}‖P_i T(R(M)) − P_{nn(i)} T(M_g)‖²₂;
γ(M) = Σ_{(c,d)}((M_{c,d+1} − M_{c,d})² + (M_{c+1,d} − M_{c,d})²);
wherein (c, d) are the coordinates of the deleted region M, P_i is a local patch of the texture feature map centered at position i, nn(i) is its nearest-neighbor patch, and Ω is the set of patch positions inside the region.
10. An electronic device applying the method for finely completing a defective optical flow graph according to any one of claims 1 to 9, comprising one or more processors, one or more memories, one or more programs, and an image acquisition device;
the image acquisition device is configured to acquire a smoke picture I in real time; the one or more programs are stored in the memory and, when executed by the processor, cause the electronic device to perform the method for finely completing a defective optical flow graph according to any one of claims 1 to 9.
CN201911302248.2A 2019-12-17 2019-12-17 Method for finely complementing defective light flow graph and application thereof Withdrawn CN111127355A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911302248.2A CN111127355A (en) 2019-12-17 2019-12-17 Method for finely complementing defective light flow graph and application thereof


Publications (1)

Publication Number Publication Date
CN111127355A true CN111127355A (en) 2020-05-08

Family

ID=70498269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911302248.2A Withdrawn CN111127355A (en) 2019-12-17 2019-12-17 Method for finely complementing defective light flow graph and application thereof

Country Status (1)

Country Link
CN (1) CN111127355A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149617A (en) * 2020-10-13 2020-12-29 中国工程物理研究院计算机应用研究所 Pulse waveform denoising method based on deep learning
CN116438568A (en) * 2022-05-23 2023-07-14 上海玄戒技术有限公司 Position difference map generation method and device, electronic equipment, chip and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180293496A1 (en) * 2017-04-06 2018-10-11 Pixar Denoising monte carlo renderings using progressive neural networks
CN109360159A (en) * 2018-09-07 2019-02-19 华南理工大学 A kind of image completion method based on generation confrontation network model
CN109377448A (en) * 2018-05-20 2019-02-22 北京工业大学 A kind of facial image restorative procedure based on generation confrontation network
US20190094875A1 (en) * 2017-09-28 2019-03-28 Nec Laboratories America, Inc. Generating occlusion-aware bird eye view representations of complex road scenes
CN109977790A (en) * 2019-03-04 2019-07-05 浙江工业大学 A kind of video smoke detection and recognition methods based on transfer learning
CN110222628A (en) * 2019-06-03 2019-09-10 电子科技大学 A kind of face restorative procedure based on production confrontation network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZEYANG MI ET AL.: "Sniffer-Net: quantitative evaluation of smoke in the wild based on spatial-temporal motion spectrum", Neural Computing and Applications (2020) 32:9165-9180, https://doi.org/10.1007/s00521-019-04426-z *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200508