CN112446436A - Anti-blur unmanned vehicle multi-target tracking method based on generative adversarial network - Google Patents

Anti-blur unmanned vehicle multi-target tracking method based on generative adversarial network

Info

Publication number
CN112446436A
CN112446436A CN202011460523.6A CN202011460523A CN112446436A CN 112446436 A CN112446436 A CN 112446436A CN 202011460523 A CN202011460523 A CN 202011460523A CN 112446436 A CN112446436 A CN 112446436A
Authority
CN
China
Prior art keywords
image
network
target
loss function
fuzzy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011460523.6A
Other languages
Chinese (zh)
Inventor
梁军
马皓月
刘创
张婳
张智源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202011460523.6A priority Critical patent/CN112446436A/en
Publication of CN112446436A publication Critical patent/CN112446436A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/10Image enhancement or restoration using non-spatial domain filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/73Deblurring; Sharpening
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle
    • G06T2207/30256Lane; Road marking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an anti-blur unmanned vehicle multi-target tracking method based on a generative adversarial network. Aimed at situations in which the images collected by the camera are blurred by vehicle body shake and similar disturbances, the method processes the blurred video sequence collected by the unmanned vehicle and performs multi-target tracking on it. The method is simple to implement and flexible in its means, effectively alleviates the degradation of multi-target tracking caused by shake, and improves the accuracy of multi-target tracking.

Description

Anti-blur unmanned vehicle multi-target tracking method based on generative adversarial network
Technical Field
The invention relates to the technical field of computer networks, and in particular to an anti-blur unmanned vehicle multi-target tracking method based on a generative adversarial network.
Background
Tracking targets while an unmanned vehicle is moving is a challenging problem. A good tracking algorithm must detect and track obstacles ahead of the vehicle, especially pedestrians and vehicles, and accurately judge their behavioral intent so that reasonable path planning can be performed. Multi-target tracking detects moving targets in a video sequence, associates the targets across different frames one by one, produces the motion trajectories of the different targets, predicts their short-term motion trends, and judges their behavioral intent. The targets may be arbitrary: pedestrians, vehicles, animals, and so on. The multi-target tracking results can then be used for the obstacle-avoidance strategy and dynamic path planning of the unmanned vehicle.
Current multi-target tracking algorithms rarely consider whether the acquired image is sharp. In practice, an uneven road surface makes the vehicle body shake, and the pictures captured by the camera suffer motion blur. Blurred images greatly degrade tracking performance, so dangerous targets may not be found in time and detected targets cannot be tracked reliably, which is critical for the stability and safety of the unmanned vehicle. How to handle motion-induced image blur is therefore an urgent problem in unmanned vehicle multi-target tracking.
Disclosure of Invention
The invention aims to provide an anti-blur unmanned vehicle multi-target tracking method based on a generative adversarial network, addressing the problem of image blur in existing unmanned vehicle multi-target tracking.
The purpose of the invention is achieved by the following technical scheme: an anti-blur unmanned vehicle multi-target tracking method based on a generative adversarial network, comprising the following steps:
Step one: acquire a road-condition video sequence using the on-board camera of the unmanned vehicle.
Step two: use a blurred-image detection method to determine whether each image in the video sequence acquired in step one is blurred; if the image is sharp, go directly to step four, and if it is blurred, go to step three.
Step three: apply a deblurring generative adversarial network to the blurred image from step two to remove the blur and make the image sharp.
Step four: detect the targets appearing in each frame of the sharp images obtained in steps two and three using YOLO (You Only Look Once, a single-neural-network target detection algorithm), and determine the main targets to be tracked by the multi-target tracking algorithm.
Step five: for the main targets from step four, perform data association using a re-identification model and the Hungarian algorithm, associating the currently detected targets with the historical target tracks to obtain complete target trajectories.
Step six: use a Kalman filter as the tracker to estimate the position of the current target in the next frame as prior information for target tracking, and fuse the predicted position with the detector's detection as the output to smooth the trajectory and complete the target tracking task.
Further, step two is realized by the following sub-steps:
(2.1) Graying and Laplacian filtering: convert the RGB color image into a grayscale image and filter it with the Laplacian operator to preprocess the image.
(2.2) Variance calculation: the more severe the blur, the lower the variance of the filtered image, while a sharp image has a higher variance; when the variance exceeds the threshold 200, the image is judged to be non-blurred.
(2.3) Preventing misjudgment using the previous frame: if the previous frame is blurred, the current image is judged non-blurred when the variance ratio of the current frame to the previous frame exceeds the threshold 5; if the previous frame is not blurred, the current image is judged non-blurred when that variance ratio exceeds the threshold 0.3. Otherwise, the image is blurred.
Further, step three is realized by the following sub-steps:
(3.1) Construct the deblurring generative adversarial network. Constructing the generator network: the network structure is designed on the basis of a super-resolution reconstruction deep network; iterative residual fitting is simulated through convolution operations to approximate the sharp image. The improved neural network consists of 2 convolutional layers and 9 blocks, where each block consists of 2 series-connected 3 x 3 convolutional layers, a normalization layer and a linear rectification unit. Constructing the discriminator network: the discriminator is designed with the patch-based discriminator used in the conditional-GAN image-translation algorithm; this patch discriminator turns the GAN discriminator into a fully convolutional network that maps the input to an N x N matrix X, where the value of X_ij represents the probability that the corresponding patch is a real sample, and the average of the matrix is the final output of the discriminator.
(3.2) Determine the loss function of the deblurring generative adversarial network. The total loss L_all comprises four parts: the conditional GAN loss L_cGAN, the sum-of-squared-error loss L_2, the structural similarity loss L_ssim and the perceptual loss L_perceptual, where k_n (n = 1, 2, 3) are the corresponding hyperparameters.
L_all = L_cGAN + k_1 · L_2 + k_2 · L_ssim + k_3 · L_perceptual
Conditional GAN loss function:
L_cGAN = E_{x,y}[log D(x, y)] + E_{x,z}[log(1 − D(x, G(x, z)))]
where G denotes the generator, D the discriminator, E(·) the expected value over the corresponding distribution, x the blurred image, y the sharp image, z the noise and P_data(x) the sample distribution; the cross-entropy loss is used as the conditional GAN loss function.
Structural similarity loss function SSIM(x, y):
SSIM(x, y) = [(2·μ_x·μ_y + C_1)(2·σ_xy + C_2)] / [(μ_x² + μ_y² + C_1)(σ_x² + σ_y² + C_2)]
SSIM(x, y) can be viewed as the product of two terms: the image luminance similarity l(x, y) and the image contrast similarity c(x, y).
l(x, y) = (2·μ_x·μ_y + C_1) / (μ_x² + μ_y² + C_1)
c(x, y) = (2·σ_xy + C_2) / (σ_x² + σ_y² + C_2)
where μ_x and σ_x² denote the mean and variance of the deblurred image, μ_y and σ_y² the mean and variance of the original sharp image, σ_xy their covariance, and C_1 and C_2 are constants used for numerical stability. For a three-channel RGB image, the channel values are averaged before the local mean and variance are computed. The corresponding structural loss L_ssim is:
L_ssim = 1 − SSIM(x, y)
Perceptual loss function L_perceptual:
L_perceptual = (1 / (C_j · W_j · H_j)) · || φ_j(I^S) − φ_j(G(I^B)) ||²
C_j, W_j and H_j are the number of channels, the width and the height of the j-th feature map of the network, φ_j is the output of the network at the j-th convolutional layer, G(I^B) is the generator output for the input blurred image, I^S is the corresponding sharp image, and I^B is the corresponding blurred image.
(3.3) Train the deblurring generative adversarial network. During training, the network convolution kernels are 3 x 3, the batch size is 8, the initial learning rates of the generator and the discriminator are both 0.01, and training is carried out on two GTX 1080 Ti graphics cards with the Adam optimizer. The network training process is as follows:
Initialization: set the initial learning rates ρ_G and ρ_D to 0.01 and the loss function weights k_n (n = 1, 2, 3).
Update the generator G: sample N pairs from the training set, (x, y) = {(x_1, y_1), …, (x_N, y_N)}.
Update the parameters of G by gradient descent on the total loss:
θ_G ← θ_G − ρ_G · ∇_{θ_G} (1/N) · Σ_{i=1}^{N} L_all(x_i, y_i)
Train the discriminator D several times, updating the parameters of D by gradient ascent on the adversarial objective:
θ_D ← θ_D + ρ_D · ∇_{θ_D} (1/N) · Σ_{i=1}^{N} [log D(x_i, y_i) + log(1 − D(x_i, G(x_i)))]
After training, the trained generative adversarial network model is obtained.
(3.4) Deblur the blurred image: input the blurred image into the trained generative adversarial network to obtain a deblurred sharp image.
The method has the advantage that it copes with camera images blurred by vehicle body shake, processes the blurred video sequences collected by the unmanned vehicle and performs multi-target tracking on them; it is simple to implement, flexible in its means, and effectively improves the accuracy of multi-target tracking.
Drawings
FIG. 1 is a flow chart of the anti-blur unmanned vehicle multi-target tracking method based on a generative adversarial network;
FIG. 2 is a flow chart of the blurred-image detection method of step two.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings.
The invention relates to an anti-blur unmanned vehicle multi-target tracking method based on a generative adversarial network, which comprises the following steps:
Step one: acquire a road-condition video sequence with the on-board camera. While driving, the unmanned vehicle must capture information about its surroundings, especially about pedestrians and vehicles in the environment. In this method a camera sensor is selected, and multi-target tracking is performed on the image sequence acquired by the camera.
Step two: and detecting whether the image in the video sequence is a blurred image or not by using a blurred image detection method, if so, directly performing the step four, and if so, performing the step three. In an actual vehicle driving scene, various interferences exist, for example, a captured image is blurred due to unevenness of a road, the captured image is polluted due to rain and snow weather, and before multi-target tracking, the image needs to be subjected to blur detection, that is, blurred image detection. This is one of the keys of the present invention, and the flow chart is shown in fig. 2. The detection of the blurred image mainly depends on the variance of the image, the image is clearer when the image variance is larger, and the image is blurred when the image variance is smaller.
This step is achieved by the following substeps:
(2.1) Graying and Laplacian filtering: the RGB color image is converted into a grayscale image and filtered with a 3 x 3 Laplacian operator to preprocess the image.
(2.2) Variance calculation: the more severe the blur, the lower the variance of the filtered image, while a sharp image has a higher variance; when the variance exceeds the threshold 200, the image is judged to be non-blurred.
(2.3) Preventing misjudgment using the previous frame: if the previous frame is blurred, the current image is judged non-blurred when the variance ratio of the current frame to the previous frame exceeds the threshold 5; if the previous frame is not blurred, the current image is judged non-blurred when that variance ratio exceeds the threshold 0.3. Otherwise, the image is blurred.
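For illustration, the following is a minimal Python/OpenCV sketch of this variance-based decision rule; the thresholds (200, 5, 0.3) come from the description above, while the function and variable names are assumptions.

```python
import cv2
import numpy as np

# Thresholds taken from steps (2.2)-(2.3); everything else is illustrative.
VARIANCE_THRESHOLD = 200.0    # absolute variance threshold for a sharp image
RATIO_IF_PREV_BLURRED = 5.0   # ratio threshold when the previous frame was blurred
RATIO_IF_PREV_SHARP = 0.3     # ratio threshold when the previous frame was sharp

def laplacian_variance(frame_bgr: np.ndarray) -> float:
    """Step (2.1): grayscale conversion, 3x3 Laplacian filtering, then the variance."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return float(cv2.Laplacian(gray, cv2.CV_64F, ksize=3).var())

def is_blurred(curr_var: float, prev_var: float, prev_blurred: bool) -> bool:
    """Steps (2.2)-(2.3): absolute threshold plus the inter-frame variance ratio check."""
    if curr_var > VARIANCE_THRESHOLD:
        return False
    ratio = curr_var / max(prev_var, 1e-6)
    threshold = RATIO_IF_PREV_BLURRED if prev_blurred else RATIO_IF_PREV_SHARP
    return not ratio > threshold
```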
Step three: and removing the blur of the blurred image by using a deblurring generation countermeasure method to make the blurred image clear. Step three is one of the keys of the present invention. Clear images are generated mainly by generating games against the network.
This step is achieved by the following sub-steps:
and (3.1) constructing a defuzzification generation countermeasure network. Constructing a generator network: the method comprises the steps of designing a network structure based on VDSR (Very Deep network for Super-Resolution reconstruction depth network), simulating iterative fitting residual errors through convolution operation, and approximating a clear image so as to be more suitable for an image deblurring task. The improved neural network consists of 2 convolutional layers and 9 blocks, and each block consists of 2 3 × 3 convolutional layers connected in series, an InstanceNorm normalization layer and a leak activation layer. Constructing a countermeasure network: the design of the countermeasure network is realized by using PatchGAN in pix2pix, the PatchGAN converts the GAN discriminator into a full convolution network, and the input is mapped into an NxN matrix X, XijThe value of (d) represents the probability that each matrix is a true sample, and the average value is the final output of the discriminator.
(3.2) Determine the loss function of the deblurring generative adversarial network. The total loss comprises four parts: the conditional GAN loss L_cGAN, the sum-of-squared-error loss L_2, the structural similarity loss L_ssim and the perceptual loss L_perceptual, where k_n (n = 1, 2, 3) are the corresponding hyperparameters.
L_all = L_cGAN + k_1 · L_2 + k_2 · L_ssim + k_3 · L_perceptual
The choice of the loss function is motivated by four considerations. (1) In common image-to-image tasks such as image deblurring and style transfer, the loss function is usually the sum-of-squared-error loss L_2. (2) The sum-of-squared-error loss captures the low-frequency information of the image well, so the reconstruction matches the ground truth at the global, macroscopic level, but high-frequency information, i.e. local texture and detail, is distorted; the structural similarity loss L_ssim handles this problem well and is often used to measure local texture similarity. (3) The perceptual loss L_perceptual, also used in image segmentation and style transfer, takes an intermediate layer of a pre-trained model as high-level image features and computes the Euclidean distance between the generated image and the real image at the feature level as a reconstruction loss on image content. (4) The conditional GAN loss L_cGAN is required for adversarial training of the network.
Conditional GAN loss function:
L_cGAN = E_{x,y}[log D(x, y)] + E_{x,z}[log(1 − D(x, G(x, z)))]
where G denotes the generator and D the discriminator; the cross-entropy loss is used as the conditional GAN loss function.
Structural similarity loss function:
SSIM(x, y) = [(2·μ_x·μ_y + C_1)(2·σ_xy + C_2)] / [(μ_x² + μ_y² + C_1)(σ_x² + σ_y² + C_2)]
SSIM(x, y) can be viewed as the product of two terms: the image luminance similarity l(x, y) and the image contrast similarity c(x, y).
l(x, y) = (2·μ_x·μ_y + C_1) / (μ_x² + μ_y² + C_1)
c(x, y) = (2·σ_xy + C_2) / (σ_x² + σ_y² + C_2)
where μ_x and σ_x² denote the mean and variance of the deblurred image, μ_y and σ_y² the mean and variance of the original sharp image, and σ_xy their covariance; C_1 and C_2 are constants used for numerical stability. For a three-channel RGB image, the channel values are averaged before the local mean and variance are computed. The corresponding structural loss is:
L_ssim = 1 − SSIM(x, y)
Perceptual loss function:
L_perceptual = (1 / (C_j · W_j · H_j)) · || φ_j(I^S) − φ_j(G(I^B)) ||²
C_j, W_j and H_j are the number of channels, the width and the height of the j-th feature map of the network, φ_j is the output of the network at the j-th convolutional layer, G(I^B) is the generator output for the input blurred image, and I^S is the corresponding sharp image.
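As an illustration, a hedged PyTorch sketch of how these four terms could be combined into L_all is given below; the VGG-16 feature layer, the placeholder weights k_1–k_3 and the use of the third-party pytorch_msssim package for the SSIM term are all assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16
from pytorch_msssim import ssim  # third-party SSIM implementation, used here as a stand-in

# Pre-trained VGG-16 features up to an intermediate conv layer (layer index is an assumption).
vgg_features = vgg16(pretrained=True).features[:16].eval()
for p in vgg_features.parameters():
    p.requires_grad = False

def generator_loss(discriminator, blurred, restored, sharp, k1=100.0, k2=1.0, k3=1.0):
    """Total generator loss L_all = L_cGAN + k1*L2 + k2*L_ssim + k3*L_perceptual.
    The weight values k1, k2, k3 here are placeholders, not values from the patent."""
    # Conditional adversarial term: the discriminator sees (blurred, restored) pairs.
    d_fake = discriminator(torch.cat([blurred, restored], dim=1))
    l_cgan = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    # Sum-of-squared-error term.
    l2 = F.mse_loss(restored, sharp)
    # Structural similarity term: L_ssim = 1 - SSIM.
    l_ssim = 1.0 - ssim(restored, sharp, data_range=1.0)
    # Perceptual term: feature-space distance on a pre-trained network
    # (input normalization omitted for brevity).
    l_perc = F.mse_loss(vgg_features(restored), vgg_features(sharp))
    return l_cgan + k1 * l2 + k2 * l_ssim + k3 * l_perc
```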
(3.3) Train the deblurring generative adversarial network. During training, the network convolution kernels are 3 x 3, the batch size is 8, the initial learning rates of the generator and the discriminator are both 0.01, and training is carried out on two GTX 1080 Ti graphics cards with the Adam optimizer. The network training process is as follows:
Initialization: set the initial learning rates ρ_G and ρ_D to 0.01 and the loss function weights k_n (n = 1, 2, 3).
Update the generator G: sample N pairs from the training set, (x, y) = {(x_1, y_1), …, (x_N, y_N)}.
Update the parameters of G by gradient descent on the total loss:
θ_G ← θ_G − ρ_G · ∇_{θ_G} (1/N) · Σ_{i=1}^{N} L_all(x_i, y_i)
Train the discriminator D several times, updating the parameters of D by gradient ascent on the adversarial objective:
θ_D ← θ_D + ρ_D · ∇_{θ_D} (1/N) · Σ_{i=1}^{N} [log D(x_i, y_i) + log(1 − D(x_i, G(x_i)))]
A trained generative adversarial network model is obtained after training.
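A compact PyTorch sketch of this alternating update is shown below for illustration, reusing the generator_loss function sketched above; the data loader, the number of discriminator steps per generator step and the number of epochs are assumptions.

```python
import torch

def train_deblur_gan(generator, discriminator, loader, epochs=100, d_steps=2, lr=0.01):
    """Alternating Adam updates for G and D (epochs and d_steps are placeholders)."""
    opt_g = torch.optim.Adam(generator.parameters(), lr=lr)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr)
    bce = torch.nn.BCELoss()
    for _ in range(epochs):
        for blurred, sharp in loader:  # batches of (blurred, sharp) image pairs
            # --- update the generator G on the total loss L_all ---
            restored = generator(blurred)
            opt_g.zero_grad()
            generator_loss(discriminator, blurred, restored, sharp).backward()
            opt_g.step()
            # --- update the discriminator D several times on real vs. generated pairs ---
            for _ in range(d_steps):
                opt_d.zero_grad()
                d_real = discriminator(torch.cat([blurred, sharp], dim=1))
                d_fake = discriminator(torch.cat([blurred, restored.detach()], dim=1))
                loss_d = (bce(d_real, torch.ones_like(d_real))
                          + bce(d_fake, torch.zeros_like(d_fake)))
                loss_d.backward()
                opt_d.step()
```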
(3.4) Deblur the blurred image: input the blurred image into the trained generative adversarial network to obtain a deblurred sharp image.
Step four: and detecting the targets appearing in each frame of image by using a YOLO algorithm, and determining the main targets to be tracked by multi-target tracking. The YOLO algorithm gives consideration to efficiency and accuracy, and can realize real-time accurate detection of the target.
This step is achieved by the following sub-steps:
and (4.1) loading a base model pre-trained by a YOLO algorithm. The YOLO algorithm is a representative one-stage target detection algorithm, the target detection is regarded as a simple regression problem, a target frame and a target category are regressed from pixel points, and the frame and the category of the target are obtained by using only one simple convolutional neural network. The convolution neural network of the YOLO algorithm mainly comprises the following steps: dividing the image into a 7 x 7 grid; if an object center falls within a grid, the grid is used to detect the class of the object. The convolutional neural network is composed of 24 convolutional layers and 2 fully-connected layers. In the method, the darknet model with trained weights is first downloaded from YOLO Real-Time Object Detection (pjredbie. com).
(4.2) Feed each sharp frame into the YOLO algorithm in sequence to obtain the target objects in the image sequence, realizing the initial target recognition. YOLO is trained mainly on the COCO dataset, which has 80 classes; the main labels include car, bird, cat, dog, and so on.
(4.3) Filter the detections by label, keeping pedestrians and vehicles as the targets of multi-target tracking. The method focuses on pedestrian and vehicle targets, so the classes car, bus and person are kept as tracking targets.
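For illustration, a small Python sketch of this detect-and-filter step follows; it uses the ultralytics YOLO package as a stand-in for the darknet model named above, and the model file name and confidence threshold are assumptions.

```python
from ultralytics import YOLO  # stand-in detector API, not the original darknet pipeline

TRACKED_CLASSES = {"person", "car", "bus"}  # classes kept for multi-target tracking

model = YOLO("yolov8n.pt")  # hypothetical COCO-pretrained model file

def detect_targets(frame):
    """Run detection on one sharp frame and keep only pedestrian/vehicle boxes."""
    detections = []
    for result in model(frame, verbose=False):
        for box in result.boxes:
            label = model.names[int(box.cls)]
            if label in TRACKED_CLASSES and float(box.conf) > 0.5:  # threshold is an assumption
                x1, y1, x2, y2 = map(float, box.xyxy[0])
                detections.append((label, float(box.conf), (x1, y1, x2, y2)))
    return detections
```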
Step five: and (3) realizing data association by using a re-recognition model and a Hungarian algorithm. And associating the currently detected target with the historical target track to obtain a complete target track. The step and the step six are iterated circularly continuously until the image sequence is finished.
This step is achieved by the following sub-steps:
and (5.1) loading a pre-trained re-recognition model, and distinguishing different pedestrians or vehicles in the detected target. In the method, the re-recognition mainly comprises pedestrian re-recognition and vehicle re-recognition, i.e. recognizing the same object (pedestrian or vehicle) in different frames of the video sequence. The pre-trained re-recognition model used in the method is obtained by training based on a large-scale ReiD data set, and a residual error network with 2 convolutional layers and 6 residual error blocks is constructed based on the pre-trained network to extract the appearance characteristics of the target.
(5.2) Use the Hungarian algorithm (an algorithm for finding the maximum matching in a bipartite graph) to find the optimal matching of the multiple targets across two consecutive frames, and associate the detected targets with the historical target tracks. The two sides of the matching are the targets of the previous and current frames, and both motion information and appearance information are used. Motion matching: the detected target state (u, v, r, h) is matched against the state (u, v, r, h) predicted by the Kalman filter of step six, and if their distance is below the threshold t = 9.4877 they are considered the same target. Appearance matching: the two sides are the appearance information of the targets detected in the two frames; appearance features are extracted with the re-identification model of (5.1), and the minimum cosine distance between feature vectors is used as the appearance similarity measure. Once the two matching measures are determined, a weighted Hungarian algorithm finds the optimal matching between the targets of the two frames.
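The following is a minimal sketch of this weighted assignment using SciPy's Hungarian solver; apart from the stated gate of 9.4877, the cost-fusion weight is an assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm solver

MOTION_GATE = 9.4877  # threshold on the motion distance, as stated in the description

def associate(tracks, detections, motion_cost, appearance_cost, alpha=0.3):
    """Weighted assignment between historical tracks and current detections.

    motion_cost[i, j]     - ndarray: distance between track i's predicted state and detection j
    appearance_cost[i, j] - ndarray: minimum cosine distance between re-ID feature vectors
    alpha                 - fusion weight between the two costs (assumed value)
    """
    cost = (alpha * np.asarray(motion_cost, dtype=float)
            + (1.0 - alpha) * np.asarray(appearance_cost, dtype=float))
    # Disallow pairs whose motion distance exceeds the gate.
    cost[np.asarray(motion_cost) > MOTION_GATE] = 1e6
    rows, cols = linear_sum_assignment(cost)
    matches = [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] < 1e6]
    matched_t = {m[0] for m in matches}
    matched_d = {m[1] for m in matches}
    unmatched_tracks = [i for i in range(len(tracks)) if i not in matched_t]
    unmatched_dets = [j for j in range(len(detections)) if j not in matched_d]
    return matches, unmatched_tracks, unmatched_dets
```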
Step six: and using a Kalman filter as a tracker to estimate the position of the current target in the next frame, and fusing the predicted position and the detection position of the detector as output to smooth the track and realize multi-target tracking.
This step is achieved by the following sub-steps:
and (6.1) state definition of the tracking target. Using 8-dimensional space
Figure BDA0002831394550000071
Defining the state of a tracking target, (u, v) represents the center position of a two-dimensional frame, r represents the aspect ratio, h represents the height,
Figure BDA0002831394550000072
representing the rate of change of each of the aforementioned states in the coordinate system.
And (6.2) solving by using a Kalman filter. And the Kalman filtering utilizes a linear system state equation, observation data is input and output through the system, and the optimal estimation is carried out on the system state. The method assumes that a Kalman filter adopts a uniform motion model and a linear observation model, and uses a standard Kalman filter to solve, wherein the observation variables are (u, v, r, h).
(6.3) Multi-target track management and prediction. A threshold A_max is defined, and a variable a_k records the time elapsed since the last successful match of track k; when a_k exceeds A_max, the track is considered ended and its recording stops. A newly generated track is observed over the next 3 frames: if it can be matched successfully it is confirmed as a new track, otherwise it is deleted.
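As an illustration, a compact sketch of a constant-velocity Kalman filter over this 8-dimensional state follows; the process and measurement noise levels and the one-frame timestep are assumptions, while the state layout and the observed variables (u, v, r, h) follow the description.

```python
import numpy as np

class BoxKalmanFilter:
    """Constant-velocity Kalman filter over the state (u, v, r, h, du, dv, dr, dh)."""
    def __init__(self, initial_box, q=1e-2, r_obs=1e-1):
        self.x = np.zeros(8)
        self.x[:4] = initial_box            # (u, v, r, h) from the first detection
        self.P = np.eye(8)                  # state covariance
        self.F = np.eye(8)                  # state transition: position += velocity * dt
        self.F[:4, 4:] = np.eye(4)          # dt = 1 frame (assumption)
        self.H = np.eye(4, 8)               # only (u, v, r, h) is observed
        self.Q = q * np.eye(8)              # process noise (assumed)
        self.R = r_obs * np.eye(4)          # measurement noise (assumed)

    def predict(self):
        """Prior estimate of the target state in the next frame."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:4]

    def update(self, detected_box):
        """Fuse the detector's box with the prediction to smooth the trajectory."""
        z = np.asarray(detected_box, dtype=float)
        y = z - self.H @ self.x                          # innovation
        S = self.H @ self.P @ self.H.T + self.R          # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(8) - K @ self.H) @ self.P
        return self.x[:4]
```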

Claims (5)

1. An anti-blur unmanned vehicle multi-target tracking method based on a generative adversarial network, characterized by comprising the following steps:
Step one: acquire a road-condition video sequence using the on-board camera of the unmanned vehicle.
Step two: use a blurred-image detection method to determine whether each image in the road-condition video sequence acquired in step one is a blurred image or not; if the image is sharp, go directly to step four, and if the image is blurred, go to step three.
Step three: apply a deblurring generative adversarial network to the blurred image from step two to remove the blur and make the image sharp.
Step four: detect the targets appearing in each frame of the sharp images obtained in steps two and three using a target detection algorithm based on a single neural network, and determine the main targets to be tracked by multi-target tracking.
Step five: for the main targets from step four, perform data association using a re-identification model and the Hungarian algorithm, associating the currently detected targets with the historical target tracks to obtain complete target trajectories.
Step six: use a Kalman filter as the tracker to estimate the position of the current target in the next frame as prior information for target tracking, and fuse the predicted position with the detector's detection as the output to smooth the trajectory and complete the target tracking task.
2. The anti-blur unmanned vehicle multi-target tracking method according to claim 1, wherein the second step is realized by the following sub-steps:
(2.1) Graying and Laplacian filtering: convert the RGB color image into a grayscale image and filter it with the Laplacian operator to preprocess the image.
(2.2) Variance calculation: the more severe the blur, the lower the variance of the filtered image, while a sharp image has a higher variance; when the variance exceeds the threshold 200, the image is judged to be non-blurred.
(2.3) Preventing misjudgment using the previous frame: if the previous frame is blurred, the current image is judged non-blurred when the variance ratio of the current frame to the previous frame exceeds the threshold 5; if the previous frame is not blurred, the current image is judged non-blurred when that variance ratio exceeds the threshold 0.3. Otherwise, the image is blurred.
3. The anti-blur unmanned vehicle target tracking method according to claim 1, wherein the third step is realized by the following sub-steps:
and (3.1) constructing a defuzzification generation countermeasure network. Constructing a generator network: the method comprises the steps of designing a network structure based on a super-resolution reconstruction depth network, simulating iterative fitting residual errors through convolution operation, approximating a clear image, wherein an improved neural network consists of 2 convolution layers and 9 blocks, and each block consists of 2 series-connected 3 x 3 convolution layers, a normalization layer and a linear rectification function. Constructing a discriminator network: the design of discriminator network is realized by using the area generation countermeasure network in the image translation algorithm based on the condition generation countermeasure network, the area generation countermeasure network changes the generation countermeasure network discriminator into the full convolution network, and the input is mapped into the matrix X, X of NxNijThe value of (d) represents the probability that each matrix is a true sample, and averaging is the final output of the discriminator.
(3.2) determining the deblurring generates a countering network loss function, countering loss function LallComprises four parts, namely a conditional generation antagonistic network loss function LcGANError, error ofSum of squares loss function L2Structural similarity loss function LssimThe perceptual loss function Lperceptual,kn(n-1, 2,3) is the corresponding hyperparameter.
Lall=LcGAN+(k1)L2+(k2)Lssim+(k3)Lperceptual
Conditional GAN loss function:
L_cGAN = E_{x,y}[log D(x, y)] + E_{x,z}[log(1 − D(x, G(x, z)))]
where G denotes the generator, D the discriminator, E(·) the expected value over the corresponding distribution, x the blurred image, y the sharp image, z the noise and P_data(x) the sample distribution; the cross-entropy loss is used as the conditional GAN loss function.
Structural similarity loss function SSIM(x, y):
SSIM(x, y) = [(2·μ_x·μ_y + C_1)(2·σ_xy + C_2)] / [(μ_x² + μ_y² + C_1)(σ_x² + σ_y² + C_2)]
SSIM(x, y) can be viewed as the product of two terms: the image luminance similarity l(x, y) and the image contrast similarity c(x, y).
l(x, y) = (2·μ_x·μ_y + C_1) / (μ_x² + μ_y² + C_1)
c(x, y) = (2·σ_xy + C_2) / (σ_x² + σ_y² + C_2)
where μ_x and σ_x² denote the mean and variance of the deblurred image, μ_y and σ_y² the mean and variance of the original sharp image, σ_xy their covariance, and C_1 and C_2 are constants used for numerical stability. For a three-channel RGB image, the channel values are averaged before the local mean and variance are computed. The corresponding structural loss L_ssim is:
L_ssim = 1 − SSIM(x, y)
Perceptual loss function L_perceptual:
L_perceptual = (1 / (C_j · W_j · H_j)) · || φ_j(I^S) − φ_j(G(I^B)) ||²
C_j, W_j and H_j are the number of channels, the width and the height of the j-th feature map of the network, φ_j is the output of the network at the j-th convolutional layer, G(I^B) is the generator output for the input blurred image, I^S is the corresponding sharp image, and I^B is the corresponding blurred image.
(3.3) Train the deblurring generative adversarial network. During training, the network convolution kernels are 3 x 3, the batch size is 8, the initial learning rates of the generator and the discriminator are both 0.01, and training is carried out on two GTX 1080 Ti graphics cards with the Adam optimizer. The network training process is as follows:
Initialization: set the initial learning rates ρ_G and ρ_D to 0.01 and the loss function weights k_n (n = 1, 2, 3);
Update the generator G: sample N pairs from the training set, (x, y) = {(x_1, y_1), …, (x_N, y_N)};
Update the parameters of G by gradient descent on the total loss:
θ_G ← θ_G − ρ_G · ∇_{θ_G} (1/N) · Σ_{i=1}^{N} L_all(x_i, y_i)
Train the discriminator D several times, updating the parameters of D by gradient ascent on the adversarial objective:
θ_D ← θ_D + ρ_D · ∇_{θ_D} (1/N) · Σ_{i=1}^{N} [log D(x_i, y_i) + log(1 − D(x_i, G(x_i)))]
A trained generative adversarial network model is obtained after training;
(3.4) Deblur the blurred image: input the blurred image into the trained generative adversarial network to obtain a deblurred sharp image.
4. The anti-blur unmanned vehicle target tracking method according to claim 1, wherein the fourth step is realized by the following sub-steps:
and (4.1) loading a base model pre-trained by a target detection algorithm based on a single neural network.
And (4.2) inputting the obtained clear image into a target detection algorithm based on a single neural network to realize primary target identification.
And (4.3) the labels of the screened targets are pedestrians and vehicles which are used as targets for multi-target tracking.
5. The anti-blur unmanned vehicle target tracking method according to claim 1, wherein the step five is realized by the following sub-steps:
and (5.1) loading a pre-trained re-recognition model, and distinguishing different pedestrians or vehicles in the detected target.
And (5.2) finding the optimal matching solution of a plurality of targets of the front frame and the rear frame by using the Hungarian algorithm, and associating the detected targets with the historical target tracks.
CN202011460523.6A 2020-12-11 2020-12-11 Anti-blur unmanned vehicle multi-target tracking method based on generative adversarial network Pending CN112446436A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011460523.6A CN112446436A (en) 2020-12-11 2020-12-11 Anti-blur unmanned vehicle multi-target tracking method based on generative adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011460523.6A CN112446436A (en) 2020-12-11 2020-12-11 Anti-blur unmanned vehicle multi-target tracking method based on generative adversarial network

Publications (1)

Publication Number Publication Date
CN112446436A true CN112446436A (en) 2021-03-05

Family

ID=74740344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011460523.6A Pending CN112446436A (en) 2020-12-11 2020-12-11 Anti-blur unmanned vehicle multi-target tracking method based on generative adversarial network

Country Status (1)

Country Link
CN (1) CN112446436A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095135A (en) * 2021-03-09 2021-07-09 武汉理工大学 System, method, device and medium for beyond-the-horizon target detection based on GAN
CN113313012A (en) * 2021-05-26 2021-08-27 北京航空航天大学 Dangerous driving behavior identification method based on convolution generation countermeasure network
CN114663318A (en) * 2022-05-25 2022-06-24 江西财经大学 Fundus image generation method and system based on generation countermeasure network
CN114820389A (en) * 2022-06-23 2022-07-29 北京科技大学 Face image deblurring method based on unsupervised decoupling representation
WO2023272414A1 (en) * 2021-06-28 2023-01-05 华为技术有限公司 Image processing method and image processing apparatus
CN115731516A (en) * 2022-11-21 2023-03-03 国能九江发电有限公司 Behavior recognition method and device based on target tracking and storage medium
CN116523754A (en) * 2023-05-10 2023-08-01 广州民航职业技术学院 Method and system for enhancing quality of automatically-identified image of aircraft skin damage
CN113298007B (en) * 2021-06-04 2024-05-03 西北工业大学 Small sample SAR image target recognition method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106710653A (en) * 2016-12-05 2017-05-24 浙江大学 Real-time data abnormal diagnosis method for monitoring operation of nuclear power unit
CN111428575A (en) * 2020-03-02 2020-07-17 武汉大学 Tracking method for fuzzy target based on twin network
CN111488795A (en) * 2020-03-09 2020-08-04 天津大学 Real-time pedestrian tracking method applied to unmanned vehicle
CN111914664A (en) * 2020-07-06 2020-11-10 同济大学 Vehicle multi-target detection and track tracking method based on re-identification

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106710653A (en) * 2016-12-05 2017-05-24 浙江大学 Real-time data abnormal diagnosis method for monitoring operation of nuclear power unit
CN111428575A (en) * 2020-03-02 2020-07-17 武汉大学 Tracking method for fuzzy target based on twin network
CN111488795A (en) * 2020-03-09 2020-08-04 天津大学 Real-time pedestrian tracking method applied to unmanned vehicle
CN111914664A (en) * 2020-07-06 2020-11-10 同济大学 Vehicle multi-target detection and track tracking method based on re-identification

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘创: "Research on Multi-Target Fault-Tolerant Tracking and Trajectory Prediction for Unmanned Vehicles", China Excellent Master's and Doctoral Dissertations Full-text Database (Master's), Engineering Science and Technology II *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095135A (en) * 2021-03-09 2021-07-09 武汉理工大学 System, method, device and medium for beyond-the-horizon target detection based on GAN
CN113313012A (en) * 2021-05-26 2021-08-27 北京航空航天大学 Dangerous driving behavior identification method based on convolution generation countermeasure network
CN113298007B (en) * 2021-06-04 2024-05-03 西北工业大学 Small sample SAR image target recognition method
WO2023272414A1 (en) * 2021-06-28 2023-01-05 华为技术有限公司 Image processing method and image processing apparatus
CN114663318A (en) * 2022-05-25 2022-06-24 江西财经大学 Fundus image generation method and system based on generation countermeasure network
CN114663318B (en) * 2022-05-25 2022-08-30 江西财经大学 Fundus image generation method and system based on generation countermeasure network
CN114820389A (en) * 2022-06-23 2022-07-29 北京科技大学 Face image deblurring method based on unsupervised decoupling representation
CN114820389B (en) * 2022-06-23 2022-09-23 北京科技大学 Face image deblurring method based on unsupervised decoupling representation
CN115731516A (en) * 2022-11-21 2023-03-03 国能九江发电有限公司 Behavior recognition method and device based on target tracking and storage medium
CN116523754A (en) * 2023-05-10 2023-08-01 广州民航职业技术学院 Method and system for enhancing quality of automatically-identified image of aircraft skin damage

Similar Documents

Publication Publication Date Title
CN112446436A (en) Anti-blur unmanned vehicle multi-target tracking method based on generative adversarial network
US10984532B2 (en) Joint deep learning for land cover and land use classification
EP3614308B1 (en) Joint deep learning for land cover and land use classification
Bautista et al. Convolutional neural network for vehicle detection in low resolution traffic videos
CN113223059B (en) Weak and small airspace target detection method based on super-resolution feature enhancement
CN106683119B (en) Moving vehicle detection method based on aerial video image
CN109767454B (en) Unmanned aerial vehicle aerial video moving target detection method based on time-space-frequency significance
Sharma et al. Single image defogging using deep learning techniques: past, present and future
CN111046880A (en) Infrared target image segmentation method and system, electronic device and storage medium
CN110827332B (en) Convolutional neural network-based SAR image registration method
CN109345474A (en) Image motion based on gradient field and deep learning obscures blind minimizing technology
CN108804992B (en) Crowd counting method based on deep learning
CN114897728A (en) Image enhancement method and device, terminal equipment and storage medium
CN112560717A (en) Deep learning-based lane line detection method
CN112418149A (en) Abnormal behavior detection method based on deep convolutional neural network
Malav et al. DHSGAN: An end to end dehazing network for fog and smoke
Kadim et al. Deep-learning based single object tracker for night surveillance.
CN113129336A (en) End-to-end multi-vehicle tracking method, system and computer readable medium
Babu et al. An efficient image dahazing using Googlenet based convolution neural networks
CN110751670A (en) Target tracking method based on fusion
CN113627481A (en) Multi-model combined unmanned aerial vehicle garbage classification method for smart gardens
Abraham et al. A fuzzy based road network extraction from degraded satellite images
KR et al. Moving vehicle identification using background registration technique for traffic surveillance
Kim et al. Unsupervised moving object segmentation and recognition using clustering and a neural network
Yufeng et al. Research on SAR image change detection algorithm based on hybrid genetic FCM and image registration

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210305

WD01 Invention patent application deemed withdrawn after publication