CN111062973A - Vehicle tracking method based on target feature sensitivity and deep learning

Vehicle tracking method based on target feature sensitivity and deep learning

Info

Publication number
CN111062973A
CN111062973A (application number CN201911408023.5A; granted publication CN111062973B)
Authority
CN
China
Prior art keywords
picture
target
tracking
filter
discriminant
Prior art date
Legal status
Granted
Application number
CN201911408023.5A
Other languages
Chinese (zh)
Other versions
CN111062973B (en)
Inventor
韩冰
李凯
杨铮
朱考进
郭凯珺
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201911408023.5A priority Critical patent/CN111062973B/en
Publication of CN111062973A publication Critical patent/CN111062973A/en
Application granted granted Critical
Publication of CN111062973B publication Critical patent/CN111062973B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10016 - Video; Image sequence
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20024 - Filtering details
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a vehicle tracking method based on target feature sensitivity and deep learning, which mainly solves the prior-art problem that, when occlusion, illumination change and the like occur during vehicle tracking, interferents similar to the vehicle target are easily mistaken for the vehicle target, causing tracking failure. The method comprises the following steps: constructing and training a discriminant connected network, extracting features through a trained public network model, selecting the filters that are more sensitive to the vehicle target, and tracking the vehicle target by using the discriminant connected network and the selected sensitive filters. By introducing the selection and use of a sensitive filter bank, the method has the advantages of strong robustness, a good tracking effect, low computational cost and easy implementation.

Description

Vehicle tracking method based on target feature sensitivity and deep learning
Technical Field
The invention belongs to the technical field of image processing, and further relates to a vehicle tracking method based on target feature sensitivity and deep learning in the technical field of target tracking. The invention can be used for tracking moving vehicles in unmanned driving, driving assistance and intelligent transportation.
Background
The task of vehicle tracking is to predict the size and position of a vehicle in subsequent frames, given its size and position in an initial frame of a video sequence. Tracking based on correlation filtering has attracted much attention because of its real-time performance. Correlation-filter tracking uses the tracking result of the previous frame as training data to update the filter template, and obtains a response map by correlating the filter template with the features extracted from the current frame; the position of the maximum response point on the response map is the position of the vehicle target. To cope with appearance changes of the target during tracking, various feature descriptors have been designed, such as HOG features and SIFT features. With the rapid development of deep learning in target detection, image classification and image segmentation, applying deep neural networks as feature extractors to vehicle tracking has become a recent trend.
The patent document "A road vehicle tracking method based on multi-feature fusion" (patent application number: 201910793516.9, publication number: CN 110517291A), filed by Nanjing University of Posts and Telecommunications, discloses a road vehicle tracking method based on multi-feature space fusion. The method first reads a video, divides it into image frames, selects the area where the vehicle target is located, converts the input image frame from the RGB color space to the HSV color space, and takes the color histogram as the color feature; it computes horizontal, vertical and diagonal edge features by constructing an integral image to obtain Haar-like shape features; it then establishes a target model and a candidate model in the vertical-edge, horizontal-edge, diagonal-edge and color feature spaces respectively, measures the similarity between the two models with the Bhattacharyya coefficient, and iteratively computes, using a mean-shift algorithm, the position of the candidate model most similar to the target model in the current frame; finally, it finds four possible target positions in the color, horizontal-edge, vertical-edge and diagonal-edge feature spaces and fuses them by weighting to obtain the final target position. The disadvantage of this method is that, because Haar-like shape features are used to describe the appearance of the vehicle, when illumination changes, mutual occlusion of vehicles or motion blur of the vehicle occurs, Haar-like features easily mistake an interfering object similar to the vehicle target for the vehicle target, and tracking fails. In real-time tracking under actual road conditions, mutual occlusion between vehicles is very common, so the robustness of this method cannot meet the requirements of vehicle tracking under actual road conditions.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a vehicle tracking method based on target feature sensitivity and deep learning, which is used for solving the problem of tracking failure caused by occlusion, illumination change and the like during vehicle tracking.
The idea for realizing the object of the invention is as follows: construct and train a discriminant connected network, extract features through a trained public network model, select the filters that are more sensitive to the vehicle target, and track the vehicle target by using the discriminant connected network and the selected sensitive filters.
The method comprises the following specific steps:
step 1, constructing a discriminant type connected network:
two identical sub-networks are built, and each sub-network has five layers of structures which are as follows in sequence: first convolution layer → first downsampling layer → second convolution layer → second downsampling layer → third convolution layer; setting the number of convolution kernels of the first convolution layer, the second convolution layer and the third convolution layer as 16, 32 and 1 in sequence, and setting the sizes of the convolution kernels as 3 x 3, 3 x 3 and 1 x 1 in sequence; setting the filter sizes of the first and second downsampling layers to be 2 x 2;
two sub-networks are arranged in parallel up and down and then connected with a cross-correlation layer XCorr to form a discriminant type connected network; setting a loss function of the discriminant connected network as a contrast loss function;
step 2, generating a training set:
randomly collecting at least 1000 pictures from a continuous video, wherein each picture comprises at least one target and marks the target; cutting the marked target in the picture into a 127 x 127 picture, and randomly cutting the background in the picture into a 127 x 127 picture;
combining the cut target picture and the cut background picture into a picture pair at random, wherein each picture pair at least comprises one target picture; if two pictures in the picture pair are the same target, setting the label of the picture pair to be 1; if the two pictures in the image pair are two different target pictures or a target picture and a background picture, setting the label of the picture pair to be 0; all the picture pairs and the labels thereof form a training set;
step 3, training a discriminant connected network:
inputting the training set into a discriminant connected network, and iteratively updating the network weight by using an Adam optimization algorithm until a comparison loss function is converged to obtain the trained discriminant connected network;
step 4, calculating a filtering template:
firstly, a rectangular frame is formed in a first frame of a tracking video and clings to the periphery of a tracked vehicle target, all pixel points in the range of the rectangular frame are extracted to form a real target picture, and all pixel points in the rectangular frame with the central point of the rectangular frame as the center and with the width and the height expanded by two times respectively form an initial filtering sample picture;
secondly, generating initial filter labels corresponding to each pixel point in the initial filter sample picture one by using a filter label generation formula, and forming the initial filter labels of all the pixel points into a label picture;
inputting the initial filtering sample picture into a trained public network model, outputting two-dimensional sub-feature matrixes with the same number as the last layer of filter of the model, and summing elements at the same positions in all the two-dimensional sub-feature matrixes to obtain a two-dimensional deep layer feature matrix of the initial filtering sample picture;
fourthly, generating a filtering template by using a filtering template calculation formula and the two-dimensional deep layer feature matrix of the tag picture and the initial filtering sample picture;
step 5, determining a sensitive filter combination:
firstly, performing related filtering operation by using each two-dimensional sub-feature matrix in an initial filtering picture and a filtering template to obtain response graphs with the same number as that of filters;
the second step, comparing the magnitude of each response point value in each response map and determining the maximum response point of each response map;
thirdly, the distance between the maximum response point of each response image and the central point of the label image is calculated, and filters corresponding to the first 100 distance values are found out according to the sequence from small to large to form a sensitive filter combination;
step 6, setting a first frame of the tracking video as a current frame;
step 7, positioning a tracked vehicle target in the next frame image of the current frame;
step 8, generating a target picture to be evaluated:
taking the positioned position as a center in the next frame of the current frame, extracting all pixel points in the area with the same size as the real target picture generated in the first step of the step 4 to form a target picture to be evaluated;
step 9, inputting the real target picture and the target picture to be evaluated into the discriminant connected network trained in the step 3, judging whether the output of the discriminant connected network is 1, if so, setting the next frame of the current frame as the current frame, and then executing the step 11; otherwise, the tracking is regarded as failed, and the step 10 is executed;
step 10, repositioning the tracking target:
inputting the next frame of the current frame into a common detector to output the position of the vehicle target to be tracked, taking the output target position as the position of the tracked vehicle target in the next frame of the current frame, and executing the step 11 after setting the next frame of the current frame as the current frame;
step 11, judging whether the current frame is the last frame of the tracking video, if so, executing step 12, otherwise, executing step 7;
step 12, finishing the vehicle tracking process.
Compared with the prior art, the invention has the following advantages:
firstly, the invention selects the filter which is more sensitive to the vehicle target, and can accurately extract the characteristics of the tracked vehicle target when similar interference occurs, thereby overcoming the problem that the interference similar to the vehicle target is easily judged as the vehicle target when illumination change, mutual vehicle shielding and vehicle motion blurring occur in the prior art, and having the advantages of low calculated amount and strong robustness.
Secondly, the method can evaluate the tracking result by constructing and training the discriminant connection network, can relocate the vehicle target after the tracking fails, and overcomes the problem that the tracking is difficult to continue after the tracking fails in the prior art, so that the method has the advantage of high tracking accuracy.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a structural diagram of the discriminant connected network constructed by the present invention;
FIG. 3 is a filter label diagram of the present invention.
Detailed Description
The technical solution and effects of the present invention will be further described in detail with reference to the accompanying drawings.
The specific implementation steps of the present invention are further described in detail with reference to fig. 1.
Step 1, constructing a discriminant connection network.
Two identical sub-networks are built; each sub-network has five layers, which are, from left to right: first convolution layer → first downsampling layer → second convolution layer → second downsampling layer → third convolution layer. The numbers of convolution kernels of the first, second and third convolution layers are set to 16, 32 and 1 in sequence, and the kernel sizes are set to 3 x 3, 3 x 3 and 1 x 1 in sequence; the filter sizes of the first and second downsampling layers are set to 2 × 2.
Two sub-networks are arranged in parallel up and down and then connected with a cross-correlation layer XCorr to form a discriminant type connected network; and setting the loss function of the discriminant connected network as a contrast loss function.
The discriminant connected network constructed by the present invention is further described with reference to fig. 2.
The upper and lower rows in fig. 2 represent the two sub-networks respectively; referring to fig. 2, the layers of each sub-network are, from left to right, the first convolution layer, the first downsampling layer, the second convolution layer, the second downsampling layer and the third convolution layer, and the two sub-networks, arranged in parallel, are connected to the cross-correlation layer XCorr.
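As a concrete illustration of the structure described above, the following is a minimal sketch of the two sub-networks and the cross-correlation layer, assuming a PyTorch implementation with RGB inputs and ReLU activations; the input channel count, the activations and the names SubNetwork and DiscriminantSiamese are illustrative assumptions and are not specified by the invention.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SubNetwork(nn.Module):
    # One of the two identical five-layer sub-networks of step 1.
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3)   # first convolution layer: 16 kernels of 3 x 3
        self.pool1 = nn.MaxPool2d(2)                    # first downsampling layer: 2 x 2
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3)  # second convolution layer: 32 kernels of 3 x 3
        self.pool2 = nn.MaxPool2d(2)                    # second downsampling layer: 2 x 2
        self.conv3 = nn.Conv2d(32, 1, kernel_size=1)   # third convolution layer: 1 kernel of 1 x 1

    def forward(self, x):
        x = self.pool1(torch.relu(self.conv1(x)))
        x = self.pool2(torch.relu(self.conv2(x)))
        return self.conv3(x)

class DiscriminantSiamese(nn.Module):
    # Two weight-sharing sub-networks joined by the cross-correlation layer XCorr.
    def __init__(self):
        super().__init__()
        self.branch = SubNetwork()                      # the two branches are identical (shared weights)

    def forward(self, img_a, img_b):
        # img_a and img_b are batches of picture pairs of equal spatial size
        feat_a = self.branch(img_a)                     # (n, 1, H, W)
        feat_b = self.branch(img_b)                     # (n, 1, H, W)
        n = feat_a.size(0)
        # cross-correlate each pair independently (grouped-convolution trick);
        # with equal-sized inputs this yields one similarity score per pair
        score = F.conv2d(feat_a.view(1, n, feat_a.size(2), feat_a.size(3)), feat_b, groups=n)
        return score.view(n)

For 127 x 127 inputs this produces a single cross-correlation score per picture pair, which is later thresholded in step 9.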
Step 2, generating a training set.
Randomly collecting at least 1000 pictures from a continuous video, wherein each picture comprises at least one target and marks the target; the labeled objects in the picture are cut out into 127 × 127 pictures, and the background in the pictures is randomly cut out into 127 × 127 pictures.
Combining the cut target picture and the cut background picture into a picture pair at random, wherein each picture pair at least comprises one target picture; if two pictures in the picture pair are the same target, setting the label of the picture pair to be 1; if the two pictures in the image pair are two different target pictures or a target picture and a background picture, setting the label of the picture pair to be 0; and (4) forming a training set by all the picture pairs and the labels thereof.
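As an illustration of how the picture pairs and their 0/1 labels can be assembled, the following sketch assumes the 127 x 127 crops have already been produced; the function name build_training_pairs and the 50/50 sampling ratios are illustrative assumptions, not requirements of the invention.

import random

def build_training_pairs(target_crops, background_crops, num_pairs):
    # target_crops: dict mapping each target identity to its list of 127 x 127 crops
    # background_crops: list of 127 x 127 background crops
    # returns (picture_a, picture_b, label) triples: label 1 for two crops of the
    # same target, label 0 for two different targets or a target and a background
    pairs = []
    ids = list(target_crops.keys())
    for _ in range(num_pairs):
        anchor_id = random.choice(ids)
        picture_a = random.choice(target_crops[anchor_id])   # every pair contains at least one target crop
        if random.random() < 0.5:                            # positive pair: same target
            picture_b = random.choice(target_crops[anchor_id])
            label = 1
        else:                                                 # negative pair: another target or a background crop
            if len(ids) > 1 and random.random() < 0.5:
                other_id = random.choice([i for i in ids if i != anchor_id])
                picture_b = random.choice(target_crops[other_id])
            else:
                picture_b = random.choice(background_crops)
            label = 0
        pairs.append((picture_a, picture_b, label))
    return pairs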
Step 3, training the discriminant connected network.
The training set is input into the discriminant connected network, and the network weights are iteratively updated with the Adam optimization algorithm until the contrast loss function converges, yielding the trained discriminant connected network.
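The training step can be sketched as follows, reusing the DiscriminantSiamese sketch from step 1. The exact form of the contrast (contrastive) loss is not spelled out in the text, so the version below, which maps the cross-correlation score to a dissimilarity in (0, 1) and applies the classical margin-based contrastive loss, is only one plausible reading; the margin value, learning rate and epoch count are likewise assumptions.

import torch

def contrastive_loss(score, label, margin=0.5):
    # map the cross-correlation score to a dissimilarity d in (0, 1):
    # positive pairs (label 1) are pushed towards d = 0,
    # negative pairs (label 0) are pushed beyond the margin
    d = 1.0 - torch.sigmoid(score)
    positive = label * d.pow(2)
    negative = (1.0 - label) * torch.clamp(margin - d, min=0).pow(2)
    return (positive + negative).mean()

def train_network(network, loader, epochs=50, lr=1e-3):
    # loader yields batches (picture_a, picture_b, label) built from the training set of step 2
    optimiser = torch.optim.Adam(network.parameters(), lr=lr)   # Adam optimization algorithm
    for _ in range(epochs):
        for picture_a, picture_b, label in loader:
            score = network(picture_a, picture_b)
            loss = contrastive_loss(score, label.float())
            optimiser.zero_grad()
            loss.backward()
            optimiser.step()
    return network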
Step 4, calculating the filtering template.
First, a rectangular frame is drawn in the first frame of the tracking video tightly around the tracked vehicle target; all pixel points within the rectangular frame are extracted to form the real target picture, and all pixel points within a rectangular region centered on the center point of that frame, with the width and the height each expanded to twice the original, form the initial filtering sample picture.
Second, initial filter labels corresponding one-to-one to the pixel points of the initial filtering sample picture are generated by using the following filter label generation formula, and the initial filter labels of all pixel points form a label picture:
g(x, y) = (1/(2πσ²)) · e^(−((x − x_c)² + (y − y_c)²)/(2σ²))
wherein g(x, y) represents the initial filter label corresponding to the pixel point at (x, y) in the filtering sample, π represents the circumference ratio, σ represents a control parameter with a value of 0.5, e represents an exponential operation with the natural constant as its base, x_c represents the abscissa of the central pixel point of the initial filtering sample picture, and y_c represents the ordinate of the central pixel point of the initial filtering sample picture.
The label picture generated by the present invention will be further described with reference to fig. 3.
The size of fig. 3 is the same as the size of the initial filtered sample picture, and the white dots in the center of fig. 3 represent the locations of tracked vehicle targets in the initial filtered sample picture.
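A sketch of how the label picture can be generated from the filter label generation formula above is given below (NumPy assumed); the function name is illustrative, and the 1/(2πσ²) normalisation follows the reading of the formula given above.

import numpy as np

def make_label_picture(height, width, sigma=0.5):
    # Gaussian initial filter labels, one per pixel of the initial filtering
    # sample picture, peaked at the central pixel point (x_c, y_c)
    y_c, x_c = (height - 1) / 2.0, (width - 1) / 2.0
    y, x = np.mgrid[0:height, 0:width]
    squared_distance = (x - x_c) ** 2 + (y - y_c) ** 2
    return np.exp(-squared_distance / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)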
Third, the initial filtering sample picture is input into the trained public network model, which outputs two-dimensional sub-feature matrices equal in number to the filters of the last layer of the model; the elements at the same positions in all the two-dimensional sub-feature matrices are summed to obtain the two-dimensional deep feature matrix of the initial filtering sample picture.
Fourth, the filtering template is generated from the label picture and the two-dimensional deep feature matrix of the initial filtering sample picture by using the following filtering template calculation formula:
F(h) = (F(g) · F(f)*) / (F(f) · F(f)*)
wherein F(·) represents the Fourier transform operation, h represents the filtering template, * represents the conjugate transpose operation, g represents the label picture, and f represents the two-dimensional deep feature matrix of the initial filtering sample picture.
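Under the reading of the filtering template calculation formula given above, the template can be computed in the Fourier domain as sketched below; the small constant eps is an assumption added only for numerical stability and is not part of the formula.

import numpy as np

def compute_filter_template(label_picture, deep_feature, eps=1e-8):
    # F(h) = F(g) * conj(F(f)) / (F(f) * conj(F(f)) + eps), element by element,
    # followed by an inverse Fourier transform to obtain the spatial template h
    G = np.fft.fft2(label_picture)
    F_feat = np.fft.fft2(deep_feature)
    H = G * np.conj(F_feat) / (F_feat * np.conj(F_feat) + eps)
    return np.real(np.fft.ifft2(H))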
Step 5, determining the sensitive filter combination.
First, a correlation filtering operation is performed between each two-dimensional sub-feature matrix of the initial filtering sample picture and the filtering template to obtain response maps equal in number to the filters.
Second, the response point values in each response map are compared to determine the maximum response point of each response map.
Third, the distance between the maximum response point of each response map and the center point of the label picture is calculated, and the filters corresponding to the 100 smallest distance values are taken to form the sensitive filter combination.
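The selection of the sensitive filter combination can be sketched as follows, assuming the sub-feature matrices of the initial filtering sample picture and the filtering template are already available; the correlation is computed in the Fourier domain, and the function name is illustrative.

import numpy as np

def select_sensitive_filters(sub_features, filter_template, label_picture, top_k=100):
    # sub_features: list of two-dimensional sub-feature matrices, one per filter
    # of the last layer of the public network model
    centre = np.array(label_picture.shape) / 2.0
    template_fft = np.fft.fft2(filter_template)
    distances = []
    for feature in sub_features:
        # correlation filtering between the sub-feature matrix and the template
        response = np.real(np.fft.ifft2(np.fft.fft2(feature) * np.conj(template_fft)))
        peak = np.unravel_index(np.argmax(response), response.shape)   # maximum response point
        distances.append(np.linalg.norm(np.array(peak) - centre))      # distance to the label centre
    # the filters with the 100 smallest distances form the sensitive filter combination
    return np.argsort(distances)[:top_k]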
Step 6, setting the first frame of the tracking video as the current frame.
Step 7, locating the tracked vehicle target in the next frame image of the current frame.
First, the position and size of the tracked vehicle target in the current frame of the tracking video are read, and the search area range is obtained by taking the center point of the vehicle target as the center and expanding the width and the height to twice the original each.
Second, all pixel points within the search area range are extracted from the next frame image of the tracking video to form a search area picture; the search area picture is input into the public network model, and the sensitive sub-features extracted by each filter in the sensitive filter combination determined in step 5 are summed to obtain the sensitive features of the search area picture.
Third, a correlation filtering operation is performed between the sensitive features and the filtering template to obtain a sensitive response map.
Fourth, the response point values in the sensitive response map are compared to determine the maximum response point, and the position of the maximum response point is taken as the position of the tracked vehicle target in the next frame image.
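Step 7 can be sketched as follows; extract_sub_features is assumed to be a function returning the list of two-dimensional sub-feature matrices produced by the public network model for a given picture, and the correlation is again computed in the Fourier domain.

import numpy as np

def locate_target(search_picture, extract_sub_features, sensitive_idx, filter_template):
    sub_features = extract_sub_features(search_picture)
    # sum only the sensitive sub-features selected in step 5
    sensitive_feature = sum(sub_features[i] for i in sensitive_idx)
    template_fft = np.fft.fft2(filter_template, s=sensitive_feature.shape)   # pad/crop to the search size
    response = np.real(np.fft.ifft2(np.fft.fft2(sensitive_feature) * np.conj(template_fft)))
    peak_y, peak_x = np.unravel_index(np.argmax(response), response.shape)
    return peak_x, peak_y   # position of the maximum response point within the search area picture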
Step 8, generating the target picture to be evaluated.
In the next frame of the current frame, taking the located position as the center, all pixel points in an area of the same size as the real target picture generated in the first substep of step 4 are extracted to form the target picture to be evaluated.
Step 9, the real target picture and the target picture to be evaluated are input into the discriminant connected network trained in step 3, and it is judged whether the output of the discriminant connected network is 1; if so, the next frame of the current frame is set as the current frame and step 11 is executed; otherwise, the tracking is regarded as failed and step 10 is executed.
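The evaluation in step 9 can be sketched as follows, reusing the DiscriminantSiamese sketch from step 1; interpreting "output is 1" as a sigmoid score above a threshold is an assumption, since the text does not state how the raw cross-correlation score is binarised.

import torch

def tracking_succeeded(network, real_target, candidate, threshold=0.5):
    # real_target and candidate are (C, H, W) tensors of equal size (steps 4 and 8)
    with torch.no_grad():
        score = network(real_target.unsqueeze(0), candidate.unsqueeze(0))
    return torch.sigmoid(score).item() > threshold   # True is treated as the network outputting 1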
Step 10, repositioning the tracked target.
The next frame of the current frame is input into a common detector, which outputs the position of the vehicle target to be tracked; the output target position is taken as the position of the tracked vehicle target in the next frame of the current frame, the next frame of the current frame is set as the current frame, and step 11 is executed.
Step 11, it is judged whether the current frame is the last frame of the tracking video; if so, step 12 is executed; otherwise, step 7 is executed.
Step 12, the vehicle tracking process ends.
The effect of the present invention will be further described with reference to simulation experiments.
1. Simulation conditions:
The simulation of the present invention was carried out on an Ubuntu 14.04 system with an Intel(R) Core(TM) i8 CPU at a main frequency of 3.5 GHz and 128 GB of memory, using MATLAB R2014 software and the MatConvNet deep learning toolkit.
2. Simulation content and result analysis:
vehicle tracking in simulation experiment data is simulated by using the method and three methods (a nuclear correlation filtering algorithm is abbreviated as KCF, a full convolution connected network algorithm for tracking is abbreviated as Sim _ FC, and a hierarchical convolution characteristic for tracking is abbreviated as HCFT) in the prior art respectively.
In the simulation experiment, three prior arts are adopted:
the prior art kernel Correlation filter algorithm KCF is a target tracking algorithm, KCF algorithm for short, proposed by Henriques et al in "High-Speed tracking with Kernelized Correlation Filters [ J ]. IEEE Transactions on Pattern Analysis & Machine understanding, 37(3): 583-.
The prior-art fully-convolutional Siamese network tracking algorithm Siam_FC refers to the real-time target tracking algorithm proposed by Bertinetto L et al. in "Bertinetto L, Valmadre J, Henriques J F, et al. Fully-Convolutional Siamese Networks for Object Tracking [J], 2016", abbreviated as the Siam_FC algorithm.
The prior-art hierarchical convolutional feature tracking algorithm HCFT refers to the target tracking algorithm proposed by Ma C et al. in "Ma C, Huang J B, Yang X, et al. Hierarchical Convolutional Features for Visual Tracking", abbreviated as the HCFT algorithm.
The simulation experiment data used by the invention are the common tracking databases OTB and TColor-128, where the OTB database contains 100 video sequences and TColor-128 contains 128 video sequences. The tracking results of the four methods are evaluated with two evaluation indexes, distance accuracy DP and overlap success rate OP: DP is the percentage of frames in which the distance between the predicted target center and the ground-truth center is below a given pixel threshold, and OP is the percentage of frames in which the overlap ratio between the predicted bounding box and the ground-truth bounding box exceeds a given threshold. The average distance accuracy and average overlap success rate over all videos of the OTB database and the TColor-128 database are reported in Table 1 and Table 2:
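Since the exact formulas appear in the original only as images, the sketch below implements the usual OTB-style definitions of the two indexes, with the common thresholds of 20 pixels for DP and 0.5 for OP assumed.

import numpy as np

def distance_accuracy(pred_centres, gt_centres, pixel_threshold=20):
    # DP: fraction of frames whose predicted centre lies within pixel_threshold
    # of the ground-truth centre
    errors = np.linalg.norm(np.asarray(pred_centres, float) - np.asarray(gt_centres, float), axis=1)
    return float(np.mean(errors <= pixel_threshold))

def overlap_success(pred_boxes, gt_boxes, iou_threshold=0.5):
    # OP: fraction of frames whose predicted box (x, y, w, h) overlaps the
    # ground-truth box with an intersection-over-union above iou_threshold
    pred = np.asarray(pred_boxes, float)
    gt = np.asarray(gt_boxes, float)
    x1 = np.maximum(pred[:, 0], gt[:, 0])
    y1 = np.maximum(pred[:, 1], gt[:, 1])
    x2 = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
    y2 = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    iou = inter / union
    return float(np.mean(iou >= iou_threshold))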
the effect of the present invention is further described below with reference to the simulation diagrams of tables 1 and 2.
TABLE 1 Comparison of distance accuracy and overlap success rate on the OTB database
TABLE 2 Comparison of distance accuracy and overlap success rate on the TColor-128 database
It can be seen from Tables 1 and 2 that the present invention achieves better results in both distance accuracy and overlap success rate on the OTB100 and TColor-128 databases, and therefore a better tracking effect. This is mainly because the present invention obtains features that better describe the tracked vehicle target through the sensitive filter combination, and relocates the tracked target after a tracking failure, thereby achieving a more accurate and more robust tracking result.

Claims (5)

1. A vehicle tracking method based on target feature sensitivity and deep learning is characterized in that a discriminant connected network is constructed and trained, features of a tracked vehicle are extracted through a trained public network model, a filter which is more sensitive to a vehicle target is selected from the trained public network model, and the vehicle target is tracked by using the discriminant connected network and the selected sensitive filter, wherein the method specifically comprises the following steps:
step 1, constructing a discriminant type connected network:
two identical sub-networks are built, and each sub-network has five layers of structures which are as follows in sequence: first convolution layer → first downsampling layer → second convolution layer → second downsampling layer → third convolution layer; setting the number of convolution kernels of the first convolution layer, the second convolution layer and the third convolution layer as 16, 32 and 1 in sequence, and setting the sizes of the convolution kernels as 3 x 3, 3 x 3 and 1 x 1 in sequence; setting the filter sizes of the first and second downsampling layers to be 2 x 2;
two sub-networks are arranged in parallel up and down and then connected with a cross-correlation layer XCorr to form a discriminant type connected network; setting a loss function of the discriminant connected network as a contrast loss function;
step 2, generating a training set:
randomly collecting at least 1000 pictures from a continuous video, wherein each picture comprises at least one target and marks the target; cutting the marked target in the picture into a 127 x 127 picture, and randomly cutting the background in the picture into a 127 x 127 picture;
combining the cut target picture and the cut background picture into a picture pair at random, wherein each picture pair at least comprises one target picture; if two pictures in the picture pair are the same target, setting the label of the picture pair to be 1; if the two pictures in the image pair are two different target pictures or a target picture and a background picture, setting the label of the picture pair to be 0; all the picture pairs and the labels thereof form a training set;
step 3, training a discriminant connected network:
inputting the training set into a discriminant connected network, and iteratively updating the network weight by using an Adam optimization algorithm until a comparison loss function is converged to obtain the trained discriminant connected network;
step 4, calculating a filtering template:
firstly, a rectangular frame is formed in a first frame of a tracking video and clings to the periphery of a tracked vehicle target, all pixel points in the range of the rectangular frame are extracted to form a real target picture, and all pixel points in the rectangular frame with the central point of the rectangular frame as the center and with the width and the height expanded by two times respectively form an initial filtering sample picture;
secondly, generating initial filter labels corresponding to each pixel point in the initial filter sample picture one by using a filter label generation formula, and forming the initial filter labels of all the pixel points into a label picture;
inputting the initial filtering sample picture into a trained public network model, outputting two-dimensional sub-feature matrixes with the same number as the last layer of filter of the model, and summing elements at the same positions in all the two-dimensional sub-feature matrixes to obtain a two-dimensional deep layer feature matrix of the initial filtering sample picture;
fourthly, generating a filtering template by using a filtering template calculation formula and the two-dimensional deep layer feature matrix of the tag picture and the initial filtering sample picture;
step 5, determining a sensitive filter combination:
firstly, performing related filtering operation by using each two-dimensional sub-feature matrix in an initial filtering picture and a filtering template to obtain response graphs with the same number as that of filters;
the second step, comparing the magnitude of each response point value in each response map and determining the maximum response point of each response map;
thirdly, the distance between the maximum response point of each response image and the central point of the label image is calculated, and filters corresponding to the first 100 distance values are found out according to the sequence from small to large to form a sensitive filter combination;
step 6, setting a first frame of the tracking video as a current frame;
step 7, positioning a tracked vehicle target in the next frame image of the current frame;
step 8, generating a target picture to be evaluated:
taking the positioned position as a center in the next frame of the current frame, extracting all pixel points in the area with the same size as the real target picture generated in the first step of the step 4 to form a target picture to be evaluated;
step 9, inputting the real target picture and the target picture to be evaluated into the discriminant connected network trained in the step 3, judging whether the output of the discriminant connected network is 1, if so, setting the next frame of the current frame as the current frame, and then executing the step 11; otherwise, the tracking is regarded as failed, and the step 10 is executed;
step 10, repositioning the tracking target:
inputting the next frame of the current frame into a common detector to output the position of the vehicle target to be tracked, taking the output target position as the position of the tracked vehicle target in the next frame of the current frame, and executing the step 11 after setting the next frame of the current frame as the current frame;
step 11, judging whether the current frame is the last frame of the tracking video, if so, executing step 12, otherwise, executing step 7;
step 12, finishing the vehicle tracking process.
2. The target feature sensitivity and deep learning based vehicle tracking method of claim 1, characterized in that: the filter label generation formula in the second substep of step 4 is as follows:
g(x, y) = (1/(2πσ²)) · e^(−((x − x_c)² + (y − y_c)²)/(2σ²))
wherein g(x, y) represents the initial filter label corresponding to the pixel point at (x, y) in the filtering sample, π represents the circumference ratio, σ represents a control parameter with a value of 0.5, e represents an exponential operation with the natural constant as its base, x_c represents the abscissa of the central pixel point of the initial filtering sample picture, and y_c represents the ordinate of the central pixel point of the initial filtering sample picture.
3. The target feature sensitivity and deep learning based vehicle tracking method of claim 1, characterized in that: the trained public network model in the third substep of step 4 refers to a publicly available network model having a depth of at least 19 layers and trained on more than one hundred thousand pictures.
4. The target feature sensitivity and deep learning based vehicle tracking method of claim 1, characterized in that: the filtering template calculation formula in the fourth substep of step 4 is as follows:
F(h) = (F(g) · F(f)*) / (F(f) · F(f)*)
wherein F(·) represents the Fourier transform operation, h represents the filtering template, * represents the conjugate transpose operation, g represents the label picture, and f represents the two-dimensional deep feature matrix of the initial filtering sample picture.
5. The target feature sensitivity and deep learning based vehicle tracking method of claim 1, characterized in that: the specific steps of positioning the tracked vehicle target in the next frame image of the current frame in step 7 are as follows:
reading the position and the size of a tracked vehicle target in a current frame of a tracking video, and obtaining a search area range by taking the central point position of the vehicle target as a center and expanding the width and the height by two times respectively;
secondly, extracting all pixel points in a search area range from a next frame image of a current frame of the tracking video to form a search area picture, inputting the search area picture into a public network model, and summing sensitive sub-features extracted by each filter in the sensitive filter combination determined in the step 5 to obtain sensitive features of the search area picture;
thirdly, performing relevant filtering operation on the sensitive features and a filtering template to obtain a sensitive response graph;
and fourthly, comparing the magnitude of each response point value in the sensitive response graph, determining a maximum response point, and taking the position of the maximum response point as the position of the tracked vehicle target in the next frame of image.
CN201911408023.5A 2019-12-31 2019-12-31 Vehicle tracking method based on target feature sensitivity and deep learning Active CN111062973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911408023.5A CN111062973B (en) 2019-12-31 2019-12-31 Vehicle tracking method based on target feature sensitivity and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911408023.5A CN111062973B (en) 2019-12-31 2019-12-31 Vehicle tracking method based on target feature sensitivity and deep learning

Publications (2)

Publication Number Publication Date
CN111062973A true CN111062973A (en) 2020-04-24
CN111062973B CN111062973B (en) 2021-01-01

Family

ID=70305372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911408023.5A Active CN111062973B (en) 2019-12-31 2019-12-31 Vehicle tracking method based on target feature sensitivity and deep learning

Country Status (1)

Country Link
CN (1) CN111062973B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344932A (en) * 2021-06-01 2021-09-03 电子科技大学 Semi-supervised single-target video segmentation method
CN113920250A (en) * 2021-10-21 2022-01-11 广东三维家信息科技有限公司 Household code matching method and device
US11403069B2 (en) 2017-07-24 2022-08-02 Tesla, Inc. Accelerated mathematical engine
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11487288B2 (en) 2017-03-23 2022-11-01 Tesla, Inc. Data synthesis for autonomous control systems
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11665108B2 (en) 2018-10-25 2023-05-30 Tesla, Inc. QoS manager for system on a chip communications
US11681649B2 (en) 2017-07-24 2023-06-20 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US11734562B2 (en) 2018-06-20 2023-08-22 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11748620B2 (en) 2019-02-01 2023-09-05 Tesla, Inc. Generating ground truth for machine learning from time series elements
US12014553B2 (en) 2019-02-01 2024-06-18 Tesla, Inc. Predicting three-dimensional features for autonomous driving

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
SG11202103493QA (en) 2018-10-11 2021-05-28 Tesla Inc Systems and methods for training machine models with augmented data
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171112A (en) * 2017-12-01 2018-06-15 西安电子科技大学 Vehicle identification and tracking based on convolutional neural networks
CN108280808A (en) * 2017-12-15 2018-07-13 西安电子科技大学 The method for tracking target of correlation filter is exported based on structuring
CN110210551A (en) * 2019-05-28 2019-09-06 北京工业大学 A kind of visual target tracking method based on adaptive main body sensitivity
CN110473231A (en) * 2019-08-20 2019-11-19 南京航空航天大学 A kind of method for tracking target of the twin full convolutional network with anticipation formula study more new strategy
US20190370553A1 (en) * 2018-05-09 2019-12-05 Wizr Llc Filtering of false positives using an object size model

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171112A (en) * 2017-12-01 2018-06-15 西安电子科技大学 Vehicle identification and tracking based on convolutional neural networks
CN108280808A (en) * 2017-12-15 2018-07-13 西安电子科技大学 The method for tracking target of correlation filter is exported based on structuring
US20190370553A1 (en) * 2018-05-09 2019-12-05 Wizr Llc Filtering of false positives using an object size model
CN110210551A (en) * 2019-05-28 2019-09-06 北京工业大学 A kind of visual target tracking method based on adaptive main body sensitivity
CN110473231A (en) * 2019-08-20 2019-11-19 南京航空航天大学 A kind of method for tracking target of the twin full convolutional network with anticipation formula study more new strategy

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Peixia Li, et al.: "Deep visual tracking: review and experimental comparison", Pattern Recognition *
林晓翠: "Research on vehicle detection based on deep learning", Wanfang Dissertation Full-text Database *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11487288B2 (en) 2017-03-23 2022-11-01 Tesla, Inc. Data synthesis for autonomous control systems
US11403069B2 (en) 2017-07-24 2022-08-02 Tesla, Inc. Accelerated mathematical engine
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11681649B2 (en) 2017-07-24 2023-06-20 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11734562B2 (en) 2018-06-20 2023-08-22 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11983630B2 (en) 2018-09-03 2024-05-14 Tesla, Inc. Neural networks for embedded devices
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
US11665108B2 (en) 2018-10-25 2023-05-30 Tesla, Inc. QoS manager for system on a chip communications
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11908171B2 (en) 2018-12-04 2024-02-20 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US11748620B2 (en) 2019-02-01 2023-09-05 Tesla, Inc. Generating ground truth for machine learning from time series elements
US12014553B2 (en) 2019-02-01 2024-06-18 Tesla, Inc. Predicting three-dimensional features for autonomous driving
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
CN113344932A (en) * 2021-06-01 2021-09-03 电子科技大学 Semi-supervised single-target video segmentation method
CN113920250B (en) * 2021-10-21 2023-05-23 广东三维家信息科技有限公司 House type code matching method and device
CN113920250A (en) * 2021-10-21 2022-01-11 广东三维家信息科技有限公司 Household code matching method and device

Also Published As

Publication number Publication date
CN111062973B (en) 2021-01-01

Similar Documents

Publication Publication Date Title
CN111062973B (en) Vehicle tracking method based on target feature sensitivity and deep learning
Wang et al. Adaptive DropBlock-enhanced generative adversarial networks for hyperspectral image classification
CN108665481B (en) Self-adaptive anti-blocking infrared target tracking method based on multi-layer depth feature fusion
Mukherjee et al. A comparative experimental study of image feature detectors and descriptors
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
CN110084249A (en) The image significance detection method paid attention to based on pyramid feature
CN108304873A (en) Object detection method based on high-resolution optical satellite remote-sensing image and its system
CN108346159A (en) A kind of visual target tracking method based on tracking-study-detection
CN107146240A (en) The video target tracking method of taking photo by plane detected based on correlation filtering and conspicuousness
CN106780485A (en) SAR image change detection based on super-pixel segmentation and feature learning
CN102495998B (en) Static object detection method based on visual selective attention computation module
Shahab et al. How salient is scene text?
CN106682641A (en) Pedestrian identification method based on image with FHOG- LBPH feature
CN108257151A (en) PCANet image change detection methods based on significance analysis
CN110991547A (en) Image significance detection method based on multi-feature optimal fusion
CN112329771B (en) Deep learning-based building material sample identification method
CN109635726A (en) A kind of landslide identification method based on the symmetrical multiple dimensioned pond of depth network integration
CN107230219A (en) A kind of target person in monocular robot is found and follower method
CN104537381A (en) Blurred image identification method based on blurred invariant feature
CN108876776B (en) Classification model generation method, fundus image classification method and device
Yun et al. Part-level convolutional neural networks for pedestrian detection using saliency and boundary box alignment
CN110570450B (en) Target tracking method based on cascade context-aware framework
CN111127407B (en) Fourier transform-based style migration forged image detection device and method
CN115661754B (en) Pedestrian re-recognition method based on dimension fusion attention
CN106022310A (en) HTG-HOG (histograms of temporal gradient and histograms of oriented gradient) and STG (scale of temporal gradient) feature-based human body behavior recognition method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant