CN113902773A - Long-term target tracking method using double detectors - Google Patents

Long-term target tracking method using double detectors

Info

Publication number
CN113902773A
Authority
CN
China
Prior art keywords
target
image
tracking
filter
detector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111119613.3A
Other languages
Chinese (zh)
Inventor
胡昭华
李奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202111119613.3A priority Critical patent/CN113902773A/en
Publication of CN113902773A publication Critical patent/CN113902773A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion
    • G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/23 - Clustering techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G06N 20/10 - Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30241 - Trajectory

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a long-term target tracking method using dual detectors, which comprises the following steps: extracting the fHOG features of an image containing the target and its background, inputting them into a pre-trained initial filter to calculate the maximum response value over that image, and taking the position of the maximum response value as the target position in the current image; updating this filter every frame; cutting out an image containing only the target at the estimated position, extracting its HOG features, inputting them into a second pre-trained filter, and calculating the maximum response value of the target image; when the maximum response value of the target image is less than a threshold, tracking is considered lost and a re-detection module is started to re-detect the target at the predicted position in the current picture; when it is greater than the threshold, tracking is considered correct and the second filter is updated. The invention re-detects the target with a dual-detector scheme: when the target is lost, the current search area can be examined several times, first by a fast detector and then by a deep detector, which increases the success rate of detection.

Description

Long-term target tracking method using double detectors
Technical Field
The invention relates to a long-term target tracking method using double detectors, belonging to the technical field of computer vision and image processing.
Background
Object tracking, one of the important research branches in the field of computer vision, has made great progress in the last decade. It is widely applied in fields such as medical treatment, intelligent transportation and autonomous driving. Its working principle is that the initial state of the target (namely its position and size in the image) is given only in the first frame of the video, and the computer is required to estimate the state of the target in the subsequent video sequence.
The target tracking technology currently has two main development directions: correlation filtering and deep learning. In recent years, target tracking algorithms based on correlation filtering have improved greatly, and many researchers have built on the MOSSE algorithm. For example, Henriques et al. (Henriques J F, Caseiro R, Martins P, et al. High-speed tracking with kernelized correlation filters [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596) proposed the kernelized correlation filter (KCF), which represents the target with multi-channel histogram of oriented gradients (HOG) features and uses a kernel function to handle linearly inseparable data, greatly improving the tracking speed. However, because the training samples of correlation filtering algorithms are all generated by cyclic shifts, the boundary effect inevitably degrades the tracking result. Danelljan et al. (Martin D, Gustav H, Fahad S K, et al. Learning spatially regularized correlation filters for visual tracking [C] // Proceedings of the IEEE International Conference on Computer Vision. Santiago, Chile: IEEE Press, 2015: 4310-) introduced a spatial regularization term that penalizes filter coefficients outside the target region, thereby alleviating the boundary effect. However, this optimization increases the complexity of the algorithm and greatly reduces the tracking speed; moreover, a fixed negative-Gaussian-shaped matrix is used as the spatial regularization weight, so when errors occur during tracking the tracker cannot respond flexibly, which affects its performance. For long-term tracking, Ma et al. (Ma C, Yang X K, Zhang C Y, et al. Long-term correlation tracking [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 5388-) added an online re-detection module to the correlation filtering framework so that the target can be recovered after tracking failure.
Disclosure of Invention
The invention aims to provide a long-term target tracking method using dual detectors, in order to overcome the defect that, when a target is occluded for a long time and then reappears, the tracker cannot re-identify it and tracking fails.
A method of long term object tracking using dual detectors, the method comprising the steps of:
extracting the fHOG features of the image containing the target and its background, inputting them into the pre-trained initial filter F1, calculating the maximum response value over the image containing the target background, and taking the position of the maximum response value as the target position in the current image; the filter F1 is updated every frame;
cutting out the image containing only the target at the target position in the current image, extracting its HOG features, inputting them into the pre-trained filter F2, and calculating the maximum response value of the target image;
comparing the maximum response value of the target image with a preset threshold: when the maximum response value of the target image is less than the threshold, tracking is considered lost and a re-detection module is started to re-detect the target at the predicted position in the current picture; when it is greater than the threshold, tracking is considered correct and the filter F2 is updated.
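To make the above flow concrete, the following minimal sketch (Python) lays out the per-frame decision logic under stated assumptions: the objects f1, f2 and det and their methods locate, update, max_response, redetect and update_svm are hypothetical placeholders standing in for the filter F1, the filter F2 and the re-detection module, and the two thresholds correspond to the update and re-detection thresholds described later.

```python
def track_frame(frame, f1, f2, det, T_a, T_b):
    """One step of the tracking loop described above (illustrative sketch).
    f1, f2 and det are hypothetical objects standing in for the position
    filter F1, the confidence filter F2 and the dual re-detection module."""
    # 1) Locate the target with filter F1 (F1 is updated every frame).
    pos = f1.locate(frame)
    f1.update(frame, pos)

    # 2) Score the target-only patch at the predicted position with F2.
    max_R = f2.max_response(frame, pos)

    # 3) Compare the confidence score against the two thresholds.
    if max_R < T_b:
        # Tracking lost: run the dual detectors (SVM detector first,
        # then the deep twin-network detector) around the predicted position.
        pos = det.redetect(frame, pos)
    elif max_R > T_a:
        # Tracking reliable: refresh F2 and the SVM detector with
        # samples drawn around the predicted target.
        f2.update(frame, pos)
        det.update_svm(frame, pos)
    # Between the two thresholds, nothing is updated and nothing re-detected.
    return pos
```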
Further, the objective function for solving the filter F1 is:

E(F, w) = (1/2)‖ Σ_{k=1}^{K} x_k ∗ f_k − y ‖² + (λ1/2) Σ_{k=1}^{K} ‖ w ⊙ f_k ‖² + (λ2/2) ‖ w − w_r ‖²

where the filter F = [f_1, f_2, ..., f_K]; the first term is a ridge regression term, K denotes the total number of channels, x_k denotes the k-th channel of the sample features, f_k denotes the corresponding k-th filter, and y is the desired response; in the second term, w is the adaptive spatial regularization weight, which is updated from the information of the current frame as each frame is tracked; in the third term, a prior reference weight w_r of w is introduced to prevent model degradation; λ1 and λ2 are regularization parameters.
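For illustration only, the following NumPy sketch evaluates a simplified, single-sample, spatial-domain version of such an objective (circular correlation per channel plus the two regularization terms); the function name, array shapes and random test data are assumptions made for the sketch, not the patent's implementation.

```python
import numpy as np

def objective(x, f, y, w, w_r, lam1, lam2):
    """Evaluate a simplified version of the objective:
       0.5 * || sum_k x_k * f_k - y ||^2
     + 0.5 * lam1 * sum_k || w . f_k ||^2
     + 0.5 * lam2 * || w - w_r ||^2
    x, f : (K, H, W) feature channels and per-channel filters
    y    : (H, W) desired response
    w    : (H, W) adaptive spatial regularization weight
    w_r  : (H, W) prior reference weight
    """
    K = x.shape[0]
    # Circular correlation per channel, computed in the Fourier domain.
    resp = np.zeros(y.shape, dtype=np.complex128)
    for k in range(K):
        resp += np.fft.ifft2(np.fft.fft2(x[k]) * np.conj(np.fft.fft2(f[k])))
    data_term = 0.5 * np.linalg.norm(resp.real - y) ** 2
    reg_term = 0.5 * lam1 * sum(np.linalg.norm(w * f[k]) ** 2 for k in range(K))
    prior_term = 0.5 * lam2 * np.linalg.norm(w - w_r) ** 2
    return data_term + reg_term + prior_term

# Tiny usage example with random data.
K, H, W = 3, 16, 16
rng = np.random.default_rng(0)
val = objective(rng.normal(size=(K, H, W)), rng.normal(size=(K, H, W)),
                rng.normal(size=(H, W)), np.ones((H, W)), np.ones((H, W)),
                lam1=1e-2, lam2=1e-1)
print(val)
```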
Further, the re-detection module comprises a support vector machine detector and a deep twin network detector; the support vector machine detector draws dense training samples around the tracked target position and scale and incrementally trains the classifier by assigning positive and negative labels to the samples according to their overlap rate with the target.
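A minimal sketch of how such positive and negative labels could be assigned to densely drawn samples is given below; the 0.7/0.3 overlap thresholds and the box format (x, y, w, h) are illustrative assumptions, since the text only states that labels are assigned according to the overlap rate with the target.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax1 + aw, bx1 + bw), min(ay1 + ah, by1 + bh)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def label_samples(sample_boxes, target_box, pos_thr=0.7, neg_thr=0.3):
    """Assign +1 / -1 labels to densely drawn samples by their overlap
    with the tracked target; samples in between are discarded.
    The 0.7 / 0.3 thresholds are illustrative assumptions."""
    labeled = []
    for box in sample_boxes:
        o = iou(box, target_box)
        if o >= pos_thr:
            labeled.append((box, +1))
        elif o <= neg_thr:
            labeled.append((box, -1))
    return labeled

# Usage with a few samples drawn around a target box.
target = (50, 40, 30, 60)
samples = [(48, 38, 30, 60), (120, 40, 30, 60), (55, 45, 30, 60)]
print(label_samples(samples, target))
```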
Further, the training method of the support vector machine detector comprises the following steps:
when N samples are collected in one frame of image, the training set is {(v_i, c_i), i = 1, 2, ..., N}, where v_i is the feature vector of the i-th sample and the class label c_i ∈ {−1, +1};
let the loss function of the hyperplane h be l(h_i) = max{0, 1 − c_i⟨h_i, v_i⟩}, where ⟨h, v⟩ denotes the inner product of h and v;
an objective function over the hyperplane h is then formed from this loss and minimized online, where ∇_h l(h_i) denotes the gradient of the loss function with respect to the hyperplane h and τ ∈ (0, +∞) is a hyperparameter that controls the update rate of h.
Further, the support vector machine detector updates the hyperplane parameters with an online passive-aggressive learning algorithm, the hyperplane being computed as:

h_{i+1} = h_i − τ∇_h l(h_i)

where ∇_h l(h_i) is the gradient of the loss function with respect to the hyperplane h and τ ∈ (0, +∞) is the hyperparameter that controls the update rate of h.
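The following sketch shows one reading of this update: a hinge-loss gradient step with a fixed update-rate hyperparameter τ, where the "passive" branch leaves h unchanged when the sample is already classified with sufficient margin. It is an illustrative interpretation, not the patent's exact procedure.

```python
import numpy as np

def hinge_loss(h, v, c):
    """l(h) = max{0, 1 - c * <h, v>} for one labeled sample (v, c)."""
    return max(0.0, 1.0 - c * float(np.dot(h, v)))

def pa_update(h, v, c, tau):
    """One online update of the hyperplane h on sample (v, c).
    If the hinge loss is zero the sample is already classified with
    sufficient margin and h is left unchanged; otherwise h moves
    against the loss gradient, scaled by tau."""
    if hinge_loss(h, v, c) == 0.0:
        return h
    grad = -c * v            # gradient of the hinge loss w.r.t. h
    return h - tau * grad

# Usage: one update on a small toy sample.
h = np.zeros(4)
v = np.array([0.5, -1.0, 0.2, 0.0])
h = pa_update(h, v, c=+1, tau=0.5)
print(h)
```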
Further, the training method of the deep twin network detector comprises the following steps:
training the detector from the target information given in the first frame picture, namely classifying target and background patches with the k-means clustering method to form a target template pool;
obtaining a plurality of candidate regions, respectively calculating Euclidean distance between each candidate region and a target template through a twin network to be used as matching similarity, and selecting a region with the highest similarity as a tracking target; the similarity calculation formula is as follows:
ŝ = argmax_{s_i ∈ S} υ(p, s_i),  S = {s_1, s_2, ..., s_N}

where ŝ is the candidate region with the highest similarity score, S denotes the sample set of the N candidate regions, and υ(p, s_i) denotes the similarity score between the tracking target p and the i-th candidate sample s_i.
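The selection step can be sketched as follows, assuming the similarity υ(p, s_i) is taken as the negative Euclidean distance between twin-network embeddings (consistent with the text, since a smaller distance means a higher similarity); the embedding dimension and the random test data are placeholders.

```python
import numpy as np

def select_candidate(target_embedding, candidate_embeddings):
    """Pick the candidate region most similar to the target template.
    Similarity is taken as the negative Euclidean distance between the
    twin-network embeddings (assumption: smaller distance = higher similarity)."""
    dists = np.linalg.norm(candidate_embeddings - target_embedding, axis=1)
    scores = -dists                      # upsilon(p, s_i) for each candidate
    best = int(np.argmax(scores))        # region with the highest similarity
    return best, scores

# Usage with random embeddings for N = 5 candidates.
rng = np.random.default_rng(1)
p = rng.normal(size=128)
S = rng.normal(size=(5, 128))
idx, scores = select_candidate(p, S)
print(idx, scores)
```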
Further, the method for comparing the maximum response value of the target image with the preset thresholds comprises the following steps:
the maximum response value max_R is compared with the preset update threshold T_a and re-detection threshold T_b;
if max_R < T_b, tracking has failed; the detector part is activated, and the support vector machine detector and the deep twin network detector re-detect the target at the predicted position in the current picture;
if max_R > T_a, tracking is successful; the support vector machine detector is updated with dense samples drawn at the predicted target, and the filter F2 of the target information processing part is updated.
Compared with the prior art, the invention has the following beneficial effects:
the invention adopts a double-detector system to detect the target again, and can carry out multiple detection on the current search area in a quick detection and depth detection mode when the target is lost, thereby increasing the success rate of detection;
in the invention, a mode of self-adaptive spatial regularization weight is adopted, target information is fully highlighted, a target background is obviously inhibited, the influence of a boundary effect is reduced, and a more robust filter can be obtained;
in the optimization solution of the objective function, the objective function is optimized by adopting an alternating direction multiplier method, so that the calculated amount can be effectively reduced, and the running speed of the model is accelerated.
Drawings
FIG. 1 is an overall frame diagram of the present invention;
FIG. 2 is a schematic diagram of a re-detection module according to the present invention;
FIG. 3 is a comparison of long-term tracking effectiveness of the present invention;
FIG. 4 is a comparison graph of detector effectiveness tracking results of the present invention;
FIG. 5 is a graph comparing the overall performance of the present invention tracking on a data set;
FIG. 6 is a comparison of the tracking characteristic results on a data set of the present invention;
fig. 7 is a sample frame of an actual tracking result of the present invention.
Detailed Description
In order to make the technical means, the creation characteristics, the achievement purposes and the effects of the invention easy to understand, the invention is further described with the specific embodiments.
As shown in fig. 1-7, a long-term object tracking method using dual detectors is disclosed, the method comprising the steps of:
the method comprises the following steps: initializing, before performing target tracking, model initialization is performed according to target information given in the first frame,
First, the fHOG features of the image containing the target and its background are extracted to train the initial filter F1 of the target tracking part, which is used to calculate the target position in the subsequent tracking process and is updated every frame to adapt to changes of the target. Then the HOG features of the image containing only the target are extracted to train the filter F2, which is used to judge the current tracking state during tracking; this filter is updated only when the current tracking result is reliable, ensuring that it does not accumulate noise that would corrupt the judgment. Finally, the target image is used to train the support vector machine detector and the deep twin network detector, so that the target can be effectively recovered and the tracker corrected when the target is lost, improving robustness during long-term tracking.
Step two: calculating the target position. From the second frame on, the target tracking part extracts the features of the target search area of the current frame, calculates the response values, and takes the position of the maximum response value as the target position in the current image; at the same time, the filter F1 is updated every frame.
When solving the filter, an adaptive spatial regularization term is added to suppress the boundary effect and highlight the target information, and the objective function is optimized with the alternating direction method of multipliers so that the result is obtained more efficiently.
The objective function for solving the filter F1 is:

E(F, w) = (1/2)‖ Σ_{k=1}^{K} x_k ∗ f_k − y ‖² + (λ1/2) Σ_{k=1}^{K} ‖ w ⊙ f_k ‖² + (λ2/2) ‖ w − w_r ‖²    (1)

where the filter F = [f_1, f_2, ..., f_K]. The first term is a ridge regression term, K denotes the total number of channels, x_k denotes the k-th channel of the sample features, f_k denotes the corresponding k-th filter, and y is the desired response. In the second term, w is the adaptive spatial regularization weight, which is updated from the information of the current frame as each frame is tracked. In the third term, a prior reference weight w_r of w is introduced to prevent model degradation. λ1 and λ2 are regularization parameters.
Because the objective function has no closed-form solution, the computationally efficient alternating direction method of multipliers (ADMM) is adopted for the optimization. In formula (1), the second term is the added adaptive spatial regularization term; during the optimization of the objective function, a single ADMM pass is used to solve the spatial weight of the current frame once, realizing spatial-weight adaptation, so that the current filter has large weights at the target position and small weights in non-target areas, thereby suppressing the background.
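As a small illustration of the alternating-direction idea (not the patent's frequency-domain solver), the following sketch applies ADMM to a simplified spatially weighted ridge problem with the splitting f = g, so that one subproblem handles the data term and the other the weighted regularizer; all names, sizes and parameter values are assumptions for the sketch.

```python
import numpy as np

def admm_spatially_weighted_ridge(A, y, w, lam, mu=1.0, iters=50):
    """Illustrative ADMM for  min_f 0.5*||A f - y||^2 + 0.5*lam*||w . f||^2
    using the splitting f = g (f handles the data term, g the weighted
    regularizer).  A generic small dense-matrix sketch of the
    alternating-direction idea only."""
    n = A.shape[1]
    f = np.zeros(n)
    g = np.zeros(n)
    u = np.zeros(n)                       # scaled dual variable
    AtA = A.T @ A
    Aty = A.T @ y
    for _ in range(iters):
        # f-subproblem: (A^T A + mu I) f = A^T y + mu (g - u)
        f = np.linalg.solve(AtA + mu * np.eye(n), Aty + mu * (g - u))
        # g-subproblem: elementwise, since w acts as a diagonal weight
        g = mu * (f + u) / (lam * w ** 2 + mu)
        # dual update
        u = u + f - g
    return f

# Usage on a tiny random problem.
rng = np.random.default_rng(2)
A = rng.normal(size=(30, 10))
y = rng.normal(size=30)
w = np.linspace(0.5, 2.0, 10)
f = admm_spatially_weighted_ridge(A, y, w, lam=0.1)
print(np.round(f, 3))
```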
Step three: judging a tracking state; and determining whether the detector needs to be started or target information needs to be updated according to the current tracking state. The target information processing part cuts out an image only containing the target at the current position by using the estimated target position obtained in the step two, extracts the characteristics of the image, finally calculates the response value of the target image, obtains the current tracking state by comparing the magnitude relation between the maximum response value and the relevant threshold value, and executes different operations according to the difference of the tracking states:
if the maximum response value is less than TbAnd starting the detector to detect the target again: if the maximum response value is larger than TaThe SVM detector is updated to ensure that the latest and most reliable target information is stored in the detector. The method can detect the target again when the target is lost, and can update the target to adapt to the change of the target under the condition of reliable tracking.
Step four: re-detection. The invention mainly addresses long-term tracking; if the target is lost during tracking, the current tracking would fail, so a re-detection module is added that can re-detect the target and correct the tracker, ensuring that tracking continues smoothly.
in the third step, if the current estimated target position is not reliable enough, the detector is started to detect again, so that the algorithm execution efficiency is improved. The re-detection module is mainly composed of a support vector machine detector (Det)1) With depth twin network detector (Det)2) Two parts, during operation, through the cooperation of the two partsThe target is accurately and efficiently detected.
The support vector machine detector draws dense training samples around the tracked target position and scale, and incrementally trains the classifier by assigning positive and negative labels to the samples according to their overlap rate with the target. During tracking, this detector can quickly make a detection judgment on the target area of the current picture and is suitable for relatively simple picture information.
Assume that N samples are collected in one frame of image, so that the training set is {(v_i, c_i), i = 1, 2, ..., N}, where v_i is the feature vector of the i-th sample and the class label c_i ∈ {−1, +1}. Let the loss function of the hyperplane h be l(h_i) = max{0, 1 − c_i⟨h_i, v_i⟩}, where ⟨h, v⟩ denotes the inner product of h and v. An objective function over the hyperplane h is then formed from this loss.
In order to solve for the hyperplane more efficiently during subsequent updates of the detector, the invention uses an online passive-aggressive learning algorithm to update the hyperplane parameters of the classifier:

h_{i+1} = h_i − τ∇_h l(h_i)

where ∇_h l(h_i) is the gradient of the loss function with respect to the hyperplane h and τ ∈ (0, +∞) is the hyperparameter that controls the update rate of h.
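Detection with Det1 then amounts to scoring densely sampled candidate windows with the learned hyperplane; a minimal sketch is given below, where the candidate feature matrix is assumed to have been extracted upstream and the bias term is an illustrative addition.

```python
import numpy as np

def svm_detect(candidate_features, h, bias=0.0):
    """Score every densely sampled candidate window with the learned
    hyperplane h (a plain linear decision function <h, v> + b) and return
    the index and score of the best-scoring window."""
    scores = candidate_features @ h + bias
    best = int(np.argmax(scores))
    return best, scores[best]

# Usage: 100 candidate windows with 31-dimensional features.
rng = np.random.default_rng(3)
V = rng.normal(size=(100, 31))
h = rng.normal(size=31)
idx, score = svm_detect(V, h)
print(idx, round(float(score), 3))
```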
When the background information is more complicated, the detection performance of the support vector machine detector Det1 drops noticeably, and the twin network detector Det2 is enabled instead. Feature extraction and tracking based on deep learning can express rich target features and therefore remain robust in the face of complex picture information, but the complex network structure also greatly reduces the tracking speed. The invention therefore uses a fast and accurate twin network structure and extracts deep features of the image with the VGGNet convolutional neural network, so that clear target features can be extracted while the tracking speed is maintained. Since the detector needs to detect the target, an additional target template pool is added to the network structure; it processes the multiple candidate target regions generated from the picture and selects the region closest to the template branch as the target region. The candidate regions are selected in a square area centred at the position obtained by the target tracking part, whose side length is determined jointly by the estimated target width w and height h in the current frame and a weight coefficient ρ that controls the size of the region.
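A small sketch of cropping such a search region is shown below; the side length ρ·sqrt(w·h) and the default value of ρ are assumptions standing in for the region-size formula, which the text only characterizes through w, h and ρ.

```python
import numpy as np

def crop_search_region(frame, center, w, h, rho=2.5):
    """Crop a square search region centred on the position predicted by
    the tracking part.  The side length rho * sqrt(w * h) and the default
    rho value are assumptions; rho controls how much context is included."""
    side = int(round(rho * np.sqrt(w * h)))
    cx, cy = center
    x1 = max(0, int(cx - side // 2))
    y1 = max(0, int(cy - side // 2))
    x2 = min(frame.shape[1], x1 + side)
    y2 = min(frame.shape[0], y1 + side)
    return frame[y1:y2, x1:x2]

# Usage on a dummy grayscale frame.
frame = np.zeros((480, 640), dtype=np.uint8)
patch = crop_search_region(frame, center=(320, 240), w=60, h=40)
print(patch.shape)
```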
The twin network detector Det2 is trained from the target information given in the first frame picture, namely the k-means clustering method is used to classify target and background patches into a target template pool. In the subsequent tracking process, after a plurality of candidate regions have been obtained, the Euclidean distance between each candidate region and the target template is calculated through the twin network as the matching similarity, and the region with the highest similarity is selected as the tracking target:

ŝ = argmax_{s_i ∈ S} υ(p, s_i),  S = {s_1, s_2, ..., s_N}

where ŝ is the candidate region with the highest similarity score, S denotes the sample set of the N candidate regions, and υ(p, s_i) denotes the similarity score between the tracking target p and the i-th candidate sample s_i.
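Building the target template pool by clustering can be sketched as follows; the use of scikit-learn's KMeans, the number of templates and the embedding dimension are illustrative choices, since the text only states that k-means clustering separates target and background templates.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_template_pool(patch_embeddings, n_templates=5):
    """Cluster embeddings of target / background patches from the first
    frame with k-means and keep the cluster centres as the template pool.
    n_templates and the use of scikit-learn's KMeans are illustrative."""
    km = KMeans(n_clusters=n_templates, n_init=10, random_state=0)
    km.fit(patch_embeddings)
    return km.cluster_centers_

# Usage: 200 first-frame patch embeddings of dimension 128.
rng = np.random.default_rng(4)
pool = build_template_pool(rng.normal(size=(200, 128)))
print(pool.shape)   # (5, 128)
```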
In this embodiment, as shown in fig. 1, the model can be roughly divided into three parts: a target tracking section, a target information processing section and a detector section.
When processing the first frame image, the target tracking part extracts the features of the image containing the background and of the target-only image to train the initial filter F1 and the filter F2, respectively.
In the target information processing part, the tracking state is judged through the maximum response value, and the simpler histogram of oriented gradients (HOG) feature is used; in the target tracking part, an improved gradient orientation histogram feature, namely the 31-dimensional fHOG feature, is used, which better reflects the edge information of the target image and the local appearance and shape of the image.
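For the plain HOG feature used by the target information processing part, a minimal sketch based on scikit-image is shown below; the cell and block sizes are illustrative, and the 31-channel fHOG variant used by the tracking part is not covered here because it requires a dedicated implementation.

```python
import numpy as np
from skimage.feature import hog

def hog_feature(patch):
    """Plain histogram-of-oriented-gradients descriptor for a grayscale
    patch, as used to score the target-only image with filter F2.
    Cell and block sizes are illustrative choices."""
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

# Usage on a dummy 64x64 patch.
patch = np.random.default_rng(5).random((64, 64))
print(hog_feature(patch).shape)
```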
Fig. 2 shows the specific composition of the detector section, i.e. the section mainly consists of a support vector machine detector and a twin network detector, denoted Det1 and Det2, respectively. The specific implementation of each part in the algorithm is as follows:
a target tracking section: firstly, a sliding window containing background information is cut out according to given target position information in a first frame, fHOG characteristics are extracted, and a filter F is trained1. And starting from the second frame, extracting features again according to the target position obtained in the previous frame, calculating a response graph, and predicting the position of the target according to the position of the maximum response value. Solving filter F1The target function of (2) is shown in formula (1); it is noted that a picture is represented numerically as a matrix, where each value in the matrix is a pixel. The extraction of picture features and the calculation of the response map are both matrix calculations. The resulting matrix of response values we call the response map.
Because the spatial regularization term added in formula (1) prevents a closed-form solution, the formula is iteratively optimized with the alternating direction method of multipliers to obtain the optimal solution and, finally, the latest filter. The spatial regularization weight added in formula (1) effectively highlights the target information and reduces the influence of the boundary effect; a single optimization step is performed on it during the solving process so that the spatial weight adapts itself.
The target information processing section: in this section, the target image is first cut out from the first frame, its HOG features are extracted, and the filter F2 is trained. From the second frame on, the target image of the current frame is cut out according to the position predicted by the target tracking part, its feature information is extracted, and the maximum response value max_R is calculated. The maximum response value is compared with the preset thresholds T_a (update threshold) and T_b (re-detection threshold). The action of the detector is determined by the relation between the maximum response value obtained by the target information processing part and these thresholds: if max_R < T_b, the detector parts Det1 and Det2 are activated to re-detect at the predicted position of the current picture; if max_R > T_a, Det1 is updated with dense samples at the predicted target and the filter F2 of the target information processing section is updated.
A detector section: in processing the first frame of image, the object and its surroundings are first densely sampled to obtain a large number of positive and negative samples about the object for training the initial detector Det1, and the detector Det2 is trained using the object image.
The re-detection module re-detects the target when the tracker fails during tracking. Det1 is updated with fresh positive and negative samples whenever the tracking reliability is high, ensuring that the detector keeps an up-to-date target state; when the tracking reliability is low, it detects the target in the current frame. Because Det2 uses a deep network, its detection result is more accurate but its operation takes more time. Weighing these factors, the method uses the faster Det1, whose detection results are sufficiently accurate, as the primary detector, and starts Det2 to re-detect the target when the result shows that detection has failed.
After the detector detects a new target position, whether the tracking is successful is judged by comparing the magnitude relation between the maximum response value at the position and a set threshold value, if so, the target position obtained by the detector is adopted, and the subsequent tracking is continued. In the tracking process, the problem of target re-retrieval under various target backgrounds can be effectively solved through the cooperative work of the double detectors, and the robustness of long-term tracking is effectively improved.
Evaluation criteria: experiments were performed with the OTB-2015 data set, which contains 100 video sequences, each containing several challenge factors, including illumination change, target deformation, motion blur, fast motion, in-plane rotation, out-of-plane rotation, target out of view, background clutter and low resolution. The performance of the algorithm was evaluated with OPE (one-pass evaluation), comparing the present invention (Ours) with several other advanced trackers (SRDCF, SiamFC, PTAV, UDT, CACF, CFNet). Fig. 3 shows the results of an experiment on the 16 longest video sequences in OTB-2015, each containing more than 1000 frames of images.
It can be seen from the figure that the tracking performance of the invention is ideal and the effect is improved significantly when facing long video sequences.
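For reference, the two OPE scores reported in such comparisons can be computed as follows: precision counts frames whose centre location error is below a pixel threshold (20 pixels is the usual OTB convention) and success counts frames whose overlap with the ground truth exceeds an IoU threshold; the box format (x, y, w, h) is assumed.

```python
import numpy as np

def center_error(pred, gt):
    """Euclidean distance between predicted and ground-truth box centres;
    boxes are (x, y, w, h)."""
    pc = np.array([pred[0] + pred[2] / 2, pred[1] + pred[3] / 2])
    gc = np.array([gt[0] + gt[2] / 2, gt[1] + gt[3] / 2])
    return float(np.linalg.norm(pc - gc))

def overlap(pred, gt):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    x1, y1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    x2 = min(pred[0] + pred[2], gt[0] + gt[2])
    y2 = min(pred[1] + pred[3], gt[1] + gt[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    return inter / union if union > 0 else 0.0

def ope_scores(preds, gts, dist_thr=20.0, iou_thr=0.5):
    """Precision (centre error <= dist_thr pixels) and success
    (IoU >= iou_thr) over one sequence, as in standard OTB evaluation."""
    errs = [center_error(p, g) for p, g in zip(preds, gts)]
    ious = [overlap(p, g) for p, g in zip(preds, gts)]
    precision = float(np.mean([e <= dist_thr for e in errs]))
    success = float(np.mean([o >= iou_thr for o in ious]))
    return precision, success
```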
Fig. 4 compares the actual tracking results of the algorithm without the detectors (Ours_ND) and with the detectors (Ours); the data in the figure show that the re-detection module added in the invention can effectively recover the target when it is lost, improving tracking robustness.
Fig. 5 is a comparison graph of the overall performance of the present invention on a data set with other algorithms, and it can be seen that the present invention can achieve the best results in both tracking accuracy and tracking success rate.
Fig. 6 is a comparison graph of the results of the present invention and other comparison algorithms on various tracking characteristics, and it can be seen that the present invention can achieve the optimal results on all characteristics.
Fig. 7 is a diagram of an actual tracking result of the present invention and other comparison algorithms, and it can be seen from the diagram that the present invention can track to a target more accurately when facing various challenge factors.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (7)

1. A method for long term object tracking using dual detectors, the method comprising the steps of:
extracting the fHOG features of the image containing the target and its background, inputting them into the pre-trained initial filter F1, calculating the maximum response value over the image containing the target background, and taking the position of the maximum response value as the target position in the current image; the filter F1 is updated every frame;
cutting out the image containing only the target at the target position in the current image, extracting its HOG features, inputting them into the pre-trained filter F2, and calculating the maximum response value of the target image;
comparing the maximum response value of the target image with a preset threshold: when the maximum response value of the target image is less than the threshold, tracking is considered lost and a re-detection module is started to re-detect the target at the predicted position in the current picture; when it is greater than the threshold, tracking is considered correct and the filter F2 is updated.
2. The method according to claim 1, characterized in that the objective function for solving the filter F1 is:

E(F, w) = (1/2)‖ Σ_{k=1}^{K} x_k ∗ f_k − y ‖² + (λ1/2) Σ_{k=1}^{K} ‖ w ⊙ f_k ‖² + (λ2/2) ‖ w − w_r ‖²

where the filter F = [f_1, f_2, ..., f_K]; the first term is a ridge regression term, K denotes the total number of channels, x_k denotes the k-th channel of the sample features, f_k denotes the corresponding k-th filter, and y is the desired response; in the second term, w is the adaptive spatial regularization weight, namely it is updated from the information of the current frame as each frame of image is tracked; in the third term, a prior reference weight w_r of w is introduced to prevent model degradation; λ1 and λ2 are regularization parameters.
3. The method of claim 1, wherein the re-detection module comprises a support vector machine detector and a deep twin network detector; the support vector machine detector draws dense training samples around the tracked target position and scale and incrementally trains the classifier by assigning positive and negative labels to the samples according to their overlap rate with the target.
4. The method of claim 3, wherein the training method of the support vector machine detector comprises the following steps:
when N samples are collected in one frame of image, the training set is {(v_i, c_i), i = 1, 2, ..., N}, where v_i is the feature vector of the i-th sample and the class label c_i ∈ {−1, +1};
letting the loss function of the hyperplane h be l(h_i) = max{0, 1 − c_i⟨h_i, v_i⟩}, where ⟨h, v⟩ denotes the inner product of h and v;
then forming an objective function over the hyperplane h from this loss and minimizing it online, where ∇_h l(h_i) is the gradient of the loss function with respect to the hyperplane h and τ ∈ (0, +∞) is the hyperparameter that controls the update rate of h.
5. The method of claim 4, wherein the support vector machine detector updates the hyperplane parameters with an online passive-aggressive learning algorithm, the hyperplane being computed as:

h_{i+1} = h_i − τ∇_h l(h_i)

where ∇_h l(h_i) is the gradient of the loss function with respect to the hyperplane h and τ ∈ (0, +∞) is the hyperparameter that controls the update rate of h.
6. The method of claim 3, wherein the deep twin network detector training method comprises the steps of:
training the detector from the target information given in the first frame picture, namely classifying target and background patches with the k-means clustering method to form a target template pool;
obtaining a plurality of candidate regions, respectively calculating Euclidean distance between each candidate region and a target template through a twin network to be used as matching similarity, and selecting a region with the highest similarity as a tracking target; the similarity calculation formula is as follows:
ŝ = argmax_{s_i ∈ S} υ(p, s_i),  S = {s_1, s_2, ..., s_N}

where ŝ is the candidate region with the highest similarity score, S denotes the sample set of the N candidate regions, and υ(p, s_i) denotes the similarity score between the tracking target p and the i-th candidate sample s_i.
7. The method of claim 1, wherein the maximum response value of the target image is compared with the preset thresholds as follows:
the maximum response value max_R is compared with the preset update threshold T_a and re-detection threshold T_b;
if max_R < T_b, tracking has failed; the detector part is activated, and the support vector machine detector and the deep twin network detector re-detect the target at the predicted position in the current picture;
if max_R > T_a, tracking is successful; the support vector machine detector is updated with dense samples drawn at the predicted target, and the filter F2 of the target information processing section is updated.
CN202111119613.3A 2021-09-24 2021-09-24 Long-term target tracking method using double detectors Pending CN113902773A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111119613.3A CN113902773A (en) 2021-09-24 2021-09-24 Long-term target tracking method using double detectors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111119613.3A CN113902773A (en) 2021-09-24 2021-09-24 Long-term target tracking method using double detectors

Publications (1)

Publication Number Publication Date
CN113902773A true CN113902773A (en) 2022-01-07

Family

ID=79029210

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111119613.3A Pending CN113902773A (en) 2021-09-24 2021-09-24 Long-term target tracking method using double detectors

Country Status (1)

Country Link
CN (1) CN113902773A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107423702A (en) * 2017-07-20 2017-12-01 西安电子科技大学 Video target tracking method based on TLD tracking systems
CN110211157A (en) * 2019-06-04 2019-09-06 重庆邮电大学 A kind of target long time-tracking method based on correlation filtering
CN110232350A (en) * 2019-06-10 2019-09-13 哈尔滨工程大学 A kind of real-time water surface multiple mobile object detecting and tracking method based on on-line study
CN110490899A (en) * 2019-07-11 2019-11-22 东南大学 A kind of real-time detection method of the deformable construction machinery of combining target tracking
CN111476819A (en) * 2020-03-19 2020-07-31 重庆邮电大学 Long-term target tracking method based on multi-correlation filtering model
CN113033356A (en) * 2021-03-11 2021-06-25 大连海事大学 Scale-adaptive long-term correlation target tracking method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
刘万军; 张壮; 姜文涛; 张晟: "Multi-scale correlation filtering tracking algorithm with occlusion discrimination", Journal of Image and Graphics, no. 12, 16 December 2018 (2018-12-16) *
胡昭华; 韩庆; 李奇: "Correlation filter tracking algorithm based on temporal awareness and adaptive spatial regularization", Acta Optica Sinica, no. 03 *
胡昭华 et al.: "A long-term target tracking method using dual detectors", Microelectronics & Computer, 5 September 2021 (2021-09-05) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114660554A (en) * 2022-05-25 2022-06-24 中国人民解放军空军预警学院 Radar target and interference detection and classification method and system
CN114660554B (en) * 2022-05-25 2022-09-23 中国人民解放军空军预警学院 Radar target and interference detection and classification method and system
CN114897941A (en) * 2022-07-13 2022-08-12 长沙超创电子科技有限公司 Target tracking method based on Transformer and CNN

Similar Documents

Publication Publication Date Title
US10990191B2 (en) Information processing device and method, program and recording medium for identifying a gesture of a person from captured image data
Zhang et al. Online multi-person tracking by tracker hierarchy
Yang et al. Robust superpixel tracking
Wang et al. Superpixel tracking
CN108027972B (en) System and method for object tracking
CN110197502B (en) Multi-target tracking method and system based on identity re-identification
CN108921873B (en) Markov decision-making online multi-target tracking method based on kernel correlation filtering optimization
CN111368683B (en) Face image feature extraction method and face recognition method based on modular constraint CenterFace
CN110120064B (en) Depth-related target tracking algorithm based on mutual reinforcement and multi-attention mechanism learning
CN113327272B (en) Robustness long-time tracking method based on correlation filtering
CN113902773A (en) Long-term target tracking method using double detectors
Jeong et al. Mean shift tracker combined with online learning-based detector and Kalman filtering for real-time tracking
CN111986225A (en) Multi-target tracking method and device based on angular point detection and twin network
CN113537077B (en) Label multiple Bernoulli video multi-target tracking method based on feature pool optimization
CN112785622B (en) Method and device for tracking unmanned captain on water surface and storage medium
CN104637052A (en) Object tracking method based on target guide significance detection
Wang et al. Face tracking using motion-guided dynamic template matching
CN112613565A (en) Anti-occlusion tracking method based on multi-feature fusion and adaptive learning rate updating
CN111862147A (en) Method for tracking multiple vehicles and multiple human targets in video
CN108985216B (en) Pedestrian head detection method based on multivariate logistic regression feature fusion
CN116342653A (en) Target tracking method, system, equipment and medium based on correlation filter
CN113706580B (en) Target tracking method, system, equipment and medium based on relevant filtering tracker
Zhu et al. Visual tracking with dynamic model update and results fusion
Zhang et al. Visual object tracking with saliency refiner and adaptive updating
CN110781769A (en) Method for rapidly detecting and tracking pedestrians

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination