CN109102521A - Video target tracking method based on parallel attention correlation filtering - Google Patents
Video target tracking method based on parallel attention correlation filtering
- Publication number
- CN109102521A (application CN201810647331.2A)
- Authority
- CN
- China
- Prior art keywords
- target
- tracking
- weight
- function
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/262—Analysis of motion using transform domain methods, e.g. Fourier domain methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
- G06T2207/20056—Discrete and fast Fourier transform, [DFT, FFT]
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
The present invention discloses a video target tracking method based on parallel attention correlation filtering, belonging to the technical field of image processing. The tracking problem is formulated as estimating the probability of a target position; spatial selective attention (SSA) and appearance selective attention (ASA) are integrated, and an objective function is obtained via a log transform, realizing continuous and effective tracking of a video target. SSA is modeled first: a series of Boolean (binary) maps is generated and filtered to obtain a position response map. Then a series of distractor regions is sampled from the semi-local neighborhood around the tracked target; an anti-interference distance metric is learned within correlation filtering, anti-interference metric regularized correlation filtering is performed, distractors are pushed into the negative domain, and an ASA target map is obtained. Finally, an objective function obtained via the log transform fuses the maps computed on the local and semi-local neighborhoods to track the target. The method is more robust and accurate in handling tracking problems, is adaptable, and achieves good tracking performance.
Description
Technical field
The present invention relates to a video target tracking method based on parallel attention correlation filtering, and belongs to the technical field of image processing.
Background technique
Visual tracking is a prerequisite for several important computer vision applications, such as video surveillance, action recognition, video retrieval, and human-computer interaction. Although visual tracking has made considerable progress in recent years, continuously tracking a generic target in unconstrained environments, given only the target position in the first frame, remains challenging because of disturbing factors such as occlusion of the target's appearance, fast motion, and deformation.
The task of target tracking is to find the target position and judge the target's properties, i.e., the problems of where and what, which are related to the attention selection mechanisms of human visual perception. Evidence from psychology and cognitive science shows that human visual perception is dominant and selective, so that the human visual system can focus on quickly processing the relevant important visual information. There are two main visual attention mechanisms in human visual perception: one is spatial selective attention (SSA), which can shrink the receptive field of a neuron and improve sensitivity to a specific position in the visual field; the other is appearance selective attention (ASA), which enhances responses by specially processing different types of features, thereby enhancing activity in different regions of the cerebral cortex.
After leaving the eyes, the scene signals entering the prefrontal cortex are divided into a dorsal stream and a ventral stream: the former exploits existing spatial relationships (i.e., where), while the latter highlights appearance features (i.e., what). Some perceptual studies have shown that these two types of functions may be processed in parallel, and that these mechanisms can play an important role in handling distractors, blur, and occlusion during target tracking. Using these findings to address the where and what problems in correlation-filtering trackers is of great significance for target tracking in complex environments.
Summary of the invention
The technical problem to be solved by the present invention is that existing target tracking methods cannot continuously track a generic target. A video target tracking method based on parallel attention correlation filtering is proposed, which realizes continuous and effective tracking of a video target by fusing spatial selective attention and appearance selective attention.
To solve the above technical problem, the present invention provides a video target tracking method based on parallel attention correlation filtering. The tracking problem is formulated as estimating the probability of a target position; spatial selective attention (SSA) and appearance selective attention (ASA) are integrated, and an objective function is obtained via a log transform, realizing continuous and effective tracking of a video target. The method comprises the following steps:
(1) Obtain the SSA position response map: first, for the tracked target, a series of binary maps is generated over the local neighborhood around the target to describe the topological structure between the target and its surrounding scene at different granularities. Arranged from top to bottom by description granularity from coarse to fine, this yields a set of target Boolean maps Bi (i = 1, 2, ..., Nb); the coarse-grained Boolean maps encode global shape information and describe the target's appearance changes, while the fine-grained Boolean maps describe the detailed spatial structure. Then a binary filter F is defined for the tracked target and applied to the Boolean maps Bi to obtain conditional position response maps; weight learning is completed by minimizing a linear regression function, learning an optimal weight for each Boolean map and weighting each map to obtain the final position response map.
(2) Obtain the ASA target map: first, a series of distractor regions is sampled from the semi-local neighborhood around the tracked target, and the ridge-regression objective function is approximated as an equivalent metric-learning correlation filter, so that an anti-interference distance metric is learned within correlation filtering, modeling the correlation between positive samples. Then an anti-interference metric regularization term is introduced, and anti-interference metric regularized correlation filtering is performed on the target image: the anti-interference distance metric is learned in correlation filtering while considering the useful correlations from true negative samples, distractors are pushed into the negative domain, and the target tracking map is obtained.
(3) Continuously track the video target: the objective function integrating SSA and ASA is obtained by log-transform modeling; this function is used to track the video target, and the parameters are updated online, achieving effective tracking of the video target.
The specific steps of the video target tracking method based on parallel attention correlation filtering are as follows:
(1) Obtain the SSA position response map
(1.1) For the tracked target, generate a series of binary maps over the local neighborhood around the target by the following formula, describing the topological structure between the target and its surrounding scene at different granularities:
where I(j) denotes the intensity of the j-th pixel, U(·) is a univariate function, R(·) denotes a rounding function, φ(I) is the RGB color-channel map of an image patch, and T denotes transposition.
Arranging the maps from top to bottom by description granularity from coarse to fine yields a set of target Boolean maps Bi (i = 1, 2, ..., Nb): the coarse-grained Boolean maps encode global shape information and describe the target's appearance changes, while the fine-grained Boolean maps describe the detailed spatial structure.
(1.2) Weight learning: following the conventional approach, a binary filter F is defined for the tracked target and applied to the target Boolean maps Bi obtained in step (1.1), yielding a set of conditional position response maps. Weight learning is completed by minimizing the following linear regression function, learning an optimal weight for each Boolean map and weighting each map to obtain the final set of position response maps P(Bi, F | I ∈ Ωo):
where Ωo is the region of the scene in which the target appears, Ωb is the background region of the scene, dw and dh are the width and height of the feature, βk is the classifier parameter vector of the k-th frame, the remaining two counts are the numbers of non-blank pixels in the target and background regions, and βk is a weight coefficient to be optimized. The weight coefficients are updated online as βt = (1 − η)βt−1 + η β̂t to adapt to the target's appearance changes over time, where βt is the updated weight-coefficient vector, η is the fusion coefficient, and β̂t is the weight-coefficient vector of the current frame.
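The Boolean-map step above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the patent's implementation: thresholding a single channel at evenly spaced levels, the circular FFT correlation standing in for the patent's binary filter F, and all function names are assumptions.

```python
import numpy as np

def boolean_maps(channel_map, n_maps=8):
    """Threshold one color channel at n_maps evenly spaced levels, producing
    Boolean maps ordered from coarse (low threshold, large foreground) to
    fine (high threshold, small detailed region)."""
    thresholds = 255.0 * np.arange(n_maps) / n_maps  # theta_i = 255*(i-1)/N_b
    return np.stack([(channel_map >= t).astype(np.float64) for t in thresholds])

def ssa_response(bmaps, filt, weights):
    """Correlate every Boolean map with a (binary) filter via the FFT and
    combine the conditional response maps with per-map weights beta_i."""
    h, w = bmaps.shape[1:]
    f_hat = np.conj(np.fft.fft2(filt, s=(h, w)))  # correlation = conjugate in frequency
    resp = np.zeros((h, w))
    for b, beta in zip(bmaps, weights):
        resp += beta * np.real(np.fft.ifft2(np.fft.fft2(b) * f_hat))
    return resp
```

In the patent the weights are learned by the linear regression above and updated online; uniform weights, as in the sketch, would correspond to plain Boolean-map summation.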
(2) Obtain the ASA target map
(2.1) Sample a series of distractor regions from the semi-local neighborhood around the tracked target, and approximate the following ridge-regression objective function as an equivalent metric-learning correlation filter, learning an anti-interference distance metric within correlation filtering:
where Xi is the i-th sample matrix, x̂ is the DFT of the vector x, x̂iT is the i-th row of X̂, wi is the correlation-filter weight corresponding to the i-th sample matrix Xi, w is the vector composed of all wi, y is the Gaussian label, dw′ and dh′ are the width and height of the feature matrix, λ is the regularization coefficient, and ||·||M denotes the Mahalanobis distance.
(2.2) An anti-interference metric regularization term is introduced into the correlation-filtering objective function, giving the anti-interference metric regularized correlation-filtering model. With this model, anti-interference metric regularized correlation filtering is further performed on the target image obtained in step (2.1), strengthening the discrimination and tracking of target features; the filtered distractors are pushed into the negative domain, and the positive-space target tracking map P(Xi, wi | I ∈ Ωo) is obtained:
where ŵk is the k-th subvector of the anti-interference metric regularized correlation-filtering weight, x̂k is the k-th subvector of the overall sample vector, ŷk is the k-th subvector of the Gaussian label vector, and wi is the weight vector corresponding to the i-th circulant sample matrix, updated online as wt = (1 − η)wt−1 + η ŵt, where ŵt is the tracking result of frame t obtained by the inverse FFT, (·)H denotes the conjugate transpose, I is the identity matrix, λ is the regularization coefficient, and η is the fusion coefficient.
The anti-interference regularization term is defined as:
where xi is the i-th sample vector, xmk is the m-th circulant sample of the k-th base sample, xnk is the n-th circulant sample of the k-th base sample, and wmn is a sample-difference weight measuring the dissimilarity between samples m and n: the larger the weight, the greater the difference between the samples, and the more discriminative the learned appearance features.
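The intent of step (2) — fit a filter on the target while actively driving its responses on sampled distractor regions into the negative domain — can be sketched as a regularized ridge regression. The closed form below is an assumption standing in for the patent's metric-regularized model (whose formula images are not reproduced in the text); gamma, the distractor label −1, and all names are illustrative.

```python
import numpy as np

def anti_interference_filter(X, y, distractors, lam=1e-3, gamma=0.5):
    """Ridge regression of filter w on target samples X with labels y, plus a
    penalty that pushes the filter's response on each sampled distractor
    region toward -1 (the 'negative domain')."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y
    for Xd in distractors:
        yd = -np.ones(Xd.shape[0])   # distractor target response: negative
        A += gamma * (Xd.T @ Xd)
        b += gamma * (Xd.T @ yd)
    return np.linalg.solve(A, b)
```

Compared with plain ridge regression, the extra term lowers the filter's response on distractor-like inputs, which is the discriminative effect the patent attributes to the anti-interference regularizer.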
(3) Continuously track the video target
The SSA and ASA maps are integrated by log-transform modeling to obtain the following objective function:
where P(Bi, F | I ∈ Ωo) denotes the obtained SSA position response map, {Bi} denotes a series of Nb-channel Boolean maps, F denotes the Boolean-map filter, P(Xi, wi | I ∈ Ωo) denotes the obtained ASA target map, * denotes a spatial correlation operation, βi denotes a weight coefficient to be optimized, e(·) denotes the exponential function, Ωo ∈ R2 denotes the target region, o denotes the target appearing in the scene, {Xi} denotes a series of Nx circulant matrices (each obtained by cyclically shifting a base HOG feature-channel vector xi; all feature channels are independently distributed), and wi denotes the ASA filter.
Using this objective function, the video target is tracked and the parameters are updated online, achieving effective tracking of the target.
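After the log transform, multiplying the two probability maps becomes summing their logarithms, and the target position is the argmax of the fused score. A minimal sketch (the min-shift normalization and the eps floor are assumptions added for numerical safety, not part of the patent's formula):

```python
import numpy as np

def fuse_and_localize(p_ssa, p_asa, eps=1e-12):
    """Sum the logs of the normalized SSA and ASA maps -- equivalent to
    multiplying the two probability maps -- and return the argmax position."""
    def normalize(p):
        p = p - p.min()
        return p / (p.sum() + eps)
    score = np.log(normalize(p_ssa) + eps) + np.log(normalize(p_asa) + eps)
    return np.unravel_index(np.argmax(score), score.shape)
```

Because the log is monotone, a location must score well under both attention maps to win, which is exactly the parallel where/what fusion the objective expresses.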
The value of the regularization coefficient λ is 0.001, and the value of the fusion coefficient η is 0.006.
The principle of the present invention is as follows:
The core of the invention is to formulate the tracking problem as estimating the probability of a target position, seamlessly integrating SSA and ASA:
Here Ωo ∈ R2 denotes the target region and o denotes the target appearing in the scene; {Bi} denotes a series of Nb-channel Boolean maps; {Xi} is a series of Nx circulant matrices, each obtained by cyclically shifting a base HOG feature-channel vector xi; F and wi are their corresponding filters. In addition, for simplicity, all feature channels are assumed to be independently distributed.
Finally, taking the logarithm of both sides of formula (1) gives:
Here P(Bi, F | I ∈ Ωo) and P(Xi, wi | I ∈ Ωo) are defined as:
Here * is a spatial correlation operation, βi is a weight coefficient to be optimized, and e(·) is the exponential function.
In modeling SSA, the present invention first generates a series of binary maps, i.e., a Boolean map representation (BMR), describing the topological structure between the target and its surrounding scene at different granularities. In Fig. 2, from top to bottom, the description granularity of the Boolean maps goes from coarse to fine: the coarse-grained Boolean maps encode global shape information and are robust to large appearance changes of the target, while the fine-grained ones describe the spatial structural details and are effective for accurate target localization. A predefined binary filter is then applied to these maps, yielding a set of conditional position response maps, each of which is weighted to obtain the final position response map; the goal is to learn an optimal weight for each Boolean map.
BMR is inspired by recent studies of human visual attention, which show that a momentary awareness of a scene can be represented by a set of Boolean maps. Specifically, given φ(I), the RGB color-channel map of an image patch, the corresponding Boolean map Bi is obtained by element-wise thresholding, where the threshold θi is drawn independently from [0, 255] (yielding a black-and-white binary map) and ≥ denotes the element-wise inequality. For simplicity, the thresholds are set as θi = 255(i − 1)/Nb, sampled between 0 and 255 with a fixed step δ = 255/Nb, because fixed-step sampling is equivalent to uniform sampling in the limit δ → 0. It is therefore straightforward to prove the corresponding expectation identity, and the j-th pixel intensity I(j) can be expressed in terms of the Boolean maps, where U(·) is a univariate step function, e.g., U(2) = [1; 1; 0] and U(3) = [1; 1; 1] with 3 discrete levels, R(·) denotes a rounding function, and φ(I) is the RGB color-channel map of an image patch.
For weight learning, the present invention learns the weights by minimizing the following linear regression function:
Here ||·||F denotes the Frobenius norm. Clearly, the minimization in formula (6) is equivalent to minimizing the following objective function:
in which formula (5) has been substituted. Ωo and Ωb denote the target and background regions, respectively. Setting the derivative to zero and minimizing yields the solution {βi}.
To adapt to the target's appearance changes over time, the coefficients are updated online as βt = (1 − η)βt−1 + η β̂t (9), where β̂t is calculated by formula (8) using the tracking result in frame t.
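The online update in formula (9) is a linear interpolation between the previous model and the per-frame estimate. A sketch, using the description's fusion coefficient η = 0.006 as the default learning rate:

```python
import numpy as np

def online_update(prev, current, eta=0.006):
    """beta^t = (1 - eta) * beta^(t-1) + eta * beta_hat^t: a small fusion
    coefficient makes the model adapt slowly to appearance changes."""
    return (1.0 - eta) * np.asarray(prev) + eta * np.asarray(current)
```

Repeated updates toward a fixed per-frame estimate converge to it geometrically, with the residual shrinking by a factor (1 − η) each frame, which is why a small η trades adaptation speed for stability against drift.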
To handle distractors, the present invention draws on the ASA principle of human visual perception and focuses on learning appearance features: an anti-interference distance metric is learned that shifts distractors into the negative space, enhancing the discriminative power of the features, so that tracking remains robust in the presence of distractors and the target can be distinguished from them well. Learning the correlation filter is first approximated as learning a distance metric, which models the correlation between positive samples; an anti-interference distance metric is then learned in correlation filtering while taking into account the useful correlations from true negative samples.
For distance-metric learning, the learned CF is expressed as a spatial ridge-regression objective function:
Here y is a Gaussian regression target and λ is a regularization coefficient. Note that if w is rescaled to aw for any a ≠ 0, formula (10) can be re-regularized by reshaping y with the ratio 1/a; this is equivalent to formula (10) and, since the peak response position is unchanged, produces the same tracking result.
Based on this, to show the relationship between correlation-filter learning and metric learning clearly, the norm of w is fixed in formula (10) and y is reshaped accordingly, which is equivalent to adding the constraint inside the objective. Next, denoting the i-th row of X̂ by x̂iT, the data term in formula (10) can be rewritten as:
Here ||·||M is the Mahalanobis distance and 1 is a vector whose entries are all one. Therefore, learning a correlation filter can essentially be regarded as learning an optimal distance metric.
However, formula (11) considers only the relationships between positive samples, which limits its discriminative power to distinguish the target from the background. To solve this problem, an anti-interference metric regularization term, composed of negative-space relationships, is added to formula (10); it acts as a force that pushes distractors into the negative space.
For anti-interference metric regularized correlation filtering, a series of distractor regions is first sampled from the semi-local neighborhood around the target; the interactions between them are then modeled and integrated into formula (10) as a regularization term:
Here γ is a regularization coefficient and wmn is a weight measuring the dissimilarity between samples m and n: the larger the weight, the greater the difference between the samples, and the more discriminative the learned appearance features.
Formula (12) can be reformulated as:
The minimizer of this objective is obtained by setting the derivative to zero:
Here the system matrix is a block matrix with Nx × Nx blocks:
Because a circulant matrix X satisfies X = F diag(x̂) FH (16), where F denotes the discrete Fourier transform (DFT) matrix, x̂ denotes the DFT of the base vector x, and FH = (F*)T denotes the conjugate transpose, formula (15) can be diagonalized as:
In addition, substituting (16) into (14), the right-hand side can be re-regularized as:
Substituting formulas (17) and (18) into formula (14) yields the FFT of its solution:
Similar to formula (9), the filter is obtained by online updating, wt = (1 − η)wt−1 + η ŵt, where ŵt is calculated by formula (19) and is the tracking result of frame t.
The block matrix is defined blockwise; because its numbers of rows and columns are both dw′dh′Nx, directly computing its inverse in formula (19) is impractical. Instead, the inverse is computed via a permutation transformation; once all sub-blocks are obtained, they can be computed in parallel, and the optimal solution w of formula (14) is obtained by taking the inverse FFT of ŵ.
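The diagonalization argument above is the standard circulant-matrix identity used in correlation-filter trackers: because every cyclic shift of the base vector appears as a row, the ridge-regression normal equations decouple into independent per-frequency scalar equations. A single-channel sketch comparing the dense solve with the Fourier solve (no anti-interference term; names and the toy sizes are illustrative):

```python
import numpy as np

def circulant(x):
    """Circulant data matrix whose i-th row is x cyclically shifted by i."""
    return np.stack([np.roll(x, i) for i in range(len(x))])

def ridge_dense(x, y, lam):
    """Direct O(n^3) solve of (X^T X + lam*I) w = X^T y."""
    X = circulant(x)
    return np.linalg.solve(X.T @ X + lam * np.eye(len(x)), X.T @ y)

def ridge_fourier(x, y, lam):
    """Same solution in O(n log n): in the Fourier basis the normal
    equations diagonalize, giving w_hat = x_hat * y_hat / (|x_hat|^2 + lam)."""
    x_hat, y_hat = np.fft.fft(x), np.fft.fft(y)
    return np.real(np.fft.ifft(x_hat * y_hat / (np.abs(x_hat) ** 2 + lam)))
```

The two solvers agree to machine precision; the per-frequency division is what makes the parallel, blockwise computation described above practical.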
Based on human visual perception, the present invention proposes a correlation-filtering tracking algorithm that reflects the SSA and ASA mechanisms of human visual perception, enhancing the robustness and anti-interference ability of target tracking by processing the local and semi-local background neighborhoods in parallel. For the local neighborhood, to model SSA, a simple but effective BMR is introduced into correlation-filter learning; it characterizes the local topology of the target and its scene through randomized binary color-channel maps and is invariant to various transformations. For the semi-local neighborhood, to model ASA, an anti-interference metric regularization term is introduced into the objective function of correlation filtering; acting as a force that pushes distractors into the negative domain, it enhances tracking robustness when challenging target-like distractors are encountered. The method is more robust and accurate in handling tracking problems, is adaptable, and achieves good tracking performance, realizing continuous and effective tracking of a video target.
Detailed description of the invention
Fig. 1 is a schematic diagram of the principle of the present invention.
Fig. 2 is a flow chart of the SSA modeling of the present invention.
Specific embodiment
The specific embodiments of the present invention are further described in detail below with reference to the accompanying drawings; techniques or products not specified in the embodiments are conventional products of the prior art or are commercially available.
Embodiment 1: As shown in Figs. 1 and 2, in the video target tracking method based on parallel attention correlation filtering, the tracking problem is formulated as estimating the probability of a target position; spatial selective attention (SSA) and appearance selective attention (ASA) are integrated, and an objective function is obtained via a log transform, realizing continuous and effective tracking of a video target. The method comprises the following steps:
(1) Obtain the SSA position response map: first, for the tracked target, a series of binary maps is generated over the local neighborhood around the target to describe the topological structure between the target and its surrounding scene at different granularities. Arranged from top to bottom by description granularity from coarse to fine, this yields a set of target Boolean maps Bi (i = 1, 2, ..., Nb); the coarse-grained Boolean maps encode global shape information and describe the target's appearance changes, while the fine-grained Boolean maps describe the detailed spatial structure. Then a binary filter F is defined for the tracked target and applied to the Boolean maps Bi to obtain conditional position response maps; weight learning is completed by minimizing a linear regression function, learning an optimal weight for each Boolean map and weighting each map to obtain the final position response map.
(2) Obtain the ASA target map: first, a series of distractor regions is sampled from the semi-local neighborhood around the tracked target, and the ridge-regression objective function is approximated as an equivalent metric-learning correlation filter, so that an anti-interference distance metric is learned within correlation filtering, modeling the correlation between positive samples. Then an anti-interference metric regularization term is introduced, and anti-interference metric regularized correlation filtering is performed on the target image: the anti-interference distance metric is learned in correlation filtering while considering the useful correlations from true negative samples, distractors are pushed into the negative domain, and the target tracking map is obtained.
(3) Continuously track the video target: the objective function integrating SSA and ASA is obtained by log-transform modeling; this function is used to track the video target, and the parameters are updated online, achieving effective tracking of the video target.
The specific steps of the video target tracking method based on parallel attention correlation filtering are as follows:
(1) Obtain the SSA position response map
(1.1) For the tracked target, generate a series of binary maps over the local neighborhood around the target by the following formula, describing the topological structure between the target and its surrounding scene at different granularities:
where I(j) denotes the intensity of the j-th pixel, U(·) is a univariate function, R(·) denotes a rounding function, φ(I) is the RGB color-channel map of an image patch, and T denotes transposition.
Arranging the maps from top to bottom by description granularity from coarse to fine yields a set of target Boolean maps Bi (i = 1, 2, ..., Nb): the coarse-grained Boolean maps encode global shape information and describe the target's appearance changes, while the fine-grained Boolean maps describe the detailed spatial structure.
(1.2) Weight learning: following the conventional approach, a binary filter F is defined for the tracked target and applied to the target Boolean maps Bi obtained in step (1.1), yielding a set of conditional position response maps. Weight learning is completed by minimizing the following linear regression function, learning an optimal weight for each Boolean map and weighting each map to obtain the final set of position response maps P(Bi, F | I ∈ Ωo):
where Ωo is the region of the scene in which the target appears, Ωb is the background region of the scene, dw and dh are the width and height of the feature, βk is the classifier parameter vector of the k-th frame, the remaining two counts are the numbers of non-blank pixels in the target and background regions, and βk is a weight coefficient to be optimized. The weight coefficients are updated online as βt = (1 − η)βt−1 + η β̂t to adapt to the target's appearance changes over time, where βt is the updated weight-coefficient vector, η is the fusion coefficient, and β̂t is the weight-coefficient vector of the current frame.
(2) Obtain the ASA target map
(2.1) Sample a series of distractor regions from the semi-local neighborhood around the tracked target, and approximate the following ridge-regression objective function as an equivalent metric-learning correlation filter, learning an anti-interference distance metric within correlation filtering:
where Xi is the i-th sample matrix, x̂ is the DFT of the vector x, x̂iT is the i-th row of X̂, wi is the correlation-filter weight corresponding to the i-th sample matrix Xi, w is the vector composed of all wi, y is the Gaussian label, dw′ and dh′ are the width and height of the feature matrix, λ is the regularization coefficient, and ||·||M denotes the Mahalanobis distance.
(2.2) An anti-interference metric regularization term is introduced into the correlation-filtering objective function, giving the anti-interference metric regularized correlation-filtering model. With this model, anti-interference metric regularized correlation filtering is further performed on the target image obtained in step (2.1), strengthening the discrimination and tracking of target features; the filtered distractors are pushed into the negative domain, and the positive-space target tracking map P(Xi, wi | I ∈ Ωo) is obtained:
where ŵk is the k-th subvector of the anti-interference metric regularized correlation-filtering model, x̂k is the k-th subvector of the overall sample vector, ŷk is the k-th subvector of the Gaussian label vector, and wi is the weight vector corresponding to the i-th circulant sample matrix, updated online as wt = (1 − η)wt−1 + η ŵt, where ŵt is the tracking result of frame t obtained by the inverse FFT, (·)H denotes the conjugate transpose, I is the identity matrix, λ is the regularization coefficient, and η is the fusion coefficient.
The anti-interference regularization term is defined as:
where xi is the i-th sample vector, xmk is the m-th circulant sample of the k-th base sample, xnk is the n-th circulant sample of the k-th base sample, and wmn is a sample-difference weight measuring the dissimilarity between samples m and n: the larger the weight, the greater the difference between the samples, and the more discriminative the learned appearance features.
(3) Continuously track the video target
The SSA and ASA maps are integrated by log-transform modeling to obtain the following objective function:
where P(Bi, F | I ∈ Ωo) denotes the obtained SSA position response map, {Bi} denotes a series of Nb-channel Boolean maps, F denotes the Boolean-map filter, P(Xi, wi | I ∈ Ωo) denotes the obtained ASA target map, * denotes a spatial correlation operation, βi denotes a weight coefficient to be optimized, e(·) denotes the exponential function, Ωo ∈ R2 denotes the target region, o denotes the target appearing in the scene, {Xi} denotes a series of Nx circulant matrices (each obtained by cyclically shifting a base HOG feature-channel vector xi; all feature channels are independently distributed), and wi denotes the ASA filter.
Using this objective function, the video target is tracked and the parameters are updated online, achieving effective tracking of the target.
In this example, the regularization coefficient λ = 0.001 and the fusion coefficient η = 0.3.
Embodiment 2: as shown in Figure 1, 2, the video target tracking method based on parallel attention correlation filtering is will be with
Track problem is designed as the probability of one target position of estimation, integrates spatial choice attention SSA and apparent Selective attention power ASA,
Objective function is obtained using Log function, realizes the continuous and effective tracking of video object, comprising the following steps:
(1) SSA position response figure is obtained: firstly, generating a series of binary map for tracking target to describe different grains
Lower topological structure between target and its surrounding's scene is spent, picture is pressed to description granularity arrangement from thick to thin from top to bottom,
Obtain one group of tracking target Boolean Graphs Bi, it is apparent that coarseness Boolean Graphs carry out the apparent target of Coding and description to global shape information
Variation, fine-grained Boolean Graphs describe the detailed structure in space;Then, for tracking one two value filter F of object definition,
F is acted on into Boolean Graphs BiOn, condition position response diagram is obtained, and complete study weight by minimizing linear regression function,
Learn an optimal weight for each Boolean Graphs, the position response figure for weighting to the end to each figure:
(2) obtain ASA target figure: half local field first around tracking target samples a series of interference regions, by ridge
Regressive object approximation to function is equivalent to the correlation filter of a metric learning, learns anti-interference distance in associated video filtering
Measurement solves the correlation between modeling positive sample;Then anti-interference measurement regular terms is introduced, is resisted to through target image
Interference metric canonical correlation filtering learns anti-interference distance metric in correlation filtering, while considering from true negative sample
Distracter is pushed into negative domain, obtains target following picture by useful correlation:
(3) Continuously tracking the video target: the objective function integrating SSA and ASA is obtained by Log-function modeling; this function is used to track the video target, with parameters updated online, realizing effective tracking of the video target.
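A minimal sketch of the fusion and online-update machinery of step (3). The additive log model and the linear-interpolation update are assumed forms, since the patent's Log objective and update rule survive only as figure placeholders:

```python
import numpy as np

def fuse_log(p_ssa, p_asa, eps=1e-8):
    # Log-domain fusion of the SSA position response and the ASA target
    # map: summing log-responses multiplies the two cues, so the peak of
    # the fused map needs support from both attention branches.
    return np.log(p_ssa + eps) + np.log(p_asa + eps)

def online_update(prev, current, eta=0.3):
    # Linear-interpolation online parameter update with fusion
    # coefficient eta (an assumed form of the per-frame update).
    return (1.0 - eta) * prev + eta * current
```

With eta = 0.3 as in claim 3, each frame's parameters blend 70% of the previous state with 30% of the current estimate.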
The specific steps of this example are the same as those of Embodiment 1, with regularization coefficient λ = 0.001 and fusion coefficient η = 0.3.
The technical content of the present invention has been described above in conjunction with the accompanying drawings, but the protection scope of the present invention is not limited to the described content. Within the knowledge of one of ordinary skill in the art, various changes may be made to the technical content of the present invention without departing from the inventive concept; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
Claims (3)
1. A video target tracking method based on parallel attention correlation filtering, characterized in that: the tracking problem is designed as estimating the probability of a target position, spatial selective attention (SSA) and appearance selective attention (ASA) are integrated, and the objective function is obtained using a Log function, realizing continuous and effective tracking of the video target, comprising the following steps:
(1) Obtaining the SSA position response map: first, in the local neighborhood around the tracking target, a series of binary maps is generated to describe the topological structure between the target and its surrounding scene at different granularities; the maps are arranged from top to bottom by description granularity from coarse to fine, yielding a set of tracking-target Boolean maps B_i (i = 1, 2, ..., N_b), where the coarse-grained Boolean maps encode global shape information to describe appearance changes of the target and the fine-grained Boolean maps describe detailed spatial structure; then, a binary filter F is defined for the tracking target and applied to the Boolean maps B_i to obtain conditional position response maps, weight learning is completed by minimizing a linear regression function so that an optimal weight is learned for each Boolean map, and the weighted maps are combined into the final position response map;
(2) Obtaining the ASA target map: first, a series of distractor regions is sampled from the semi-local neighborhood around the tracking target, and the ridge-regression objective function is approximated as a correlation filter with metric learning, so that an anti-interference distance metric is learned within the correlation filtering; then, an anti-interference metric regularization term is introduced, anti-interference metric-regularized correlation filtering is applied to the target image, and distractors are pushed into the negative domain, obtaining the target tracking map;
(3) Continuously tracking the video target: the objective function integrating SSA and ASA is obtained by Log-function modeling; this function is used to track the video target, with parameters updated online, realizing effective tracking of the video target.
2. The video target tracking method based on parallel attention correlation filtering according to claim 1, characterized in that the specific steps of the video target tracking method are as follows:
(1) Obtaining the SSA position response map
(1.1) For the tracking target, in the local neighborhood around the tracking target, a series of binary maps is generated by the following formula to describe the topological structure between the target and its surrounding scene at different granularities:
Wherein, I(j) denotes the intensity of the j-th image pixel, U(·) is a univariate function, R(·) denotes a rounding function, the RGB color-channel map of an image block appears in the formula, and T denotes transposition;
The maps are arranged from top to bottom by description granularity from coarse to fine, yielding a set of tracking-target Boolean maps B_i (i = 1, 2, ..., N_b), where the coarse-grained Boolean maps encode global shape information to describe appearance changes of the target and the fine-grained Boolean maps describe detailed spatial structure;
(1.2) Weight learning: a binary filter F is defined for the tracking target and applied to the tracking-target Boolean maps B_i obtained in step (1.1), yielding a set of conditional position response maps; weight learning is then completed by minimizing the following linear regression function, so that an optimal weight is learned for each Boolean map and each map is weighted to obtain a set of final position response maps P(B_i, F | I ∈ Ω_o):
Wherein, Ω_o is the region of the scene in which the target appears, Ω_b is the background region of the scene, d_w and d_h are the width and height of the feature, the classifier parameter vector of the k-th frame, the number of non-blank pixels in the target region, and the number of non-blank pixels in the background region are as denoted in the formula, β_k is a weight coefficient to be optimized and is updated online, β_t is the updated weight coefficient vector, η is the fusion coefficient, and the weight coefficient vector of the current frame enters the update;
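A minimal sketch of the weight-learning of step (1.2): each Boolean map's response is treated as one regressor column and the weights β are obtained in closed form. The design-matrix layout is an assumption, since the minimized regression function appears only as a figure in the patent:

```python
import numpy as np

def learn_map_weights(responses, label, lam=1e-3):
    # Ridge regression for one weight per Boolean-map response:
    # minimise ||A @ beta - y||^2 + lam * ||beta||^2, where column i of A
    # is the flattened conditional response of Boolean map B_i and y is
    # the flattened desired position response.
    a = np.stack([r.ravel() for r in responses], axis=1)  # (pixels, N_b)
    y = label.ravel()
    return np.linalg.solve(a.T @ a + lam * np.eye(a.shape[1]), a.T @ y)
```

A map whose response already matches the label receives a weight near 1, while an uninformative (noise) map is driven toward 0.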
(2) Obtaining the ASA target map
(2.1) A series of distractor regions is sampled from the semi-local neighborhood around the tracking target, and the following ridge-regression objective function is approximated as a correlation filter with metric learning, so that an anti-interference distance metric is learned within the correlation filtering:
Wherein, X_i is the sample matrix, the DFT of the vector x is taken with x_i its i-th row, w_i is the filter weight corresponding to the i-th sample matrix X_i, w is the vector composed of all the w_i, y is the Gaussian label, d_w′ and d_h′ are respectively the width and height of the feature matrix, λ is the regularization coefficient, and the Mahalanobis distance is used as the metric;
(2.2) An anti-interference metric regularization term is introduced into the correlation filtering objective function to obtain an anti-interference metric-regularized correlation filtering model; this model applies further anti-interference metric-regularized correlation filtering to the target image obtained in step (2.1), strengthening the discrimination and tracking of target features, and the filtered distractors are pushed into the negative domain, obtaining the positive-space target tracking map P(X_i, w_i | I ∈ Ω_o):
Wherein, the k-th subvector of the anti-interference metric-regularized correlation filtering weight, the k-th subvector of the total sample vector, and the k-th subvector of the Gaussian label vector are as denoted in the formula, w_i is the weight vector corresponding to the i-th circulant sample matrix and is obtained by online updating, the tracking result of frame t is obtained by taking the inverse FFT, the conjugate transpose is taken as denoted, I is the identity matrix, λ is the regularization coefficient, and η is the fusion coefficient;
The distance is defined as:
Wherein, x_i is the i-th sample vector, the m-th and n-th circulant samples of the k-th base sample appear in the formula, and w_mn is the inter-sample difference weight;
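The learned distance of step (2) is a Mahalanobis-type metric. A minimal sketch of its form (with M here a fixed matrix for illustration; in the method M would be learned so that distractors land far from the target):

```python
import numpy as np

def mahalanobis(x, z, m):
    # d_M(x, z) = sqrt((x - z)^T M (x - z)); with M = I it reduces to
    # the Euclidean distance. Learning M reshapes the feature space so
    # that distractor samples are pushed into the negative domain.
    d = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.sqrt(d @ m @ d))
```

For example, M = diag(4, 1) doubles distances along the first feature axis relative to the Euclidean case.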
(3) Continuously tracking the video target
By Log-function modeling, the SSA and ASA maps are integrated to obtain the following objective function:
Wherein, P(B_i, F | I ∈ Ω_o) denotes the obtained SSA position response map, the Boolean maps form a series of N_b channels, F denotes the Boolean-map filter, P(X_i, w_i | I ∈ Ω_o) denotes the obtained ASA target map, * denotes a spatial correlation operation, β_i denotes a weight coefficient to be optimized, e^(·) denotes the exponential function, Ω_o ∈ R² denotes the target region, o denotes the target appearing in the scene, a series of N_x circulant matrices is formed (each obtained by cyclically shifting a base HOG feature-channel vector, with all feature channels independently distributed), and w denotes the ASA filter;
Using this objective function, the video target is tracked and the parameters are updated online, realizing effective tracking of the target.
3. The video target tracking method based on parallel attention correlation filtering according to claim 2, characterized in that the value of the regularization coefficient λ is 0.001 and the value of the fusion coefficient η is 0.3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810647331.2A CN109102521B (en) | 2018-06-22 | 2018-06-22 | Video target tracking method based on parallel attention-dependent filtering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109102521A true CN109102521A (en) | 2018-12-28 |
CN109102521B CN109102521B (en) | 2021-08-27 |
Family
ID=64844863
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105809713A (en) * | 2016-03-03 | 2016-07-27 | 南京信息工程大学 | Object tracing method based on online Fisher discrimination mechanism to enhance characteristic selection |
Non-Patent Citations (3)
Title |
---|
QINGSHAN LIU et al.: "Visual Tracking via Nonlocal Similarity Learning", IEEE Transactions on Circuits and Systems for Video Technology |
ZHETAO LI et al.: "Visual Tracking With Weighted Adaptive Local Sparse Appearance Model via Spatio-Temporal Context Learning", IEEE Transactions on Image Processing |
FAN Jiaqing et al.: "Real-time visual tracking algorithm via channel stability weighted complementary learning", Journal of Computer Applications |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109919114A (en) * | 2019-03-14 | 2019-06-21 | 浙江大学 | One kind is based on the decoded video presentation method of complementary attention mechanism cyclic convolution |
CN109993777A (en) * | 2019-04-04 | 2019-07-09 | 杭州电子科技大学 | A kind of method for tracking target and system based on double-template adaptive threshold |
CN110102050A (en) * | 2019-04-30 | 2019-08-09 | 腾讯科技(深圳)有限公司 | Virtual objects display methods, device, electronic equipment and storage medium |
US11615570B2 (en) | 2019-04-30 | 2023-03-28 | Tencent Technology (Shenzhen) Company Limited | Virtual object display method and apparatus, electronic device, and storage medium |
CN110335290A (en) * | 2019-06-04 | 2019-10-15 | 大连理工大学 | Twin candidate region based on attention mechanism generates network target tracking method |
CN110443852B (en) * | 2019-08-07 | 2022-03-01 | 腾讯科技(深圳)有限公司 | Image positioning method and related device |
CN110443852A (en) * | 2019-08-07 | 2019-11-12 | 腾讯科技(深圳)有限公司 | A kind of method and relevant apparatus of framing |
CN110807437A (en) * | 2019-11-08 | 2020-02-18 | 腾讯科技(深圳)有限公司 | Video granularity characteristic determination method and device and computer-readable storage medium |
CN110807437B (en) * | 2019-11-08 | 2023-01-03 | 腾讯科技(深圳)有限公司 | Video granularity characteristic determination method and device and computer-readable storage medium |
CN112085765A (en) * | 2020-09-15 | 2020-12-15 | 浙江理工大学 | Video target tracking method combining particle filtering and metric learning |
CN112085765B (en) * | 2020-09-15 | 2024-05-31 | 浙江理工大学 | Video target tracking method combining particle filtering and metric learning |
CN113704684A (en) * | 2021-07-27 | 2021-11-26 | 浙江工商大学 | Centralized fusion robust filtering method |
CN113704684B (en) * | 2021-07-27 | 2023-08-29 | 浙江工商大学 | Centralized fusion robust filtering method |
CN113808171A (en) * | 2021-09-27 | 2021-12-17 | 山东工商学院 | Unmanned aerial vehicle visual tracking method based on dynamic feature selection of feature weight pool |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||