CN109191491B - Target tracking method and system of full convolution twin network based on multi-layer feature fusion - Google Patents
- Publication number
- CN109191491B (application CN201810878152.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- target
- frame
- score
- map
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/223—Analysis of motion using block-matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention discloses a target tracking method and system using a convolution twin (Siamese) network based on multi-layer feature fusion. The method comprises the following steps: according to the target position and size in each image, crop target template images and search-area images for all images in an image-sequence training set, where the image pairs formed by the target template images and the search-area images constitute a training data set; construct a convolution twin network based on multi-layer feature fusion; train the network on the training data set to obtain a well-trained multi-layer feature fusion convolution twin network; and perform target tracking with the trained network. In the process of tracking the target, score maps from different layers are fused, so that high-level semantic features and low-level detail features are combined to better distinguish interference from similar targets, preventing target drift and target loss during tracking.
Description
Technical Field
The invention belongs to the intersection of digital image processing, deep learning and pattern recognition, and particularly relates to a target tracking method and system using a convolution twin (Siamese) network based on multi-layer feature fusion.
Background
Target tracking occupies a very important position in computer vision. However, because of the complexity of natural scenes, the sensitivity of targets to illumination changes, the real-time and robustness requirements of tracking, and factors such as occlusion, posture and scale change, the tracking problem remains difficult. Traditional target tracking methods cannot extract rich features from a target and therefore cannot strictly distinguish the target from the background; tracking drift easily occurs, and the target cannot be tracked for a long time. With the rise of deep learning, a general convolutional neural network can effectively extract rich target features, but it has too many parameters: if online tracking is required, the real-time requirement cannot be met, which limits its practical engineering value.
Owing to improved hardware performance and the popularization of high-performance computing devices such as GPUs (graphics processing units), real-time tracking is no longer an insurmountable problem, and an effective target appearance model is of great importance during tracking. Target tracking is in essence a similarity-measurement process. Because of its special structure, a twin convolutional network has a natural advantage in similarity measurement, and its convolutional structure can extract rich features for target tracking. A purely twin-convolution-based tracker is trained offline and tracks online; although the real-time requirement can be met on high-performance computing equipment, the full-convolution twin network only uses the semantic information extracted by the high layers of the network during tracking, so in complex scenes it cannot distinguish well between the target and a similar background, causing tracking drift and target loss.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to solve the technical problems of tracking drift and target loss caused by interference from similar backgrounds.
In order to achieve the above object, in a first aspect, an embodiment of the present invention provides a target tracking method based on a multilayer feature fusion convolutional twin network, where the method includes the following steps:
(1) according to the target position and size of the image, cutting out target template images and search area images of all images in an image sequence training set, wherein an image pair formed by the target template images and the search area images forms a training data set;
(2) constructing a convolution twin network based on multi-layer feature fusion, wherein the network comprises two identical branch networks: a first branch convolution network used to obtain feature maps of the search-area image, and a second branch convolution network used to obtain feature maps of the target template image; the two branches are connected at designated layers, and cross-correlation is performed between corresponding layers of the target template feature maps and the search-area feature maps to obtain the corresponding score maps;
(3) training the multilayer feature fusion-based convolution twin network based on the training data set to obtain a well-trained multilayer feature fusion-based convolution twin network;
(4) and calculating a score map of an image in the image sequence to be detected by using the trained convolution twin network based on the multi-layer feature fusion, and tracking the target based on the score map.
Specifically, step (1) is as follows. Cropping the target template image: take the target rectangular frame centered on the target area, with the center of the target area as the target position; extend each of the four sides of the target rectangular frame by p pixels; if the extended frame exceeds the image boundary, fill the exceeding part with the image-mean pixel value; finally scale the cropped target image block to 127 × 127. Cropping the search-area image: centered on the target area, extend each of the four sides of the target rectangular frame by 2p pixels; if the extended frame exceeds the image boundary, fill the exceeding part with the image-mean pixel value; finally scale the cropped search-area image block to 255 × 255. Here p = (w + h)/4, where w is the width of the target rectangular frame in pixels and h is its height in pixels.
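As an illustration of the padding arithmetic above, the following sketch computes p and the pre-scaling crop sizes for a toy bounding box (the helper name `crop_sizes` is hypothetical, not from the patent):

```python
def crop_sizes(w, h):
    """Context padding for template and search crops, per p = (w + h) / 4."""
    p = (w + h) / 4.0
    template = (w + 2 * p, h + 2 * p)  # later rescaled to 127 x 127
    search = (w + 4 * p, h + 4 * p)    # later rescaled to 255 x 255
    return p, template, search

# For a 60 x 40 target box: p = 25, template crop 110 x 90, search crop 160 x 140.
p, template, search = crop_sizes(60, 40)
```

Note that the search crop always covers exactly twice the padded extent of the template crop, which is what makes the two fixed output sizes (127 and 255) commensurate.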
Specifically, step (2) includes the following. The search-area image is input into the first branch convolution network: a first-layer feature map SFM_1 is obtained through Conv1, a second-layer feature map SFM_2 through Pool1 and Conv2, and a third-layer feature map SFM_3 through Pool2, Conv3, Conv4 and Conv5. The target template image is input into the second branch convolution network: a first-layer feature map GFM_1 is obtained through Conv1, a second-layer feature map GFM_2 through Pool1 and Conv2, and a third-layer feature map GFM_3 through Pool2, Conv3, Conv4 and Conv5. Cross-correlation is performed between corresponding layers of the target template feature maps and the search-area feature maps, yielding three corresponding score maps SM_1, SM_2 and SM_3 according to the formula:

SM_i = GFM_i * SFM_i

where i = 1, 2, 3 and * denotes the cross-correlation operation.
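The cross-correlation SM_i = GFM_i * SFM_i can be sketched as follows for a single-channel toy example (in the actual network the feature maps are multi-channel and the per-channel products are also summed):

```python
def xcorr2d(template, search):
    """Valid-mode 2-D cross-correlation: slide `template` over `search`
    and take the inner product at each offset (single channel for brevity)."""
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    out = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            row.append(sum(template[a][b] * search[i + a][j + b]
                           for a in range(th) for b in range(tw)))
        out.append(row)
    return out

# A 2x2 template over a 3x3 search patch yields a 2x2 score map.
score = xcorr2d([[1, 0], [0, 1]], [[1, 2, 0], [3, 1, 2], [0, 2, 1]])
```

The output side length is (search − template + 1), which is how the 123/59, 57/25 and 22/6 feature-map pairs later produce 65 × 65, 33 × 33 and 17 × 17 score maps.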
Specifically, the joint loss function L(y, v) constructed in step (3) is calculated as follows:

L(y, v) = α_1·L_1(y, v_1) + α_2·L_2(y, v_2) + α_3·L_3(y, v_3)

L_i(y, v_i) = (1 / |D_i|) · Σ_{u ∈ D_i} l(y[u], v_i[u])

l(y[u], v_i[u]) = log(1 + exp(−y[u] · v_i[u]))

where L_i is the loss function of score map SM_i, l(y[u], v_i[u]) is the logistic loss at each point of SM_i, α_i is the weight of SM_i with 0 < α_1 < α_2 < α_3 ≤ 1, D_i is the domain of score map SM_i, u is a point in the score map with label y[u] = +1 if k_i·‖u − c_i‖ ≤ R_i and y[u] = −1 otherwise, c_i is the center point of SM_i, R_i is the radius of SM_i, k_i is the stride of SM_i, v_i[u] is the value of SM_i at u, ‖·‖ denotes the Euclidean distance, and i = 1, 2, 3.
Specifically, the step (4) includes:
1) according to the target position and size in the 1st frame image of the image sequence to be tracked, crop the target template image of the 1st frame, input it into the second branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtain the target template feature map M_1; set t = 2;
2) according to the target position and size in the (t−1)-th frame image of the image sequence to be tracked, crop the search-area image of the t-th frame, input it into the first branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtain the search-area feature maps of the t-th frame image;
3) respectively carrying out cross-correlation operation on the target template characteristic diagram of the t-1 th frame and the corresponding layer of the search area image characteristic diagram of the t-th frame to obtain three score diagrams of a target in the search area image of the t-th frame, and then fusing the score diagrams in a linear weighting mode to obtain a final score diagram of the t-th frame;
4) calculating the target position of the target in the image of the t frame according to the final score map of the t frame;
5) according to the target position and size in the t-th frame image, crop the target template image of the t-th frame and input it into the second branch convolution network of the trained multi-layer feature fusion convolution twin network; denote the obtained target template feature map by M_t; the target template feature map used for the t-th frame is then M̃_t = (1 − η)·M̃_(t−1) + η·M_t, where η is a smoothing factor;
6) set t = t + 1 and repeat steps 2)–5) until t = N, at which point target tracking of the image sequence ends, where N is the total number of frames in the image sequence to be tracked.
In order to achieve the above object, in a second aspect, an embodiment of the present invention provides a target tracking system based on a multilayer feature fusion convolution twin network, where the system includes:
the cropping module is used for cropping out target template images and search-area images of all images in the image-sequence training set according to the target position and size of each image, where the image pairs formed by the target template images and the search-area images constitute a training data set;
the multi-layer feature fusion based convolution twin network module comprises two identical branch networks: a first branch convolution network used to obtain feature maps of the search-area image, and a second branch convolution network used to obtain feature maps of the target template image; the two branches are connected at designated layers, and cross-correlation is performed between corresponding layers of the target template feature maps and the search-area feature maps to obtain the corresponding score maps;
the training module is used for training the multilayer feature fusion-based convolution twin network based on the training data set to obtain a trained multilayer feature fusion-based convolution twin network;
and the target tracking module is used for calculating a score map of an image in the image sequence to be detected by using the trained convolution twin network based on the multilayer feature fusion and tracking the target based on the score map.
Specifically, the target template image is cropped as follows: take the target rectangular frame centered on the target area, with the center of the target area as the target position; extend each of the four sides of the target rectangular frame by p pixels; if the extended frame exceeds the image boundary, fill the exceeding part with the image-mean pixel value; finally scale the cropped target image block to 127 × 127. The search-area image is cropped as follows: centered on the target area, extend each of the four sides of the target rectangular frame by 2p pixels; if the extended frame exceeds the image boundary, fill the exceeding part with the image-mean pixel value; finally scale the cropped search-area image block to 255 × 255. Here p = (w + h)/4, where w is the width of the target rectangular frame in pixels and h is its height in pixels.
Specifically, in the multi-layer feature fusion based convolution twin network: the search-area image is input into the first branch convolution network, where a first-layer feature map SFM_1 is obtained through Conv1, a second-layer feature map SFM_2 through Pool1 and Conv2, and a third-layer feature map SFM_3 through Pool2, Conv3, Conv4 and Conv5; the target template image is input into the second branch convolution network, where a first-layer feature map GFM_1 is obtained through Conv1, a second-layer feature map GFM_2 through Pool1 and Conv2, and a third-layer feature map GFM_3 through Pool2, Conv3, Conv4 and Conv5; cross-correlation is performed between corresponding layers of the target template feature maps and the search-area feature maps, yielding three corresponding score maps SM_1, SM_2 and SM_3 according to the formula:

SM_i = GFM_i * SFM_i

where i = 1, 2, 3 and * denotes the cross-correlation operation.
Specifically, the joint loss function L(y, v) constructed in the training module is calculated as follows:

L(y, v) = α_1·L_1(y, v_1) + α_2·L_2(y, v_2) + α_3·L_3(y, v_3)

L_i(y, v_i) = (1 / |D_i|) · Σ_{u ∈ D_i} l(y[u], v_i[u])

l(y[u], v_i[u]) = log(1 + exp(−y[u] · v_i[u]))

where L_i is the loss function of score map SM_i, l(y[u], v_i[u]) is the logistic loss at each point of SM_i, α_i is the weight of SM_i with 0 < α_1 < α_2 < α_3 ≤ 1, D_i is the domain of score map SM_i, u is a point in the score map with label y[u] = +1 if k_i·‖u − c_i‖ ≤ R_i and y[u] = −1 otherwise, c_i is the center point of SM_i, R_i is the radius of SM_i, k_i is the stride of SM_i, v_i[u] is the value of SM_i at u, ‖·‖ denotes the Euclidean distance, and i = 1, 2, 3.
Specifically, the target tracking module performs target tracking through the following steps:
1) according to the target position and size in the 1st frame image of the image sequence to be tracked, crop the target template image of the 1st frame, input it into the second branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtain the target template feature map M_1; set t = 2;
2) according to the target position and size in the (t−1)-th frame image of the image sequence to be tracked, crop the search-area image of the t-th frame, input it into the first branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtain the search-area feature maps of the t-th frame image;
3) cross-correlation is performed between the target template feature maps of the (t−1)-th frame and the corresponding layers of the search-area feature maps of the t-th frame to obtain three score maps of the target in the search-area image of the t-th frame; the score maps are then fused by linear weighting to obtain the final score map of the t-th frame;
4) calculating the target position of the target in the image of the t frame according to the final score map of the t frame;
5) according to the target position and size in the t-th frame image, crop the target template image of the t-th frame and input it into the second branch convolution network of the trained multi-layer feature fusion convolution twin network; denote the obtained target template feature map by M_t; the target template feature map used for the t-th frame is then M̃_t = (1 − η)·M̃_(t−1) + η·M_t, where η is a smoothing factor;
6) set t = t + 1 and repeat steps 2)–5) until t = N, at which point target tracking of the image sequence ends, where N is the total number of frames in the image sequence to be tracked.
Generally, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
(1) in the process of tracking the target, score maps from different layers are fused, so that high-level semantic features and low-level detail features are combined to better distinguish interference from similar targets, preventing target drift and target loss during tracking.
(2) The invention performs supervised training with the fused score maps obtained by cross-correlating the multi-layer feature maps and designs a new joint loss function; the joint loss function assigns different weights according to the contribution of the score maps of different layers, which prevents gradient vanishing and accelerates convergence.
Drawings
FIG. 1 is a flowchart of a target tracking method based on a multilayer feature fusion convolution twin network according to an embodiment of the present invention;
FIG. 2 is an exemplary diagram of a target template image and a search area image provided by an embodiment of the invention;
FIG. 3 is a schematic diagram of a convolutional twin network structure based on multi-layer feature fusion according to an embodiment of the present invention;
Fig. 4(a), Fig. 4(b) and Fig. 4(c) are images of the 36th, 102nd and 136th frames, respectively, of a first video sequence tracked with the method of the present invention according to an embodiment;
Fig. 5(a), Fig. 5(b) and Fig. 5(c) are images of the 14th, 24th and 470th frames, respectively, of a second video sequence tracked with the method of the present invention according to an embodiment;
Fig. 6(a), Fig. 6(b) and Fig. 6(c) are images of the 39th, 61st and 85th frames, respectively, of a third video sequence tracked with the method of the present invention according to an embodiment;
Fig. 7(a), Fig. 7(b) and Fig. 7(c) are images of the 23rd, 239th and 257th frames, respectively, of a fourth video sequence tracked with the method of the present invention according to an embodiment;
Fig. 8(a), Fig. 8(b) and Fig. 8(c) are images of the 14th, 52nd and 98th frames, respectively, of a fifth video sequence tracked with the method of the present invention according to an embodiment;
Fig. 9(a), Fig. 9(b) and Fig. 9(c) are images of the 23rd, 37th and 63rd frames, respectively, of a sixth video sequence tracked with the method of the present invention according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Fig. 1 is a flowchart of a target tracking method based on a multilayer feature fusion convolution twin network according to an embodiment of the present invention. As shown in fig. 1, the method comprises the steps of:
(1) according to the target position and size of the image, target template images and search area images of all images in the image sequence training set are cut out, and an image pair formed by the target template images and the search area images forms a training data set.
The image-sequence training set consists of image pairs, each formed by an image and a label map; the label map marks the target position and size in the corresponding image. Through the label map, a target template image and a search-area image centered on the target area are cropped from the image. The training data set of this embodiment contains 40,000 pairs of training images.
The target template image is cropped as follows: take the target rectangular frame centered on the target area, where the center of the target area represents the target position. Extend each of the four sides of the target rectangular frame by p pixels, giving a target template image block of size (w + 2p) × (h + 2p), where p = (w + h)/4, w is the width of the target rectangular frame in pixels and h is its height in pixels. If the extended frame exceeds the image boundary, the exceeding part is filled with the image-mean pixel value. Finally, the cropped target image block is scaled to 127 × 127.
The search-area image is cropped as follows: centered on the target area, extend each of the four sides of the target rectangular frame by 2p pixels, giving a search-area image block of size (w + 4p) × (h + 4p), with p = (w + h)/4. If the extended frame exceeds the image boundary, the exceeding part is filled with the image-mean pixel value. Finally, the cropped search-area image block is scaled to 255 × 255.
Fig. 2 is an exemplary diagram of a target template image and a search area image according to an embodiment of the present invention. As shown in fig. 2, the 1 st line is a target template image, and the 2 nd line is a corresponding search area image.
(2) And constructing a convolution twin network based on multilayer feature fusion.
Fig. 3 is a schematic diagram of a convolutional twin network structure based on multi-layer feature fusion according to an embodiment of the present invention. As shown in fig. 3, the convolution twin network based on multi-layer feature fusion includes 2 identical first branch convolution networks and second branch convolution networks, the first branch convolution network is used for acquiring the feature map of the search area image, and the second branch convolution network is used for acquiring the feature map of the target template image.
The two branch networks have the same structure and parameters; each branch comprises a first convolutional layer Conv1, a first pooling layer Pool1, a second convolutional layer Conv2, a second pooling layer Pool2, a third convolutional layer Conv3, a fourth convolutional layer Conv4 and a fifth convolutional layer Conv5, connected in sequence. The specific parameters are: Conv1 kernel size 11 × 11, stride 2, 48 channels; Pool1 pooling kernel 3 × 3, stride 2, 48 channels; Conv2 kernel size 5 × 5, stride 1, 128 channels; Pool2 pooling kernel 3 × 3, stride 2, 128 channels; Conv3, Conv4 and Conv5 kernels are all 3 × 3 with stride 1, with 192 channels for Conv3 and Conv4 and 128 channels for Conv5.
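Assuming valid (no-padding) convolutions and pooling, the layer parameters above reproduce the feature-map sizes stated in this embodiment; the sketch below traces both input sizes through one branch (layer names follow the text, the helper functions are illustrative):

```python
def out_size(n, k, s):
    """Side length after a valid (no-padding) conv/pool layer."""
    return (n - k) // s + 1

def branch_sizes(n):
    """Trace an input of side n through the branch; (kernel, stride) pairs
    follow the parameters given in the text."""
    layers = [("Conv1", 11, 2), ("Pool1", 3, 2), ("Conv2", 5, 1),
              ("Pool2", 3, 2), ("Conv3", 3, 1), ("Conv4", 3, 1),
              ("Conv5", 3, 1)]
    sizes = {}
    for name, k, s in layers:
        n = out_size(n, k, s)
        sizes[name] = n
    return sizes

template_sizes = branch_sizes(127)  # GFM_1 = 59, GFM_2 = 25, GFM_3 = 6
search_sizes = branch_sizes(255)    # SFM_1 = 123, SFM_2 = 57, SFM_3 = 22
```

These side lengths also give the score-map sizes via (search − template + 1): 123 − 59 + 1 = 65, 57 − 25 + 1 = 33, 22 − 6 + 1 = 17.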
The search-area image is input into the first branch convolution network: Conv1 yields the first-layer feature map SFM_1 of size 123 × 123 × 48; Pool1 and Conv2 then yield the second-layer feature map SFM_2 of size 57 × 57 × 128; finally Pool2, Conv3, Conv4 and Conv5 yield the third-layer feature map SFM_3 of size 22 × 22 × 128.
The target template image is input into the second branch convolution network: Conv1 yields the first-layer feature map GFM_1 of size 59 × 59 × 48; Pool1 and Conv2 then yield the second-layer feature map GFM_2 of size 25 × 25 × 128; finally Pool2, Conv3, Conv4 and Conv5 yield the third-layer feature map GFM_3 of size 6 × 6 × 128.
The two branch networks are connected on the feature map of the designated layer, and the feature map of the target template image and the corresponding layer of the feature map of the search area image are respectively subjected to cross-correlation operation to obtain corresponding score maps.
Cross-correlation is performed between corresponding layers of the target template feature maps and the search-area feature maps to obtain three corresponding score maps SM_1, SM_2 and SM_3, of sizes 65 × 65, 33 × 33 and 17 × 17 respectively, according to SM_i = GFM_i * SFM_i, where i = 1, 2, 3 and * denotes the cross-correlation operation.
(3) And training the multilayer feature fusion-based convolution twin network based on the training data set to obtain the well-trained multilayer feature fusion-based convolution twin network.
A joint loss function is constructed. Each point u ∈ D in a score map has a ground-truth label y[u] ∈ {+1, −1}. Since the target lies at the center of the score map, the score-map center is taken as the center of a circle, and an element of the score map is considered a positive sample when it lies within radius R of that center (taking the network stride k into account), and a negative sample otherwise:

y[u] = +1 if k·‖u − c‖ ≤ R, and y[u] = −1 otherwise

where c is the center point of the score map and ‖·‖ denotes the Euclidean distance.
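A sketch of this labeling rule (the radius value R = 16 below is an illustrative choice, not taken from the patent):

```python
def label_map(size, R, k):
    """Ground-truth labels for a size x size score map: +1 within radius R
    of the center (distances in score-map cells are scaled by stride k)."""
    c = (size - 1) / 2.0
    labels = []
    for i in range(size):
        row = []
        for j in range(size):
            dist = ((i - c) ** 2 + (j - c) ** 2) ** 0.5
            row.append(1 if k * dist <= R else -1)
        labels.append(row)
    return labels

# e.g. a 17 x 17 score map with stride k = 8 and an assumed radius of 16 px:
y = label_map(17, 16, 8)
```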
The loss function used in training is based on the logistic loss, and the overall loss of a single score map is the average of the losses over all of its points. The joint loss function L(y, v) constructed by the invention is:

L(y, v) = α_1·L_1(y, v_1) + α_2·L_2(y, v_2) + α_3·L_3(y, v_3)

L_i(y, v_i) = (1 / |D_i|) · Σ_{u ∈ D_i} l(y[u], v_i[u])

l(y[u], v_i[u]) = log(1 + exp(−y[u] · v_i[u]))

where L_i is the loss function of score map SM_i, l(y[u], v_i[u]) is the logistic loss at each point of SM_i, α_i is the weight of SM_i with 0 < α_1 < α_2 < α_3 ≤ 1, D_i is the domain of score map SM_i, u is a point in the score map, c_i is the center point of SM_i, R_i is the radius of SM_i, k_i is the stride of SM_i, v_i[u] is the value of SM_i at u, ‖·‖ denotes the Euclidean distance, and i = 1, 2, 3.
Specifically, α_1, α_2 and α_3 are taken as 0.3, 0.6 and 1 respectively; the strides k corresponding to score maps SM_1, SM_2 and SM_3 are 2, 4 and 8 respectively.
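A minimal sketch of the joint loss with the weights above, assuming the standard logistic-loss form log(1 + exp(−y·v)) and a per-map average (score maps are flattened to lists for brevity):

```python
import math

def point_loss(y, v):
    """Logistic loss at one score-map point: log(1 + exp(-y * v))."""
    return math.log(1.0 + math.exp(-y * v))

def map_loss(labels, scores):
    """Average point loss over one (flattened) score map."""
    return sum(point_loss(y, v) for y, v in zip(labels, scores)) / len(labels)

def joint_loss(labels, score_maps, alphas=(0.3, 0.6, 1.0)):
    """Weighted sum of per-map losses with alpha_1 < alpha_2 < alpha_3."""
    return sum(a * map_loss(labels[i], score_maps[i])
               for i, a in enumerate(alphas))
```

A correctly classified point (y and v of the same sign, large |v|) contributes a loss near zero, while a misclassified point is penalized heavily, which is what drives the supervised training of all three score maps at once.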
Taking the minimization of the joint loss function as the objective, the network parameters W of the multi-layer feature fusion convolution twin network are learned by the back-propagation algorithm.
This embodiment trains for 40 epochs with 5000 iterations per epoch, using 8 pairs of training images per iteration. During training, as the network parameters converge, the learning rate of the stochastic gradient descent method is decreased from 10^−2 to 10^−5; that is, after every 10 epochs the learning rate is reduced by a factor of 10.
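The step schedule above can be sketched as follows (epoch numbering from 0 is an assumption of this sketch):

```python
def learning_rate(epoch, base=1e-2, drop_every=10, factor=10.0):
    """Step decay: 1e-2 for epochs 0-9, 1e-3 for 10-19, down to 1e-5 for 30-39."""
    return base / factor ** (epoch // drop_every)
```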
(4) And calculating a score map of an image in the image sequence to be detected by using the trained convolution twin network based on the multi-layer feature fusion, and tracking the target based on the score map.
1) According to the target position and size in the 1st frame image of the image sequence to be tracked, crop the target template image of the 1st frame, input it into the second branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtain the target template feature map M_1; set t = 2.
The target position and target size in the initial frame image of the image sequence to be measured are known. And cutting out a target template image of the 1 st frame image according to the target position and the size of the 1 st frame image of the image sequence to be detected.
2) According to the target position and size in the (t−1)-th frame image of the image sequence to be tracked, crop the search-area image of the t-th frame, input it into the first branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtain the search-area feature maps of the t-th frame image.
For t = 2, the target position and target size in the initial frame image of the image sequence to be detected are known, and the search area image of the 2nd frame image is cut out according to the target position and size in the 1st frame image.
3) Carrying out cross-correlation operations between the target template feature map of the (t−1)-th frame and the corresponding layers of the search area image feature map of the t-th frame to obtain three score maps of the target in the search area image of the t-th frame, and then fusing the score maps by linear weighting to obtain the final score map of the t-th frame;
The score map SM3 of size 17 × 17 is upsampled to 65 × 65 by bicubic interpolation, and the score map SM2 of size 33 × 33 is upsampled to 65 × 65 by bicubic interpolation. The final score map SM123 is then calculated as:

SM123 = w1·SM1 + w2·SM2′ + w3·SM3′

wherein SM2′ and SM3′ are the score maps obtained after upsampling SM2 and SM3 respectively; in this embodiment w1 = 2^1, w2 = 2^2 and w3 = 2^3.
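As an illustration, the upsample-and-fuse step can be sketched in NumPy as below. This is a hedged sketch: bilinear interpolation stands in for the bicubic interpolation used by the patent, the function names are illustrative, and the weights w1 = 2, w2 = 4, w3 = 8 follow this embodiment's reading.

```python
import numpy as np

def upsample(score_map, size):
    """Bilinear upsampling of a 2-D score map to (size, size).
    (The patent uses bicubic interpolation; bilinear is used here for brevity.)"""
    h, w = score_map.shape
    ys = np.linspace(0, h - 1, size)
    xs = np.linspace(0, w - 1, size)
    y0 = np.floor(ys).astype(int); x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1); x1 = np.minimum(x0 + 1, w - 1)
    fy = (ys - y0)[:, None]; fx = (xs - x0)[None, :]
    a = score_map[np.ix_(y0, x0)]; b = score_map[np.ix_(y0, x1)]
    c = score_map[np.ix_(y1, x0)]; d = score_map[np.ix_(y1, x1)]
    return (a * (1 - fy) * (1 - fx) + b * (1 - fy) * fx
            + c * fy * (1 - fx) + d * fy * fx)

def fuse_score_maps(sm1, sm2, sm3, weights=(2.0, 4.0, 8.0)):
    """Upsample SM2 (33x33) and SM3 (17x17) to 65x65, then fuse linearly."""
    w1, w2, w3 = weights
    return w1 * sm1 + w2 * upsample(sm2, 65) + w3 * upsample(sm3, 65)
```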
4) Calculating the target position of the target in the t-th frame image according to the final score map of the t-th frame;
After the three score maps are superposed according to their weights to obtain the final score map SM123, SM123 is upsampled to 255 × 255 by bicubic interpolation, and the position of the maximum-score point in the score map is recorded as pt.
In order to make the tracking process more continuous, the target position p̂t of the target in the t-th frame image is determined by linear interpolation with pt:

p̂t = (1 − γ)·p̂t−1 + γ·pt

where γ is a smoothing factor. In this embodiment γ is taken as 0.35.
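The peak search and position smoothing can be sketched as follows; the names are illustrative, and the linear-interpolation form p̂t = (1 − γ)·p̂t−1 + γ·pt with γ = 0.35 is a reconstruction of the smoothing step described above.

```python
import numpy as np

def locate_target(score_map, prev_pos, gamma=0.35):
    """Find the peak of the (upsampled) score map, then smooth it with the
    previous position estimate: p_hat = (1 - gamma) * prev + gamma * peak."""
    peak = np.unravel_index(np.argmax(score_map), score_map.shape)
    peak = np.asarray(peak, dtype=float)
    prev = np.asarray(prev_pos, dtype=float)
    return (1.0 - gamma) * prev + gamma * peak
```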
5) Cutting out a target template image of the t-th frame image according to the target position and size in the t-th frame image, inputting the target template image of the t-th frame into the second branch convolution network of the trained multi-layer feature fusion convolution twin network, and denoting the obtained feature map of the target template image as M′t; the feature map of the target template image used for the t-th frame is then

Mt = (1 − η)·Mt−1 + η·M′t

where η is a smoothing factor. In this embodiment η is taken as 0.01.
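The template update of step 5) is a running average of template feature maps; a minimal sketch follows (illustrative name, assuming the update Mt = (1 − η)·Mt−1 + η·M′t with η = 0.01).

```python
import numpy as np

def update_template(prev_template, new_feature, eta=0.01):
    """Running-average template update: M_t = (1 - eta) * M_{t-1} + eta * M'_t.
    A small eta keeps the template stable against occlusion and drift."""
    return (1.0 - eta) * prev_template + eta * new_feature
```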
6) Setting t = t + 1 and repeating steps 2)–5) until t = N, whereupon the target tracking of the image sequence to be detected ends, where N is the total number of frames of the image sequence to be detected.
FIGS. 4(a), 4(b) and 4(c) are the 36th, 102nd and 136th frame images of a first video sequence tracked by the method of the present invention according to an embodiment of the present invention. It can be seen that the proposed target tracking method can effectively track a target undergoing rapid motion, pose change, occlusion and similar-background interference.
FIGS. 5(a), 5(b) and 5(c) are the 14th, 24th and 470th frame images of a second video sequence tracked by the method of the present invention according to an embodiment of the present invention. It can be seen that the proposed target tracking method can effectively track a target undergoing pose change, occlusion and similar-background interference.
FIGS. 6(a), 6(b) and 6(c) are the 39th, 61st and 85th frame images of a third video sequence tracked by the method of the present invention according to an embodiment of the present invention. It can be seen that the proposed target tracking method can effectively track a target undergoing pose change, occlusion and motion blur.
FIGS. 7(a), 7(b) and 7(c) are the 23rd, 239th and 257th frame images of a fourth video sequence tracked by the method of the present invention according to an embodiment of the present invention. It can be seen that the proposed target tracking method can effectively track a target undergoing illumination change and occlusion.
FIGS. 8(a), 8(b) and 8(c) are the 14th, 52nd and 98th frame images of a fifth video sequence tracked by the method of the present invention according to an embodiment of the present invention. It can be seen that the proposed target tracking method can effectively track a target undergoing pose change and similar-background interference.
FIGS. 9(a), 9(b) and 9(c) are the 23rd, 37th and 63rd frame images of a sixth video sequence tracked by the method of the present invention according to an embodiment of the present invention. It can be seen that the proposed target tracking method can effectively track a target undergoing illumination change.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (8)
1. A target tracking method of a convolution twin network based on multi-layer feature fusion, characterized by comprising the following steps:
(1) according to the target position and size of each image, cutting out target template images and search area images of all images in an image sequence training set, wherein the image pairs formed by the target template images and the search area images form a training data set;
(2) constructing a convolution twin network based on multi-layer feature fusion, wherein the convolution twin network based on multi-layer feature fusion comprises two identical branch convolution networks, a first branch convolution network and a second branch convolution network, the first branch convolution network is used for obtaining a feature map of the search area image, the second branch convolution network is used for obtaining a feature map of the target template image, the two branch networks are connected at the feature maps of designated layers, and cross-correlation operations are respectively carried out between the feature maps of the target template image and the corresponding layers of the feature maps of the search area image to obtain the corresponding score maps;
(3) training the multilayer feature fusion-based convolution twin network based on the training data set to obtain a well-trained multilayer feature fusion-based convolution twin network;
(4) calculating a score map of an image in an image sequence to be detected by using a trained convolution twin network based on multi-layer feature fusion, and tracking a target based on the score map;
the step (2) comprises the following steps:
inputting the search area image into the first branch convolution network, obtaining a first-layer feature map SFM1 through Conv1, then a second-layer feature map SFM2 through the Pool1 and Conv2 layers, and finally a third-layer feature map SFM3 through Pool2, Conv3, Conv4 and Conv5;
inputting the target template image into the second branch convolution network, obtaining a first-layer feature map GFM1 through Conv1, then a second-layer feature map GFM2 through Pool1 and Conv2, and finally a third-layer feature map GFM3 through Pool2, Conv3, Conv4 and Conv5;
carrying out cross-correlation operations between the corresponding layers of the target template feature maps and the search area image feature maps to obtain the three corresponding score maps SM1, SM2 and SM3 according to the formula:
SMi = GFMi * SFMi
wherein i = 1, 2, 3 and * denotes the cross-correlation operation.
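The cross-correlation SMi = GFMi * SFMi slides the template feature map over the search-area feature map and sums over channels at each offset; a direct (unoptimized) NumPy sketch, with illustrative names:

```python
import numpy as np

def cross_correlate(template_map, search_map):
    """'Valid'-mode cross-correlation of a (C, th, tw) template feature map
    with a (C, sh, sw) search feature map, summing over channels, yielding
    a (sh - th + 1, sw - tw + 1) score map."""
    c, th, tw = template_map.shape
    _, sh, sw = search_map.shape
    oh, ow = sh - th + 1, sw - tw + 1
    score = np.empty((oh, ow))
    for y in range(oh):
        for x in range(ow):
            score[y, x] = np.sum(template_map * search_map[:, y:y + th, x:x + tw])
    return score
```

In practice this operation is what a deep-learning framework's 2-D "convolution" layer computes when the template feature map is used as the kernel.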
2. The target tracking method of claim 1, wherein step (1) comprises:
the target template image is cut out as follows: with the target area as the center of the target rectangular frame and the center position of the target area as the target position, each of the four sides of the target rectangular frame is expanded by p pixels; if the expanded frame exceeds the image boundary, the exceeding part is filled with the image mean pixel value; finally the cut-out target image block is scaled to 127 × 127;
the search area image is cut out as follows: with the target area as the center, each of the four sides of the target rectangular frame is expanded by 2p pixels; if the expanded frame exceeds the image boundary, the exceeding part is filled with the image mean pixel value; finally the cut-out search area image block is scaled to 255 × 255;
where p = (w + h)/4, w is the width of the target rectangular frame in pixels, and h is the height of the target rectangular frame in pixels.
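The context-augmented cropping of claim 2 can be sketched as follows for a single-channel image; the names are illustrative, and the final resize to 127 × 127 (template) or 255 × 255 (search area) is left to an image library:

```python
import numpy as np

def crop_with_context(image, cx, cy, w, h, scale=1):
    """Crop a context-augmented patch around the target center (cx, cy).
    Each side of the w x h box is expanded by scale * p pixels, p = (w + h)/4
    (scale=1 for the template, scale=2 for the search area); pixels falling
    outside the image are filled with the image mean."""
    p = (w + h) / 4.0
    half_w = w / 2.0 + scale * p
    half_h = h / 2.0 + scale * p
    x0, x1 = int(round(cx - half_w)), int(round(cx + half_w))
    y0, y1 = int(round(cy - half_h)), int(round(cy + half_h))
    patch = np.full((y1 - y0, x1 - x0), image.mean(), dtype=float)
    ix0, iy0 = max(x0, 0), max(y0, 0)
    ix1, iy1 = min(x1, image.shape[1]), min(y1, image.shape[0])
    patch[iy0 - y0:iy1 - y0, ix0 - x0:ix1 - x0] = image[iy0:iy1, ix0:ix1]
    return patch  # subsequently resized to 127x127 (template) or 255x255 (search)
```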
3. The target tracking method of claim 1, wherein the joint loss function L(y, v) constructed in step (3) is calculated as follows:
L(y, v) = α1·L1(y, v1) + α2·L2(y, v2) + α3·L3(y, v3)
Li(y, vi) = (1/|Di|)·Σu∈Di l(y[u], vi[u])
l(y[u], vi[u]) = log(1 + exp(−y[u] × vi[u]))
y[u] = +1 if ki·||u − ci|| ≤ Ri, and y[u] = −1 otherwise
wherein Li is the loss function of score map SMi, y[u] is the true label of point u in the score map, l(y[u], vi[u]) is the logistic loss at each point, αi is the weight of score map SMi, 0 < α1 < α2 < α3 ≤ 1, Di is the domain of score map SMi, u is a point in the score map, ci is the center point of score map SMi, Ri is the radius of score map SMi, ki is the step length of score map SMi, vi[u] is the value of score map SMi at point u, || · || denotes the Euclidean distance, and i = 1, 2, 3.
4. The target tracking method of claim 1, wherein step (4) comprises:
1) cutting out a target template image of the 1st frame image according to the target position and size in the 1st frame image of the image sequence to be detected, inputting the target template image of the 1st frame into the second branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtaining the feature map M1 of the target template image; setting t = 2;
2) cutting out a search area image of the t-th frame image according to the target position and size in the (t−1)-th frame image of the image sequence to be detected, inputting the search area image of the t-th frame into the first branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtaining the search area image feature map of the t-th frame image;
3) carrying out cross-correlation operations between the target template feature map of the (t−1)-th frame and the corresponding layers of the search area image feature map of the t-th frame to obtain three score maps of the target in the search area image of the t-th frame, and then fusing the score maps by linear weighting to obtain the final score map of the t-th frame;
4) calculating the target position of the target in the t-th frame image according to the final score map of the t-th frame;
5) cutting out a target template image of the t-th frame image according to the target position and size in the t-th frame image, inputting the target template image of the t-th frame into the second branch convolution network of the trained multi-layer feature fusion convolution twin network, and denoting the obtained feature map of the target template image as M′t; the feature map of the target template image used for the t-th frame is then Mt = (1 − η)·Mt−1 + η·M′t, where η is a smoothing factor;
6) setting t = t + 1 and repeating steps 2)–5) until t = N, whereupon the target tracking of the image sequence to be detected ends, where N is the total number of frames of the image sequence to be detected.
5. A target tracking system of a convolution twin network based on multi-layer feature fusion, characterized by comprising:
the cutting module is used for cutting out target template images and search area images of all images in the image sequence training set according to the target positions and sizes of the images, and the image pairs formed by the target template images and the search area images form a training data set;
the multi-layer feature fusion-based convolution twin network module comprises two identical branch convolution networks, a first branch convolution network and a second branch convolution network, wherein the first branch convolution network is used for obtaining a feature map of the search area image, the second branch convolution network is used for obtaining a feature map of the target template image, the two branch networks are connected at the feature maps of designated layers, and cross-correlation operations are respectively carried out between the feature maps of the target template image and the corresponding layers of the feature maps of the search area image to obtain the corresponding score maps;
the training module is used for training the multilayer feature fusion-based convolution twin network based on the training data set to obtain a trained multilayer feature fusion-based convolution twin network;
the target tracking module is used for calculating a score map of an image in an image sequence to be detected by using a trained convolution twin network based on multi-layer feature fusion, and tracking a target based on the score map;
the multilayer feature fusion based convolution twin network comprises:
inputting the search area image into the first branch convolution network, obtaining a first-layer feature map SFM1 through Conv1, then a second-layer feature map SFM2 through the Pool1 and Conv2 layers, and finally a third-layer feature map SFM3 through Pool2, Conv3, Conv4 and Conv5;
inputting the target template image into the second branch convolution network, obtaining a first-layer feature map GFM1 through Conv1, then a second-layer feature map GFM2 through Pool1 and Conv2, and finally a third-layer feature map GFM3 through Pool2, Conv3, Conv4 and Conv5;
carrying out cross-correlation operations between the corresponding layers of the target template feature maps and the search area image feature maps to obtain the three corresponding score maps SM1, SM2 and SM3 according to the formula:
SMi = GFMi * SFMi
wherein i = 1, 2, 3 and * denotes the cross-correlation operation.
6. The object tracking system of claim 5,
the target template image is cut out as follows: with the target area as the center of the target rectangular frame and the center position of the target area as the target position, each of the four sides of the target rectangular frame is expanded by p pixels; if the expanded frame exceeds the image boundary, the exceeding part is filled with the image mean pixel value; finally the cut-out target image block is scaled to 127 × 127;
the search area image is cut out as follows: with the target area as the center, each of the four sides of the target rectangular frame is expanded by 2p pixels; if the expanded frame exceeds the image boundary, the exceeding part is filled with the image mean pixel value; finally the cut-out search area image block is scaled to 255 × 255;
where p = (w + h)/4, w is the width of the target rectangular frame in pixels, and h is the height of the target rectangular frame in pixels.
7. The target tracking system of claim 5, wherein the joint loss function L(y, v) constructed in the training module is calculated as follows:
L(y, v) = α1·L1(y, v1) + α2·L2(y, v2) + α3·L3(y, v3)
Li(y, vi) = (1/|Di|)·Σu∈Di l(y[u], vi[u])
l(y[u], vi[u]) = log(1 + exp(−y[u] × vi[u]))
y[u] = +1 if ki·||u − ci|| ≤ Ri, and y[u] = −1 otherwise
wherein Li is the loss function of score map SMi, y[u] is the true label of point u in the score map, l(y[u], vi[u]) is the logistic loss at each point, αi is the weight of score map SMi, 0 < α1 < α2 < α3 ≤ 1, Di is the domain of score map SMi, u is a point in the score map, ci is the center point of score map SMi, Ri is the radius of score map SMi, ki is the step length of score map SMi, vi[u] is the value of score map SMi at point u, || · || denotes the Euclidean distance, and i = 1, 2, 3.
8. The target tracking system of claim 5, wherein the target tracking module performs target tracking by:
1) cutting out a target template image of the 1st frame image according to the target position and size in the 1st frame image of the image sequence to be detected, inputting the target template image of the 1st frame into the second branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtaining the feature map M1 of the target template image; setting t = 2;
2) cutting out a search area image of the t-th frame image according to the target position and size in the (t−1)-th frame image of the image sequence to be detected, inputting the search area image of the t-th frame into the first branch convolution network of the trained multi-layer feature fusion convolution twin network, and obtaining the search area image feature map of the t-th frame image;
3) carrying out cross-correlation operations between the target template feature map of the (t−1)-th frame and the corresponding layers of the search area image feature map of the t-th frame to obtain three score maps of the target in the search area image of the t-th frame, and then fusing the score maps by linear weighting to obtain the final score map of the t-th frame;
4) calculating the target position of the target in the t-th frame image according to the final score map of the t-th frame;
5) cutting out a target template image of the t-th frame image according to the target position and size in the t-th frame image, inputting the target template image of the t-th frame into the second branch convolution network of the trained multi-layer feature fusion convolution twin network, and denoting the obtained feature map of the target template image as M′t; the feature map of the target template image used for the t-th frame is then Mt = (1 − η)·Mt−1 + η·M′t, where η is a smoothing factor;
6) setting t = t + 1 and repeating steps 2)–5) until t = N, whereupon the target tracking of the image sequence to be detected ends, where N is the total number of frames of the image sequence to be detected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810878152.XA CN109191491B (en) | 2018-08-03 | 2018-08-03 | Target tracking method and system of full convolution twin network based on multi-layer feature fusion |
Publications (2)

Publication Number | Publication Date
---|---|
CN109191491A | 2019-01-11
CN109191491B | 2020-09-08
Also Published As
Publication number | Publication date |
---|---|
CN109191491A (en) | 2019-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109191491B (en) | Target tracking method and system of full convolution twin network based on multi-layer feature fusion | |
CN111354017B (en) | Target tracking method based on twin neural network and parallel attention module | |
CN107274433B (en) | Target tracking method and device based on deep learning and storage medium | |
CN106127684B (en) | Image super-resolution enhancement method based on forward-backward recurrent convolutional neural networks | |
WO2023273136A1 (en) | Target object representation point estimation-based visual tracking method | |
CN111311666B (en) | Monocular vision odometer method integrating edge features and deep learning | |
CN108038420B (en) | Human behavior recognition method based on depth video | |
WO2020108362A1 (en) | Body posture detection method, apparatus and device, and storage medium | |
CN111161317A (en) | Single-target tracking method based on multiple networks | |
CN112184752A (en) | Video target tracking method based on pyramid convolution | |
CN109360156A (en) | Single-image rain removal method based on image blocks and a generative adversarial network | |
CN107730536B (en) | High-speed correlation filtering object tracking method based on depth features | |
CN112288627B (en) | Recognition-oriented low-resolution face image super-resolution method | |
CN112232134B (en) | Human body posture estimation method based on hourglass network and attention mechanism | |
CN113706581B (en) | Target tracking method based on residual channel attention and multi-level classification regression | |
CN110705344A (en) | Crowd counting model based on deep learning and implementation method thereof | |
CN110942476A (en) | Improved three-dimensional point cloud registration method and system based on two-dimensional image guidance and readable storage medium | |
CN111476133B (en) | Unmanned driving-oriented foreground and background codec network target extraction method | |
CN111815665A (en) | Single image crowd counting method based on depth information and scale perception information | |
CN115147456B (en) | Target tracking method based on time sequence self-adaptive convolution and attention mechanism | |
CN111882581A (en) | Multi-target tracking method for depth feature association | |
CN108629301A (en) | Human motion recognition method based on motion-boundary dense sampling and motion gradient histograms | |
CN113592900A (en) | Target tracking method and system based on attention mechanism and global reasoning | |
CN116128763A (en) | Aircraft skin damage image enhancement method based on deep neural network fusion | |
CN109492524B (en) | Intra-structure relevance network for visual tracking |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2020-09-08; Termination date: 2021-08-03 |