CN111862156A - Multi-target tracking method and system based on graph matching - Google Patents

Multi-target tracking method and system based on graph matching

Info

Publication number
CN111862156A
Authority
CN
China
Prior art keywords
vertex
graph
module
target
vertices
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010689629.7A
Other languages
Chinese (zh)
Other versions
CN111862156B (en)
Inventor
项俊
王超
侯建华
麻建
徐国寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South Central Minzu University
Original Assignee
South Central University for Nationalities
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South Central University for Nationalities filed Critical South Central University for Nationalities
Priority to CN202010689629.7A priority Critical patent/CN111862156B/en
Publication of CN111862156A publication Critical patent/CN111862156A/en
Application granted granted Critical
Publication of CN111862156B publication Critical patent/CN111862156B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an online multi-target tracking algorithm based on graph matching, which converts the data-association problem between detection responses in two consecutive frames into a graph-matching problem. First, two deep convolutional neural networks are designed to compute the affinity between pairs of graph vertices and between pairs of graph edges, respectively; the affinity matrix of the two graphs is then filled directly with these vertex and edge affinities; finally, the affinity matrix is processed to obtain the final matching matrix (i.e., the association matrix between detections). The method therefore effectively reflects the associations of real data during multi-target tracking, and its tracking results are highly accurate.

Description

Multi-target tracking method and system based on graph matching
Technical Field
The invention belongs to the technical field of pattern recognition, and particularly relates to a multi-target tracking method and system based on graph matching.
Background
Multi-object tracking (MOT) plays an important role in computer vision. Its main task is to analyze video in order to identify and track objects belonging to one or more categories, without any prior knowledge of the objects' appearance or number, and it is important in fields such as motion analysis, human-computer interaction, video surveillance (e.g., abnormal-behavior recognition), and autonomous driving.
Existing multi-target tracking methods fall into roughly three categories. The first adopts a tracking-by-detection strategy: a detector first locates the targets of interest in each frame, and an identity is then assigned to each detection through data association; the core problem of this strategy is data association. In multi-target tracking, data association can be regarded as a bipartite-graph assignment problem, namely determining the correspondence between existing trajectories and newly generated detections, which is then solved with the Hungarian algorithm. The second category adopts a multi-target tracking strategy based on a graph model, which uses a conditional random field to model the spatio-temporal relationships between detections or between tracklets. The third category adopts a machine-learning-based multi-target tracking strategy, which uses a Kalman filter to model the motion of each detection and estimate its position in the next frame.
However, the above existing multi-target tracking methods have some non-negligible shortcomings in data association:
First, for tracking-by-detection methods, the traditional Hungarian algorithm uses only the association costs between vertices during solving and ignores the topological information of the vertices, which is very important in graph matching; the accuracy of the tracking result is therefore low.
Second, in graph models such as conditional random fields, the structure of the graph is fixed, so its nodes and edges are fixed; nodes and edges with inaccurate features cannot be corrected, resulting in low tracking accuracy.
Third, target-tracking strategies based on traditional machine learning have high model complexity and require manually designed parameters.
Disclosure of Invention
The invention provides a multi-target tracking method and system based on graph matching. By combining deep learning with a traditional graph-matching framework, it solves the data-association problem between detections in consecutive frames so as to accomplish online multi-target tracking. It thereby addresses the technical problems that traditional multi-target tracking methods do not fully exploit the topological relations between vertices, that graph-model-based multi-target tracking methods have low target-tracking accuracy due to their fixed graph structure, and that multi-target tracking methods adopting machine-learning models have high complexity and require manual parameter design.
To achieve the above object, according to one aspect of the present invention, there is provided a multi-target tracking method based on graph matching, including the steps of:
(1) Acquiring a multi-target tracking data set which comprises an input video sequence and a detection response of each frame in the input video sequence;
(2) setting counter cnt1 to 1;
(3) judging whether cnt1 is equal to the total frame number of the input video sequence, if yes, entering step (14), otherwise, entering step (4);
(4) acquiring a cnt1 th frame and a cnt1+1 th frame from the input video sequence obtained in the step (1), constructing a graph G1 according to a previous frame, constructing a graph G2 according to a next frame, wherein the graphs G1 and G2 respectively comprise two vertex sets V1 and V2;
(5) respectively acquiring all vertexes in vertex sets V1 and V2 in graphs G1 and G2 constructed in the step (4), and respectively inputting the vertexes into a trained triplet network to obtain a feature vector corresponding to each vertex;
(6) inputting the feature vectors of the vertexes obtained in the step (5) into the trained first shallow neural network and second shallow neural network respectively to obtain the matching degree between each vertex in the vertex set V1 of the graph G1 and each vertex in the vertex set V2 of the graph G2 and the matching degree between each connecting edge in the edge set E1 of the graph G1 and each connecting edge in the edge set E2 of the graph G2 respectively;
(7) constructing an affinity matrix M according to the matching degree between each vertex in the vertex set V1 of the graph G1 and each vertex in the vertex set V2 of the graph G2 obtained in the step (6) and the matching degree between each connecting edge in the edge set E1 of the graph G1 and each connecting edge in the edge set E2 of the graph G2;
(8) performing power iteration on the affinity matrix M obtained in step (7) to obtain an optimal assignment vector v*;
(9) performing bidirectional normalization on the optimal assignment vector v* obtained in step (8) to obtain an assignment matrix S;
(10) setting counter cnt2 to 1;
(11) judging whether the counter cnt2 is equal to the total number of the vertexes in the graph G1, if so, entering the step (14), otherwise, entering the step (12);
(12) obtaining, from the assignment matrix S obtained in step (9), the vertex in graph G2 that preliminarily matches the cnt2-th vertex in graph G1; judging whether the intersection-over-union between the two detection responses corresponding to the two preliminarily matched vertices in graphs G1 and G2 is greater than or equal to a preset threshold; if so, establishing an association between the target corresponding to the vertex in the previous frame (graph G1) and the target corresponding to the vertex in the next frame (graph G2) and then entering step (13); otherwise, entering step (13) directly;
(13) setting the counter cnt1 = cnt1 + 1, and returning to step (3);
(14) for each target in the first frame of the input video sequence, combining all targets associated with the target in all remaining frames of the entire input video sequence with the target to form the target's tracking trajectory.
Preferably, the vertices in vertex set V1 are all the detection responses in the previous frame, and the vertices in vertex set V2 are all the detection responses in the next frame;
if the distance between the targets corresponding to two vertices of vertex set V1 in the previous frame is less than or equal to a threshold, a connecting edge exists between the two vertices, and all connecting edges corresponding to vertex set V1 form the edge set E1;
if the distance between the targets corresponding to two vertices of vertex set V2 in the next frame is less than or equal to the threshold, a connecting edge exists between the two vertices, and all connecting edges corresponding to vertex set V2 form the edge set E2.
Preferably, the triplet network is composed of three paths, each path comprising a sequentially connected ResNet-50 convolutional neural network model and two fully-connected layers, wherein:
the first layer is a ResNet-50 convolutional neural network model, which comprises 1 convolutional layer, 16 building-block structures, and 1 fully-connected layer;
the second layer is a fully-connected layer with ReLU as its activation function, whose output is a 1024-dimensional feature vector;
the third layer is a fully-connected layer with ReLU as its activation function, whose output is a 128-dimensional feature vector.
Preferably, the first and second shallow neural networks have the same structure, namely:
the first layer is a fully-connected layer with ReLU as its activation function and a Dropout retention probability of 0.5, whose output is a 1024-dimensional feature vector;
the second layer is a fully-connected layer with Softmax as its activation function, whose output is a 2-dimensional feature vector.
Preferably, for vertex v_i ∈ V1 and vertex v_a ∈ V2, the matching degree S_ia is:
S_ia = F_N([F_i, F_a])
where F_i, F_a ∈ R^{1×d} are the feature vectors of vertices v_i and v_a, and d is the size of the feature vector; [·] denotes a stitching operation that concatenates multiple vectors; F_N denotes the first shallow neural network with a softmax layer; i, j ∈ [1, n] and a ∈ [1, m], where n and m denote the total numbers of vertices in vertex sets V1 and V2, respectively;
the connecting edge starting at the i-th vertex and ending at the j-th vertex has feature vector F_ij = [F_i, F_j], where F_i and F_j are the feature vectors of the two vertices i and j connected by the edge;
the matching degree S_ij:ab between connecting edge (v_i, v_j) ∈ E1 and connecting edge (v_a, v_b) ∈ E2 is:
S_ij:ab = F_E([F_ij, F_ab])
where b ∈ [1, m], the feature vectors F_ij, F_ab ∈ R^{1×2d}, and F_E denotes the second shallow neural network with a softmax layer.
Preferably, the diagonal elements of the affinity matrix M correspond to the matching degrees between vertices, i.e., for i = j and a = b, M_ia:jb equals the matching degree S_ia between vertex v_i and vertex v_a;
the off-diagonal elements of the affinity matrix M represent the matching degrees between connecting edges, i.e., for i ≠ j and a ≠ b, M_ia:jb equals the matching degree S_ij:ab between connecting edge (v_i, v_j) and connecting edge (v_a, v_b); for i = j and a ≠ b, or i ≠ j and a = b, M_ia:jb = 0.
Preferably, step (8) calculates the optimal assignment vector v* of the affinity matrix M using the following power iteration:
v^(k+1) = M v^(k) / ||M v^(k)||
where v^(0) is initialized as the unit vector, i.e., v^(0) = 1; ||·|| denotes the l2 norm; k denotes the number of iterations, with a value range of 2 to 4; and the final result of the iterative computation is taken as the optimal assignment vector v*.
Preferably, step (9) iterates using the following formulas, and the final matrix of the iteration is taken as the assignment matrix S:
S_ia^(t+1) = S_ia^(t) / Σ_{a′=1..m} S_ia′^(t)  (row normalization),  followed by  S_ia^(t+1) = S_ia^(t) / Σ_{i′=1..n} S_i′a^(t)  (column normalization)
where S_nm, the n×m matrix obtained by reshaping v*, is the initial value of the iteration; the superscript t+1 denotes the (t+1)-th iteration; t denotes the number of iterations, with a value range of 3 to 7;
in the assignment matrix S, the element in row i, column a denotes the matching value between vertex v_i of graph G1 and vertex v_a of graph G2.
Preferably, in step (12), all elements in row i are taken from the assignment matrix S and the element with the largest value is selected; its column index a gives the vertex v_a in graph G2 that preliminarily matches vertex v_i in graph G1.
According to another aspect of the present invention, there is provided a multi-target tracking system based on graph matching, including:
A first module for obtaining a multi-target tracking data set comprising an input video sequence and a detection response for each frame in the input video sequence;
a second module for setting the counter cnt1 to 1;
a third module, configured to determine whether cnt1 equals to a total frame number of the input video sequence, if so, enter a fourteenth module, otherwise, enter a fourth module;
a fourth module, configured to obtain a cnt1 th frame and a cnt1+1 th frame from the input video sequence obtained by the first module, construct a graph G1 according to a previous frame, construct a graph G2 according to a next frame, where the graphs G1 and G2 respectively include two vertex sets V1 and V2;
a fifth module, configured to obtain all vertices in vertex sets V1 and V2 in graphs G1 and G2 constructed by the fourth module, respectively, and input the vertices into a trained triplet network, so as to obtain feature vectors corresponding to the vertices;
a sixth module, configured to input the feature vectors of the vertices obtained by the fifth module into the trained first and second shallow neural networks, respectively, to obtain the matching degree between each vertex in vertex set V1 of graph G1 and each vertex in vertex set V2 of graph G2, and the matching degree between each connecting edge in edge set E1 of graph G1 and each connecting edge in edge set E2 of graph G2;
a seventh module, configured to construct an affinity matrix M according to the matching degrees between each vertex in vertex set V1 of graph G1 and each vertex in vertex set V2 of graph G2, and between each connecting edge in edge set E1 of graph G1 and each connecting edge in edge set E2 of graph G2, obtained by the sixth module;
an eighth module, configured to perform power iteration on the affinity matrix M obtained by the seventh module to obtain an optimal assignment vector v*;
a ninth module, configured to perform bidirectional normalization on the optimal assignment vector v* obtained by the eighth module to obtain an assignment matrix S;
a tenth module for setting the counter cnt2 to 1;
an eleventh module for determining whether the counter cnt2 is equal to the total number of vertices in graph G1, if so, entering a fourteenth module, otherwise, entering a twelfth module;
a twelfth module, configured to obtain, from the assignment matrix S obtained by the ninth module, the vertex in graph G2 that preliminarily matches the cnt2-th vertex in graph G1; to judge whether the intersection-over-union between the two detection responses corresponding to the two preliminarily matched vertices in graphs G1 and G2 is greater than or equal to a preset threshold; if so, to establish an association between the target corresponding to the vertex in the previous frame (graph G1) and the target corresponding to the vertex in the next frame (graph G2) and then enter the thirteenth module; otherwise, to enter the thirteenth module directly;
a thirteenth module, configured to set the counter cnt1 = cnt1 + 1 and return to the third module;
and a fourteenth module, configured, for each target in the first frame of the input video sequence, to combine all targets associated with the target in all remaining frames of the entire input video sequence with the target to form the target's tracking trajectory.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
(1) The invention adopts steps (2) to (8) to convert the data-association problem between detection responses in two consecutive frames into a graph-matching problem, establishing correspondences between the vertices of two graphs represented by vertex local structures and pairwise relations, and thereby effectively exploiting the topological information between vertices in the graph structure. It can therefore solve the technical problem that traditional detection-based multi-target tracking methods have low tracking accuracy because they do not use the topological information of the vertices.
(2) Because the invention adopts steps (5) to (8) and constructs the affinity matrix from the matching-degree relations of vertices and edges, it can solve the technical problem of existing graph models such as conditional random fields, whose fixed graph structure fixes the nodes and edges, so that nodes and edges with inaccurate features cannot be corrected and the tracking accuracy is low.
(3) Because steps (5) and (6) use neural networks to compute the matching degrees of vertices and edges, effectively improving the accuracy of the matching degrees, the method can solve the problem that existing multi-target tracking methods adopting machine-learning models cannot handle tracking variations against complex backgrounds owing to the poor robustness of their models.
(4) The invention has a wide range of applications: it can be used not only for pedestrian tracking but also for tracking the trajectory of any moving target of a known type.
(5) The method is the first to solve the data association between detections of two consecutive frames during multi-target tracking with a deep-learning-based graph-matching algorithm; realizing data association with deep learning helps improve the accuracy and efficiency of target tracking in complex scenes.
Drawings
FIG. 1 is a flowchart of the multi-target tracking method based on graph matching according to the present invention;
FIG. 2 shows the frames extracted in step (4) of the method of the present invention, where FIG. 2(a) is the previous frame and FIG. 2(b) is the next frame;
FIG. 3 shows the targets in the frames extracted in step (4) of the method of the present invention, where FIG. 3(a) shows the targets in the previous frame and FIG. 3(b) shows the targets in the next frame;
FIG. 4 shows the graphs constructed in step (4) of the method of the present invention, where FIG. 4(a) is the constructed graph G1 and FIG. 4(b) is the constructed graph G2;
FIG. 5 is a schematic diagram of the structure of the triplet network used in step (5) of the method of the present invention;
FIG. 6 is a schematic diagram of the structure of the shallow neural network used in step (6) of the method of the present invention;
FIG. 7 is a schematic representation of the affinity matrix M constructed in step (7) of the method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
As shown in FIG. 1, the present invention provides a multi-target tracking method based on graph matching, which comprises the following steps:
(1) acquiring a multi-target tracking data set, which comprises an input video sequence and the detection responses of each frame in the input video sequence;
in this step, the multi-target tracking data set is the MOT16 data set (annotated mainly with moving pedestrians and vehicles), which contains 14 video sequences in total: 7 are training sequences with annotation information and the other 7 are test sequences. The 7 test sequences come from 7 different scenes and differ in shooting angle and camera motion; the test set is 5919 frames long and contains 182,326 detection responses and 830 trajectories.
(2) Setting counter cnt1 to 1;
(3) judging whether cnt1 is equal to the total frame number of the input video sequence, if yes, entering step (14), otherwise, entering step (4);
(4) acquiring the cnt1-th frame (shown in FIG. 2(a)) and the (cnt1+1)-th frame (shown in FIG. 2(b)) from the input video sequence obtained in step (1), and constructing a graph G1 from the previous frame (shown in FIG. 4(a)) and a graph G2 from the next frame (shown in FIG. 4(b));
specifically, the graphs G1 and G2 constructed in this step include vertex sets V1 and V2, respectively; the vertices in vertex set V1 are all the detection responses in the previous frame, and the vertices in vertex set V2 are all the detection responses in the next frame.
If the distance in the previous frame between the targets (shown in FIG. 3(a)) corresponding to two vertices (i.e., detection responses) of vertex set V1 is less than or equal to a threshold, a connecting edge exists between the two vertices, and all connecting edges corresponding to vertex set V1 form the edge set E1; similarly, if the distance in the next frame between the targets (shown in FIG. 3(b)) corresponding to two vertices of vertex set V2 is less than or equal to the threshold, a connecting edge exists between the two vertices, and all connecting edges corresponding to vertex set V2 form the edge set E2. In this embodiment, the threshold ranges from 80 to 120 pixels, preferably 100.
The advantage of this step is that it converts the data-association problem between detection responses in two consecutive frames into a graph-matching problem: graph matching is performed on the vertices of the two graphs, represented by vertex local structures and pairwise relations, which effectively exploits the topological information between vertices in the graph structure. The method can therefore effectively reflect the associations of real data during multi-target tracking, and its tracking results are highly accurate.
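For illustration, a minimal Python sketch of this graph construction follows, assuming each detection response is a bounding box (x, y, w, h) and taking the box centers as the targets' positions; the helper names build_graph and center are illustrative, not from the patent.

```python
import numpy as np

def center(box):
    # box = (x, y, w, h) for one detection response
    x, y, w, h = box
    return np.array([x + w / 2.0, y + h / 2.0])

def build_graph(detections, dist_threshold=100.0):
    """Vertices are the frame's detection responses; a connecting edge joins
    two vertices whose targets lie within dist_threshold pixels of each other
    (80 to 120 in the embodiment, preferably 100)."""
    vertices = list(range(len(detections)))
    edges = [(i, j) for i in vertices for j in vertices
             if i != j and np.linalg.norm(center(detections[i]) - center(detections[j])) <= dist_threshold]
    return vertices, edges

# V1, E1 = build_graph(dets_prev)   # graph G1 from the previous frame
# V2, E2 = build_graph(dets_next)   # graph G2 from the next frame
```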
(5) respectively acquiring all vertices in vertex sets V1 and V2 of graphs G1 and G2 constructed in step (4), and inputting them into a trained triplet network (shown in FIG. 5) to obtain the feature vector corresponding to each vertex;
as shown in FIG. 5, the architecture of the triplet network of the present invention is as follows:
the triplet network is composed of three paths, each path comprising a sequentially connected ResNet-50 convolutional neural network model and two fully-connected layers.
The first layer is a ResNet-50 convolutional neural network model, which comprises 1 convolutional layer, 16 building-block structures, and 1 fully-connected layer.
The second layer is a fully-connected layer, the ReLU is adopted by the activation function, and the output is a feature vector with 1024 dimensions.
The third layer is a fully connected layer, the activation function adopts ReLU, and the output is a 128-dimensional feature vector.
The loss function used when training the triplet network of the present invention is the triplet loss. On this basis, sample pairs constructed from the MOT16 and 2DMOT15 training sets are used to fine-tune the triplet network, yielding the final model. Each detection response in the two graphs is selected as the input of the network to obtain its feature vector, whose dimension is 128.
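A minimal PyTorch sketch of one path of this feature extractor follows; the class name is illustrative, and keeping ResNet-50's own 1000-way fully-connected layer as the input to the next layer (as well as weight sharing across the three paths, which is typical for triplet training) are assumptions not fixed by the text.

```python
import torch
import torch.nn as nn
from torchvision import models

class VertexEmbedding(nn.Module):
    """One path of the triplet network: ResNet-50, then two FC layers."""
    def __init__(self):
        super().__init__()
        self.backbone = models.resnet50(weights=None)  # 1 conv + 16 building blocks + 1 fc
        self.fc1 = nn.Linear(1000, 1024)               # second layer: ReLU, 1024-d output
        self.fc2 = nn.Linear(1024, 128)                # third layer: ReLU, 128-d output
        self.relu = nn.ReLU()

    def forward(self, x):                              # x: (B, 3, H, W) detection crops
        f = self.backbone(x)
        f = self.relu(self.fc1(f))
        return self.relu(self.fc2(f))                  # 128-d feature vector per vertex

# Training with a triplet loss over (anchor, positive, negative) crops, e.g.:
# emb = VertexEmbedding()
# loss = nn.TripletMarginLoss()(emb(a), emb(p), emb(n))
```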
(6) Inputting the feature vectors of the vertexes obtained in the step (5) into the trained first shallow neural network and second shallow neural network respectively to obtain the matching degree between each vertex in the vertex set V1 of the graph G1 and each vertex in the vertex set V2 of the graph G2 and the matching degree between each connecting edge in the edge set E1 of the graph G1 and each connecting edge in the edge set E2 of the graph G2 respectively;
FIG. 6 shows the structure of the first and second shallow neural networks used in this step; the two networks are identical, namely:
the first layer is a fully-connected layer, the activation function adopts ReLU, the retention probability of Dropout is set to be 0.5, and the output is a feature vector with 1024 dimensions.
The second layer is a full connection layer, the activation function adopts Softmax, and the output is a feature vector with 2 dimensions.
The loss function used when training the first and second shallow neural networks in this step is the softmax cross-entropy between the predicted and true match classes of vertex pairs (or connecting-edge pairs).
Back-propagation in this step uses the Adam optimizer, which has the advantages of momentum and an adaptive learning rate; the initial learning rate is set to 0.0001.
In this step, the networks are trained for 100,000 iterations with a batch size of 32, of which 1/4 are positive samples and 3/4 negative samples. The training samples come from six videos in the MOT16 training set: MOT16-04, MOT16-05, MOT16-09, MOT16-10, MOT16-11, and MOT16-13.
Specifically, let v ∈ {0,1}^{nm×1} be an indicator vector expressing the matching relation between vertices: if v_i ∈ V1 and v_a ∈ V2, and vertex v_i in vertex set V1 of graph G1 matches vertex v_a in vertex set V2 of graph G2 (i.e., vertices v_i and v_a come from the same target), then v_ia = 1; otherwise, v_ia = 0.
In the implementation, for the matching degree between vertices, the feature vector of a vertex in one graph extracted in step (5) is concatenated with the feature vector of a vertex in the other graph and used as the input of the shallow neural network, which directly outputs the matching degree of the two vertices. For vertex v_i ∈ V1 and vertex v_a ∈ V2, the matching degree S_ia is:
S_ia = F_N([F_i, F_a])
where F_i, F_a ∈ R^{1×d} are the feature vectors of vertices v_i and v_a, and d is the size of the feature vector; [·] denotes a stitching operation that concatenates multiple vectors; F_N denotes the first shallow neural network with a softmax layer; i ∈ [1, total number of vertices in vertex set V1] and a ∈ [1, total number of vertices in vertex set V2].
For the matching degree between edges: for an edge starting at the i-th vertex and ending at the j-th vertex, the feature vector of the edge is set to the concatenation of the feature vectors of its two endpoints i and j, i.e., F_ij = [F_i, F_j]. Analogously to the vertex case, the matching degree S_ij:ab between connecting edge (v_i, v_j) ∈ E1 and connecting edge (v_a, v_b) ∈ E2 is:
S_ij:ab = F_E([F_ij, F_ab])
where j ∈ [1, total number of vertices in vertex set V1], b ∈ [1, total number of vertices in vertex set V2], the feature vectors F_ij, F_ab ∈ R^{1×2d}, and F_E denotes the second shallow neural network with a softmax layer.
(7) Constructing an affinity matrix M according to the matching degree between each vertex in the vertex set V1 of the graph G1 and each vertex in the vertex set V2 of the graph G2 obtained in the step (6) and the matching degree between each connecting edge in the edge set E1 of the graph G1 and each connecting edge in the edge set E2 of the graph G2;
Specifically, the elements of the affinity matrix M fall into two classes: the diagonal elements correspond to the matching degrees between vertices, i.e., for i = j and a = b, M_ia:jb equals the matching degree S_ia between vertex v_i and vertex v_a;
the off-diagonal elements represent the matching degrees between connecting edges, i.e., for i ≠ j and a ≠ b, M_ia:jb equals the matching degree S_ij:ab between connecting edge (v_i, v_j) and connecting edge (v_a, v_b); for i = j and a ≠ b, or i ≠ j and a = b, M_ia:jb = 0.
For example, as shown in FIG. 7, if the matching degree between vertex v_1 in G1 and vertex v_a in G2 is 1 (i.e., the two match), then M_1a:1a = 1; if the matching degree between edge (v_1, v_2) in G1 and edge (v_a, v_b) in G2 is 1 (i.e., the two match), then M_1a:2b = 1.
In the implementation, this step fills the affinity matrix M with the matching degrees obtained in step (6).
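A minimal sketch of filling the nm×nm affinity matrix follows; flattening the vertex pair (i, a) to the single row/column index i*m + a is an assumed index layout that the text does not spell out.

```python
import numpy as np

def build_affinity(S_vertex, S_edge, E1, E2):
    """S_vertex: (n, m) array of vertex matching degrees S_ia;
    S_edge: dict mapping ((i, j), (a, b)) -> S_ij:ab for (i, j) in E1, (a, b) in E2."""
    n, m = S_vertex.shape
    M = np.zeros((n * m, n * m))
    for i in range(n):
        for a in range(m):
            M[i * m + a, i * m + a] = S_vertex[i, a]   # diagonal: vertex matching degrees
    for (i, j) in E1:                                  # off-diagonal: edge matching degrees
        for (a, b) in E2:
            M[i * m + a, j * m + b] = S_edge.get(((i, j), (a, b)), 0.0)
    return M                                           # all remaining entries stay 0
```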
(8) performing power iteration on the affinity matrix M obtained in step (7) to obtain an optimal assignment vector v*;
specifically, the optimal assignment vector v* of the affinity matrix M can be calculated by power iteration with the following formula:
v^(k+1) = M v^(k) / ||M v^(k)||
where v^(0) is initialized as the unit vector, i.e., v^(0) = 1; ||·|| denotes the l2 norm; k denotes the number of iterations, with a value range of 2 to 4, preferably 2; and the final result of the iterative computation is taken as the optimal assignment vector v*.
Traditional algorithms (such as the Hungarian algorithm) use only the association costs between vertices during solving and do not use the topological information of the vertices, which is very important in graph matching.
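The power iteration itself is a few lines; a sketch under the formula above, with k = 2 as the preferred iteration count:

```python
import numpy as np

def power_iteration(M, k=2):
    v = np.ones(M.shape[0])          # v^(0) = 1, the all-ones unit vector
    for _ in range(k):
        v = M @ v
        v = v / np.linalg.norm(v)    # l2 normalization
    return v                         # optimal assignment vector v*
```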
(9) performing bidirectional normalization on the optimal assignment vector v* obtained in step (8) to obtain an assignment matrix S;
specifically, because one target can occupy only one location at any instant, the optimal assignment vector v* must satisfy the following constraints: for any v_i ∈ V1,
Σ_{a=1..m} v_ia ≤ 1
and for any v_a ∈ V2,
Σ_{i=1..n} v_ia ≤ 1.
To satisfy these constraints, bidirectional normalization (which drives the matrix toward a doubly stochastic one) is applied to the optimal assignment vector v* obtained by power iteration. The original algorithm assumes only a square matrix; in a multi-target tracking scene, however, the numbers of detections in the two frames are often different because targets frequently appear and disappear and the detector is imperfect, so a bidirectional normalization method based on the square-matrix assumption is no longer applicable.
Specifically, this step is solved with a normalization method. First, v* is reshaped into a matrix S_nm of size n×m (where n and m denote the total numbers of vertices in vertex sets V1 and V2, respectively); then the following formulas are iterated, and the final matrix of the iteration is taken as the assignment matrix S:
S_ia^(t+1) = S_ia^(t) / Σ_{a′=1..m} S_ia′^(t)  (row normalization),  followed by  S_ia^(t+1) = S_ia^(t) / Σ_{i′=1..n} S_i′a^(t)  (column normalization)
i.e., the assignment matrix is normalized by rows and by columns alternately, with i ∈ [1, n] and a ∈ [1, m]. The superscript t+1 denotes the (t+1)-th iteration, and t denotes the number of iterations, with a value range of 3 to 7, preferably 5.
In the assignment matrix S obtained in this step, the element in row i, column a denotes the matching value between vertex v_i of graph G1 and vertex v_a of graph G2.
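A minimal sketch of this bidirectional normalization follows, with t = 5 as the preferred iteration count; the small eps guard against division by zero is an addition for numerical safety, not from the text.

```python
import numpy as np

def bidirectional_normalize(v_star, n, m, t=5, eps=1e-12):
    S = v_star.reshape(n, m)                            # S_nm: reshape v* to n x m
    for _ in range(t):
        S = S / (S.sum(axis=1, keepdims=True) + eps)    # row normalization
        S = S / (S.sum(axis=0, keepdims=True) + eps)    # column normalization
    return S                                            # assignment matrix S
```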
(10) Setting counter cnt2 to 1;
(11) judging whether the counter cnt2 is equal to the total number of the vertexes in the graph G1, if so, entering the step (14), otherwise, entering the step (12);
(12) obtaining, from the assignment matrix S obtained in step (9), the vertex in graph G2 that preliminarily matches the cnt2-th vertex in graph G1; judging whether the Intersection over Union (IoU) between the two detection responses corresponding to the two preliminarily matched vertices in graphs G1 and G2 is greater than or equal to a preset threshold; if so, establishing an association between the target corresponding to the vertex in the previous frame (graph G1) and the target corresponding to the vertex in the next frame (graph G2), i.e., the targets corresponding to the two vertices in the two frames belong to one continuous trajectory, and then entering step (13); otherwise, entering step (13) directly;
Specifically, all elements in row i are taken from the assignment matrix S and the element with the largest value is selected; its column index a gives the vertex v_a in graph G2 that preliminarily matches vertex v_i in graph G1.
In the present embodiment, the value range of the preset threshold is 0.4 to 1, and preferably 0.5.
The assignment matrix S satisfies a one-to-one mapping constraint (i.e., a vertex in G1 matches at most one vertex in G2). During association, for any vertex v_i in graph G1, the row of the matching matrix corresponding to v_i contains m numbers, which represent the matching values between v_i and the m vertices of graph G2; each vertex v_i is considered to match the vertex holding the maximum value of its row. However, every row of the matching matrix necessarily has a maximum, i.e., every detection in the previous frame would be matched to some detection in the next frame, which obviously does not fit the reality of multi-target tracking: because targets frequently appear and disappear and many detections are missed, the numbers of detections in two consecutive frames often differ, and a trajectory in the previous frame does not necessarily have a matching detection in the next frame. The invention must therefore use a further criterion, the IoU test above, to judge whether a trajectory has ended.
(13) setting the counter cnt1 = cnt1 + 1, and returning to step (3);
(14) for each target in the first frame of the input video sequence, combining all targets associated with the target in all remaining frames of the entire input video sequence with the target to form the target's tracking trajectory.
For example, if the 3rd target in the first frame is associated with the 4th target in the second frame, the 4th target in the second frame is associated with the 5th target in the third frame, the 5th target in the third frame is associated with the 6th target in the fourth frame, ..., and the 3rd target in the second-to-last frame is associated with the 1st target in the last frame, then the 3rd target in the first frame, the 4th target in the second frame, the 5th target in the third frame, the 6th target in the fourth frame, ..., the 3rd target in the second-to-last frame, and the 1st target in the last frame together form the tracking trajectory of the 3rd target in the first frame.
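A minimal sketch of this trajectory assembly follows, assuming links[t] holds the (i, a) pairs produced for the transition from frame t to frame t+1; the function name build_tracks is illustrative.

```python
def build_tracks(links, num_first_frame_targets):
    """links[t]: list of (i, a) associations between frames t and t+1."""
    tracks = []
    for start in range(num_first_frame_targets):
        track, cur = [(0, start)], start       # (frame index, detection index) pairs
        for t, frame_links in enumerate(links):
            nxt = dict(frame_links).get(cur)
            if nxt is None:
                break                          # no association: the trajectory ends here
            track.append((t + 1, nxt))
            cur = nxt
        tracks.append(track)
    return tracks
```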
In summary, the invention provides an online multi-target tracking algorithm based on graph matching, which converts the data-association problem between detection responses in two consecutive frames into a graph-matching problem. First, two deep convolutional neural networks are designed to compute the affinity between pairs of graph vertices and between pairs of graph edges, respectively; the affinity matrix of the two graphs is then filled directly with these vertex and edge affinities; finally, the affinity matrix is processed to obtain the final matching matrix (i.e., the association matrix between detections). The method therefore effectively reflects the associations of real data during multi-target tracking, and its tracking results are highly accurate.
Experimental results
The practical effect of the invention is illustrated here by results on the MOT16 test set. The tracking results of the multi-target tracking algorithm on the MOT16 data set are evaluated with the following standard metrics: Multi-Object Tracking Accuracy (MOTA), Multi-Object Tracking Precision (MOTP), Mostly Tracked targets (MT), Mostly Lost targets (ML), False Positives (FP), False Negatives (FN), Fragmentation (FM), and Identity Switches (IDS). '↑' indicates that higher is better, and '↓' indicates that lower is better. Table 1 below gives a detailed comparison of the results of the present invention and of the existing Quad-CNN, OTCD_1_16, and CDA_DDALv2 algorithms on the MOT16 test set.
[Table 1: comparison of tracking results on the MOT16 test set; the table image is not reproduced here.]
As can be seen from Table 1: (1) the method of the invention ties with OTCD_1_16 for first place on MOTA, and on metrics including ML and FN it surpasses the other three algorithms; in particular, its ML is 12.2 percent lower. MOTA is the main metric for evaluating the overall performance of an algorithm, and compared with the other three tracking algorithms the proposed tracking algorithm achieves the best value, showing that its overall performance is superior to the other three;
(2) the higher MT and lower ML show that, by considering a suitable time interval and exploiting the dependencies between targets, the proposed method can correctly recover the trajectories of occluded or drifting targets.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A multi-target tracking method based on graph matching is characterized by comprising the following steps:
(1) acquiring a multi-target tracking data set which comprises an input video sequence and a detection response of each frame in the input video sequence;
(2) setting counter cnt1 to 1;
(3) judging whether cnt1 is equal to the total frame number of the input video sequence, if yes, entering step (14), otherwise, entering step (4);
(4) acquiring a cnt1 th frame and a cnt1+1 th frame from the input video sequence obtained in the step (1), constructing a graph G1 according to a previous frame, constructing a graph G2 according to a next frame, wherein the graphs G1 and G2 respectively comprise two vertex sets V1 and V2;
(5) respectively acquiring all vertexes in vertex sets V1 and V2 in graphs G1 and G2 constructed in the step (4), and respectively inputting the vertexes into a trained triplet network to obtain a feature vector corresponding to each vertex;
(6) Inputting the feature vectors of the vertexes obtained in the step (5) into the trained first shallow neural network and second shallow neural network respectively to obtain the matching degree between each vertex in the vertex set V1 of the graph G1 and each vertex in the vertex set V2 of the graph G2 and the matching degree between each connecting edge in the edge set E1 of the graph G1 and each connecting edge in the edge set E2 of the graph G2 respectively;
(7) constructing an affinity matrix M according to the matching degree between each vertex in the vertex set V1 of the graph G1 and each vertex in the vertex set V2 of the graph G2 obtained in the step (6) and the matching degree between each connecting edge in the edge set E1 of the graph G1 and each connecting edge in the edge set E2 of the graph G2;
(8) performing power iteration on the affinity matrix M obtained in step (7) to obtain an optimal assignment vector v*;
(9) performing bidirectional normalization on the optimal assignment vector v* obtained in step (8) to obtain an assignment matrix S;
(10) setting counter cnt2 to 1;
(11) judging whether the counter cnt2 is equal to the total number of the vertexes in the graph G1, if so, entering the step (14), otherwise, entering the step (12);
(12) obtaining, from the assignment matrix S obtained in step (9), the vertex in graph G2 that preliminarily matches the cnt2-th vertex in graph G1; judging whether the intersection-over-union between the two detection responses corresponding to the two preliminarily matched vertices in graphs G1 and G2 is greater than or equal to a preset threshold; if so, establishing an association between the target corresponding to the vertex in the previous frame (graph G1) and the target corresponding to the vertex in the next frame (graph G2) and then entering step (13); otherwise, entering step (13) directly;
(13) setting the counter cnt1 = cnt1 + 1, and returning to step (3);
(14) for each target in the first frame of the input video sequence, combining all targets associated with the target in all remaining frames of the entire input video sequence with the target to form the target's tracking trajectory.
2. The multi-target tracking method according to claim 1,
the vertices in vertex set V1 are all the detection responses in the previous frame, and the vertices in vertex set V2 are all the detection responses in the next frame;
if the distance between the targets corresponding to two vertices of vertex set V1 in the previous frame is less than or equal to a threshold, a connecting edge exists between the two vertices, and all connecting edges corresponding to vertex set V1 form the edge set E1;
if the distance between the targets corresponding to two vertices of vertex set V2 in the next frame is less than or equal to the threshold, a connecting edge exists between the two vertices, and all connecting edges corresponding to vertex set V2 form the edge set E2.
3. The multi-target tracking method according to claim 1 or 2, wherein the triplet network is composed of three paths, each path comprising a sequentially connected ResNet-50 convolutional neural network model and two fully-connected layers, wherein:
the first layer is a ResNet-50 convolutional neural network model, which comprises 1 convolutional layer, 16 building-block structures, and 1 fully-connected layer;
the second layer is a fully-connected layer with ReLU as its activation function, whose output is a 1024-dimensional feature vector;
the third layer is a fully-connected layer with ReLU as its activation function, whose output is a 128-dimensional feature vector.
4. The multi-target tracking method according to claim 1 or 2, wherein the first and second shallow neural networks have the same structure, namely:
the first layer is a fully-connected layer with ReLU as its activation function and a Dropout retention probability of 0.5, whose output is a 1024-dimensional feature vector;
the second layer is a fully-connected layer with Softmax as its activation function, whose output is a 2-dimensional feature vector.
5. The multi-target tracking method according to claim 1, wherein
for vertex v_i ∈ V1 and vertex v_a ∈ V2, the matching degree S_ia is:
S_ia = F_N([F_i, F_a])
where F_i, F_a ∈ R^{1×d} are the feature vectors of vertices v_i and v_a, and d is the size of the feature vector; [·] denotes a stitching operation that concatenates multiple vectors; F_N denotes the first shallow neural network with a softmax layer; i, j ∈ [1, n] and a ∈ [1, m], where n and m denote the total numbers of vertices in vertex sets V1 and V2, respectively;
the connecting edge starting at the i-th vertex and ending at the j-th vertex has feature vector F_ij = [F_i, F_j], where F_i and F_j are the feature vectors of the two vertices i and j connected by the edge;
the matching degree S_ij:ab between connecting edge (v_i, v_j) ∈ E1 and connecting edge (v_a, v_b) ∈ E2 is:
S_ij:ab = F_E([F_ij, F_ab])
where b ∈ [1, m], the feature vectors F_ij, F_ab ∈ R^{1×2d}, and F_E denotes the second shallow neural network with a softmax layer.
6. The multi-target tracking method according to claim 5, wherein
the diagonal elements of the affinity matrix M correspond to the matching degrees between vertices, i.e., for i = j and a = b, M_ia:jb equals the matching degree S_ia between vertex v_i and vertex v_a;
the off-diagonal elements of the affinity matrix M represent the matching degrees between connecting edges, i.e., for i ≠ j and a ≠ b, M_ia:jb equals the matching degree S_ij:ab between connecting edge (v_i, v_j) and connecting edge (v_a, v_b); for i = j and a ≠ b, or i ≠ j and a = b, M_ia:jb = 0.
7. The multi-target tracking method according to claim 6, wherein step (8) calculates the optimal assignment vector v* of the affinity matrix M using the following formula:
v^(k+1) = M v^(k) / ||M v^(k)||
where v^(0) is initialized as the unit vector, i.e., v^(0) = 1; ||·|| denotes the l2 norm; k denotes the number of iterations, with a value range of 2 to 4; and the final result of the iterative computation is taken as the optimal assignment vector v*.
8. The multi-target tracking method according to claim 7, wherein step (9) iterates using the following formulas, and the final matrix of the iteration is taken as the assignment matrix S:
S_ia^(t+1) = S_ia^(t) / Σ_{a′=1..m} S_ia′^(t)  (row normalization),  followed by  S_ia^(t+1) = S_ia^(t) / Σ_{i′=1..n} S_i′a^(t)  (column normalization)
where S_nm, the n×m matrix obtained by reshaping v*, is the initial value of the iteration; the superscript t+1 denotes the (t+1)-th iteration; t denotes the number of iterations, with a value range of 3 to 7;
in the assignment matrix S, the element in row i, column a denotes the matching value between vertex v_i of graph G1 and vertex v_a of graph G2.
9. The multi-target tracking method according to claim 8, wherein in step (12), all elements in row i are taken from the assignment matrix S and the element with the largest value is selected; its column index a gives the vertex v_a in graph G2 that preliminarily matches vertex v_i in graph G1.
10. A multi-target tracking system based on graph matching, comprising:
a first module for obtaining a multi-target tracking data set comprising an input video sequence and a detection response for each frame in the input video sequence;
a second module for setting the counter cnt1 to 1;
a third module, configured to determine whether cnt1 equals to a total frame number of the input video sequence, if so, enter a fourteenth module, otherwise, enter a fourth module;
A fourth module, configured to obtain a cnt1 th frame and a cnt1+1 th frame from the input video sequence obtained by the first module, construct a graph G1 according to a previous frame, construct a graph G2 according to a next frame, where the graphs G1 and G2 respectively include two vertex sets V1 and V2;
a fifth module, configured to obtain all vertices in vertex sets V1 and V2 in graphs G1 and G2 constructed by the fourth module, respectively, and input the vertices into a trained triplet network, so as to obtain feature vectors corresponding to the vertices;
a sixth module, configured to input the feature vectors of the vertices obtained by the fifth module into the trained first and second shallow neural networks, respectively, to obtain the matching degree between each vertex in vertex set V1 of graph G1 and each vertex in vertex set V2 of graph G2, and the matching degree between each connecting edge in edge set E1 of graph G1 and each connecting edge in edge set E2 of graph G2;
a seventh module, configured to construct an affinity matrix M according to the matching degrees between each vertex in vertex set V1 of graph G1 and each vertex in vertex set V2 of graph G2, and between each connecting edge in edge set E1 of graph G1 and each connecting edge in edge set E2 of graph G2, obtained by the sixth module;
an eighth module, configured to perform power iteration on the affinity matrix M obtained by the seventh module to obtain an optimal assignment vector v*;
a ninth module, configured to perform bidirectional normalization on the optimal assignment vector v* obtained by the eighth module to obtain an assignment matrix S;
a tenth module for setting the counter cnt2 to 1;
an eleventh module for determining whether the counter cnt2 is equal to the total number of vertices in graph G1, if so, entering a fourteenth module, otherwise, entering a twelfth module;
a twelfth module, configured to obtain, from the assignment matrix S obtained by the ninth module, the vertex in graph G2 that preliminarily matches the cnt2-th vertex in graph G1; to judge whether the intersection-over-union between the two detection responses corresponding to the two preliminarily matched vertices in graphs G1 and G2 is greater than or equal to a preset threshold; if so, to establish an association between the target corresponding to the vertex in the previous frame (graph G1) and the target corresponding to the vertex in the next frame (graph G2) and then enter the thirteenth module; otherwise, to enter the thirteenth module directly;
a thirteenth module, configured to set the counter cnt1 = cnt1 + 1 and return to the third module;
and a fourteenth module, configured, for each target in the first frame of the input video sequence, to combine all targets associated with the target in all remaining frames of the entire input video sequence with the target to form the target's tracking trajectory.
CN202010689629.7A 2020-07-17 2020-07-17 Multi-target tracking method and system based on graph matching Expired - Fee Related CN111862156B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010689629.7A CN111862156B (en) 2020-07-17 2020-07-17 Multi-target tracking method and system based on graph matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010689629.7A CN111862156B (en) 2020-07-17 2020-07-17 Multi-target tracking method and system based on graph matching

Publications (2)

Publication Number Publication Date
CN111862156A true CN111862156A (en) 2020-10-30
CN111862156B CN111862156B (en) 2021-02-26

Family

ID=72984639

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010689629.7A Expired - Fee Related CN111862156B (en) 2020-07-17 2020-07-17 Multi-target tracking method and system based on graph matching

Country Status (1)

Country Link
CN (1) CN111862156B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507054A (en) * 2020-12-12 2021-03-16 武汉中海庭数据技术有限公司 Method and system for automatically determining road outside line incidence relation
CN113379788A (en) * 2021-06-29 2021-09-10 西安理工大学 Target tracking stability method based on three-element network

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3152279B2 (en) * 1995-07-13 2001-04-03 日本電気株式会社 Sub-optimal assignment determination method
CN104200488A (en) * 2014-08-04 2014-12-10 合肥工业大学 Multi-target tracking method based on graph representation and matching
CN107292911A (en) * 2017-05-23 2017-10-24 南京邮电大学 A kind of multi-object tracking method merged based on multi-model with data correlation
US20180130215A1 (en) * 2016-11-07 2018-05-10 Nec Laboratories America, Inc. Deep network flow for multi-object tracking
CN108447080A (en) * 2018-03-02 2018-08-24 哈尔滨工业大学深圳研究生院 Method for tracking target, system and storage medium based on individual-layer data association and convolutional neural networks
CN110047096A (en) * 2019-04-28 2019-07-23 中南民族大学 A kind of multi-object tracking method and system based on depth conditions random field models
US20190384982A1 (en) * 2018-05-23 2019-12-19 Tusimple, Inc. Method and apparatus for Sampling Training Data and Computer Server
CN111161315A (en) * 2019-12-18 2020-05-15 北京大学 Multi-target tracking method and system based on graph neural network

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3152279B2 (en) * 1995-07-13 2001-04-03 日本電気株式会社 Sub-optimal assignment determination method
CN104200488A (en) * 2014-08-04 2014-12-10 合肥工业大学 Multi-target tracking method based on graph representation and matching
US20180130215A1 (en) * 2016-11-07 2018-05-10 Nec Laboratories America, Inc. Deep network flow for multi-object tracking
CN107292911A (en) * 2017-05-23 2017-10-24 南京邮电大学 A kind of multi-object tracking method merged based on multi-model with data correlation
CN108447080A (en) * 2018-03-02 2018-08-24 哈尔滨工业大学深圳研究生院 Method for tracking target, system and storage medium based on individual-layer data association and convolutional neural networks
US20190384982A1 (en) * 2018-05-23 2019-12-19 Tusimple, Inc. Method and apparatus for Sampling Training Data and Computer Server
CN110047096A (en) * 2019-04-28 2019-07-23 中南民族大学 A kind of multi-object tracking method and system based on depth conditions random field models
CN111161315A (en) * 2019-12-18 2020-05-15 北京大学 Multi-target tracking method and system based on graph neural network

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
ANDREI ZANFIR等: "Deep Learning of Graph Matching", 《2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
JUN XIANG等: "Online Multi-Object Tracking Based on Feature Representation and Bayesian Filtering Within a Deep Learning Architecture", 《IEEE ACCESS》 *
SAMUEL SCHULTER等: "Deep Network Flow for Multi-object Tracking", 《2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 *
TIMOTHEE COUR等: "Balanced Graph Matching", 《ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 19: PROCEEDINGS OF THE 2006 CONFERENCE》 *
HOU, Jianhua et al.: "Design of a Deep-Learning-Based Association Model for Multi-Object Tracking" (基于深度学习的多目标跟踪关联模型设计), Acta Automatica Sinica (《自动化学报》) *
WU, Hanbao et al.: "Target Association Algorithm Based on Optimal Complete Bipartite-Graph Matching" (基于二分图最优完备匹配的目标关联算法), Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507054A (en) * 2020-12-12 2021-03-16 武汉中海庭数据技术有限公司 Method and system for automatically determining road outside line incidence relation
CN113379788A (en) * 2021-06-29 2021-09-10 西安理工大学 Target tracking stability method based on three-element network
CN113379788B (en) * 2021-06-29 2024-03-29 西安理工大学 Target tracking stability method based on triplet network

Also Published As

Publication number Publication date
CN111862156B (en) 2021-02-26

Similar Documents

Publication Publication Date Title
CN109961034B (en) Video target detection method based on convolution gating cyclic neural unit
CN114972418B (en) Maneuvering multi-target tracking method based on combination of kernel adaptive filtering and YOLOX detection
Xu et al. Deep learning for multiple object tracking: a survey
Xu et al. Multimodal cross-layer bilinear pooling for RGBT tracking
CN107609460B (en) Human body behavior recognition method integrating space-time dual network flow and attention mechanism
CN113628249B (en) RGBT target tracking method based on cross-modal attention mechanism and twin structure
CN110555387B (en) Behavior identification method based on space-time volume of local joint point track in skeleton sequence
CN108520203B (en) Multi-target feature extraction method based on fusion of self-adaptive multi-peripheral frame and cross pooling feature
CN111339908B (en) Group behavior identification method based on multi-mode information fusion and decision optimization
CN111862156B (en) Multi-target tracking method and system based on graph matching
CN111582091B (en) Pedestrian recognition method based on multi-branch convolutional neural network
CN111898432A (en) Pedestrian detection system and method based on improved YOLOv3 algorithm
Chang et al. Fast Random‐Forest‐Based Human Pose Estimation Using a Multi‐scale and Cascade Approach
CN116681728A (en) Multi-target tracking method and system based on Transformer and graph embedding
CN115761534A (en) Method for detecting and tracking small target of infrared unmanned aerial vehicle under air background
CN115761888A (en) Tower crane operator abnormal behavior detection method based on NL-C3D model
CN111291785A (en) Target detection method, device, equipment and storage medium
US20240177525A1 (en) Multi-view human action recognition method based on hypergraph learning
Chen et al. Pyramid attention object detection network with multi-scale feature fusion
CN112329662A (en) Multi-view saliency estimation method based on unsupervised learning
Han et al. Light-field depth estimation using RNN and CRF
CN117173595A (en) Unmanned aerial vehicle aerial image target detection method based on improved YOLOv7
Wang et al. Sture: Spatial–temporal mutual representation learning for robust data association in online multi-object tracking
CN116311504A (en) Small sample behavior recognition method, system and equipment
Zhao et al. Paralleled attention modules and adaptive focal loss for Siamese visual tracking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210226