CN110298248A - Multi-object tracking method and system based on semantic segmentation - Google Patents
Multi-object tracking method and system based on semantic segmentation
- Publication number
- CN110298248A CN110298248A CN201910444189.6A CN201910444189A CN110298248A CN 110298248 A CN110298248 A CN 110298248A CN 201910444189 A CN201910444189 A CN 201910444189A CN 110298248 A CN110298248 A CN 110298248A
- Authority
- CN
- China
- Prior art keywords
- goal
- sub
- bounding box
- target
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The present invention provides a multi-object tracking method based on semantic segmentation. The method reads a video or image sequence and localizes each target in every frame with a bounding box; applies semantic segmentation to each bounding box to divide it, at pixel level, into background and the different parts of the target, classifying the parts and obtaining the location of each sub-target class; for the location of each sub-target class, rejects the background class and inputs the two adjacent frames into a feature-matching network; and computes the matching degree of sub-target features between frames to perform data association of sub-targets across frames, determine each sub-target's position in the current frame, and output the target's motion trajectory. By using semantic segmentation to distinguish the tracked target from the background at pixel level, the invention feeds background-free target features into the network, effectively reducing the influence of multi-target interaction and improving the precision and performance of multi-object tracking.
Description
Technical field
The present invention relates to the technical field of computer vision, and in particular to a multi-object tracking method and system based on semantic segmentation.
Background technique
In recent years, deep-learning methods have made breakthroughs in computer vision tasks, and computer vision technology has developed rapidly, finding wide application in industries and fields such as intelligent surveillance, human-computer interaction, virtual and augmented reality, and medical image analysis.
Object tracking is a classical computer vision task. The regions of interest obtained by tracking are the basis for further high-level visual analysis and for applications such as intelligent surveillance, human-computer interaction, robot navigation, autonomous driving, virtual and augmented reality, and medical image analysis; the accuracy of tracking therefore directly affects the performance of the overall computer vision system.
The motion-trajectory association problem in multi-object tracking is extremely complex. Because of interaction between background and targets, the task is frequently plagued by false detections and by difficulties such as targets that are too small, low frame rates, and changes in viewing angle and scale. Moreover, because targets appear similar to one another, frequent inter-target interaction and occlusion, together with background interference, severely degrade tracking precision and performance.
Therefore, to address the background-interference problem in the prior art, a multi-object tracking method and system based on semantic segmentation are needed.
Summary of the invention
One aspect of the present invention provides a multi-object tracking method based on semantic segmentation, the method comprising:
reading a video or image sequence and localizing the target in each frame with a bounding box;
performing pixel-level segmentation on the bounding box by means of semantic segmentation, dividing the bounding box into background and the different parts of the target, classifying the parts, and obtaining the location of each sub-target class;
for the location of each sub-target class, rejecting the background class and inputting the two adjacent frames into a feature-matching network;
computing the matching degree of sub-target features between frames, performing data association of sub-targets across frames, determining each sub-target's position in the current frame, and outputting the target's motion trajectory.
Preferably, each frame of the video or image sequence is passed through a detector, and the detection results are output as the bounding boxes that localize the targets.
Preferably, each frame of the video or image sequence is detected as follows:
the video or image sequence, in tensor form, is processed by a convolutional neural network model to obtain a convolutional feature map;
based on region-of-interest pooling, first candidate regions are generated on the convolutional feature map, and bounding-box regression is applied to the first candidate regions to obtain second candidate regions;
based on region-of-interest pooling, the feature maps of the second candidate regions are extracted and third candidate regions are generated;
a fully connected layer classifies the third candidate regions and applies bounding-box regression again, yielding the target positions defined by the bounding boxes.
Preferably, pixel-level segmentation of the bounding box is performed with an Inception V3 model.
Preferably, bipartite graph matching is used to compute the matching degree of sub-target features between frames, perform data association of sub-targets across frames, and determine each sub-target's position in the current frame.
Another aspect of the present invention provides a multi-object tracking system based on semantic segmentation, the system comprising:
a target detection module for reading a video or image sequence and localizing the target in each frame with a bounding box;
a semantic segmentation module for performing pixel-level segmentation on the bounding box by means of semantic segmentation, dividing the bounding box into background and the different parts of the target, classifying the parts, and obtaining the location of each sub-target class;
a feature-matching network into which, for the location of each sub-target class and after the background class is rejected, the two adjacent frames are input;
wherein the matching degree of sub-target features between frames is computed, data association of sub-targets across frames is performed, each sub-target's position in the current frame is determined, and the target's motion trajectory is output.
Preferably, each frame of the video or image sequence is passed through a detector, and the detection results are output as the bounding boxes that localize the targets.
Preferably, each frame of the video or image sequence is detected as follows:
the video or image sequence, in tensor form, is processed by a convolutional neural network model to obtain a convolutional feature map;
based on region-of-interest pooling, first candidate regions are generated on the convolutional feature map, and bounding-box regression is applied to the first candidate regions to obtain second candidate regions;
based on region-of-interest pooling, the feature maps of the second candidate regions are extracted and third candidate regions are generated;
a fully connected layer classifies the third candidate regions and applies bounding-box regression again, yielding the target positions defined by the bounding boxes.
Preferably, pixel-level segmentation of the bounding box is performed with an Inception V3 model.
Preferably, bipartite graph matching is used to compute the matching degree of sub-target features between frames, perform data association of sub-targets across frames, and determine each sub-target's position in the current frame.
The multi-object tracking method and system based on semantic segmentation provided by the invention use semantic segmentation to distinguish the tracked target from the background at pixel level, so that background-free target features are fed into the network, effectively reducing the influence of multi-target interaction and improving the precision and performance of multi-object tracking.
It should be appreciated that both the foregoing general description and the following detailed description are exemplary and explanatory and should not be taken as limiting the claimed invention.
Detailed description of the invention
The further objects, functions, and advantages of the present invention will be clarified by the following description of its embodiments, with reference to the accompanying drawings, in which:
Fig. 1 schematically shows a flow diagram of a multi-object tracking method based on semantic segmentation according to the invention.
Fig. 2 shows a flow diagram of localizing target positions according to the invention.
Fig. 3 shows a schematic diagram of pixel-level segmentation of a bounding box according to the invention.
Fig. 4 shows a schematic diagram of the Inception V3 model in one embodiment of the invention.
Fig. 5 shows a schematic diagram of data association of sub-targets across frames using bipartite graph matching in one embodiment of the invention.
Fig. 6 shows a structural block diagram of a multi-object tracking system based on semantic segmentation according to the invention.
Specific embodiment
The objects and functions of the present invention, and the methods for achieving them, are explained by reference to exemplary embodiments. The invention, however, is not limited to the embodiments disclosed below and may be realized in different forms; the essence of the specification is merely to help those skilled in the relevant arts comprehensively understand its details.
Embodiments of the invention are described below with reference to the accompanying drawings, in which identical reference numerals denote identical or similar components or steps.
The contents of the present invention are described in detail below through specific embodiments. Fig. 1 shows a flow diagram of a multi-object tracking method based on semantic segmentation; according to an embodiment of the invention, the method comprises the following steps:
Step S101: localize target positions.
A video or image sequence is read, and the target in each frame is localized with a bounding box.
According to an embodiment of the invention, each frame of the video or image sequence is passed through a detector, and the detection results are output as the bounding boxes that localize the targets.
In some embodiments, as shown in the flow diagram of Fig. 2, each frame of the video or image sequence is detected as follows:
S201: the video or image sequence, in tensor form, is processed by a convolutional neural network model to obtain a convolutional feature map.
S202: based on region-of-interest pooling, first candidate regions are generated on the convolutional feature map, and bounding-box regression is applied to the first candidate regions to obtain second candidate regions.
S203: based on region-of-interest pooling, the feature maps of the second candidate regions are extracted and third candidate regions are generated.
S204: a fully connected layer classifies the third candidate regions and applies bounding-box regression again, yielding the target positions defined by the bounding boxes.
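Steps S201-S204 describe a two-stage detector built around region-of-interest pooling. As an illustration of the pooling operation these steps rely on, the following is a minimal NumPy sketch; the fixed 2x2 output grid and the max-pooling rule are illustrative assumptions, not specifics taken from the patent:

```python
import numpy as np

def roi_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool the region roi = (x0, y0, x1, y1) of an (H, W, C)
    feature map into a fixed output_size grid of cells."""
    x0, y0, x1, y1 = roi
    region = feature_map[y0:y1, x0:x1, :]
    out_h, out_w = output_size
    # Bin edges that partition the region into out_h x out_w cells.
    h_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    w_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.zeros((out_h, out_w, feature_map.shape[2]))
    for i in range(out_h):
        for j in range(out_w):
            cell = region[h_edges[i]:h_edges[i + 1],
                          w_edges[j]:w_edges[j + 1], :]
            pooled[i, j] = cell.max(axis=(0, 1))  # max over each cell
    return pooled
```

Because the output grid is fixed, candidate regions of any size yield a feature tensor of the same shape, which is what allows a fully connected layer to classify them in S204.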
Step S102: perform pixel-level segmentation on the bounding boxes.
Pixel-level segmentation is applied to each bounding box by means of semantic segmentation, dividing the bounding box into background and the different parts of the target, classifying the parts, and obtaining the location of each sub-target class.
Taking pedestrians as an example, a detected pedestrian bounding box contains, besides the target, a large amount of interfering background. To avoid background interference in the tracking result, pixel-level segmentation is applied to the pedestrian (target) bounding box, dividing the detected pedestrian bounding box into background and the different parts of the human body. Fig. 3 shows a schematic diagram of pixel-level segmentation of a bounding box according to the invention: after segmentation, the bounding box is divided into five classes, namely background, head, upper body, lower body, and shoes; the head, upper body, lower body, and shoes are the sub-targets of the tracked target.
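Given a per-pixel label map for the five classes, rejecting the background and locating each sub-target can be sketched as follows; the integer class ids are hypothetical, since the patent does not specify an encoding:

```python
import numpy as np

# Hypothetical label ids for the five classes the segmentation produces.
BACKGROUND, HEAD, UPPER_BODY, LOWER_BODY, SHOES = 0, 1, 2, 3, 4

def reject_background(box_pixels, label_map):
    """Zero out pixels labelled as background, keeping only the
    sub-target (head / upper body / lower body / shoes) pixels."""
    mask = (label_map != BACKGROUND)
    return box_pixels * mask[..., None]

def sub_target_boxes(label_map):
    """Return the tight bounding box (x0, y0, x1, y1) of each
    non-background class present in the label map."""
    boxes = {}
    for cls in (HEAD, UPPER_BODY, LOWER_BODY, SHOES):
        ys, xs = np.nonzero(label_map == cls)
        if len(ys):
            boxes[cls] = (xs.min(), ys.min(), xs.max() + 1, ys.max() + 1)
    return boxes
```

The masked pixels and per-class boxes correspond to "the location of each sub-target class" that the method passes on to the feature-matching network.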
In the present embodiment, pixel-level segmentation of the bounding box is performed with an Inception V3 model.
Fig. 4 shows a schematic diagram of the Inception V3 model in one embodiment of the invention; semantic segmentation is performed based on Inception V3 modules. Inception V3 is a network with an excellent local topology: it applies several convolution or pooling operations to the input image in parallel and concatenates all the outputs into a very deep feature map. Because 1x1, 3x3, and 5x5 convolutions and the pooling operation extract different information from the input image, processing them in parallel and combining their results yields a better image representation. In the present embodiment, a 3x3 convolution can further be split into a 1x3 and a 3x1 convolution, which saves parameters.
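The parameter saving from splitting a 3x3 convolution into a 1x3 and a 3x1 convolution is easy to verify by counting convolution weights; the channel width of 64 below is an arbitrary illustrative choice:

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a k_h x k_w convolution (bias terms ignored)."""
    return k_h * k_w * c_in * c_out

c = 64
full = conv_params(3, 3, c, c)                                 # one 3x3 conv
factored = conv_params(1, 3, c, c) + conv_params(3, 1, c, c)   # 1x3 then 3x1
print(full, factored)  # 36864 24576: the factored pair needs 2/3 of the weights
```

The ratio 6/9 = 2/3 holds for any channel width, which is why the factorisation saves parameters regardless of layer size.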
Step S103: perform data association of sub-targets across frames, determine each sub-target's position in the current frame, and output the target's motion trajectory.
For the location of each sub-target class, the background class is rejected and the two adjacent frames are input into the feature-matching network. Semantic segmentation yields the sub-target classes; once the background class information is removed, the sub-target features are cleaner, and feeding these images into the feature-matching network effectively enhances the robustness of the algorithm.
The matching degree of sub-target features between frames is computed, data association of sub-targets across frames is performed, each sub-target's position in the current frame is determined, and the target's motion trajectory is output.
For example, in the embodiment above, after the background class is rejected, the two adjacent frames of the tracked target are input into the feature-matching network. The matching degrees of the head, upper-body, lower-body, and shoes sub-targets between frames are computed, and the sub-targets are associated across frames: the head sub-target of the first frame is associated with the head sub-target of the second frame; ...; the shoes sub-target of the first frame is associated with the shoes sub-target of the second frame.
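The class-by-class association just described can be sketched as follows. The cosine-similarity score and the nearest-feature choice are illustrative assumptions, since the patent specifies only that a matching degree is computed per sub-target class:

```python
import numpy as np

PART_CLASSES = ("head", "upper_body", "lower_body", "shoes")  # sub-target classes

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def associate_parts(frame1_feats, frame2_feats):
    """Associate each sub-target of frame 1 with the same-class sub-target
    of frame 2 that has the highest feature similarity.
    frameN_feats: {class_name: [feature vectors, one per target]}."""
    links = {}
    for cls in PART_CLASSES:
        for i, f1 in enumerate(frame1_feats.get(cls, [])):
            scores = [cosine(f1, f2) for f2 in frame2_feats.get(cls, [])]
            if scores:
                links[(cls, i)] = int(np.argmax(scores))
    return links
```

Note that heads are only ever compared with heads, shoes with shoes, and so on, mirroring the per-class association in the embodiment.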
According to an embodiment of the invention, bipartite graph matching is used to compute the matching degree of sub-target features between frames, perform data association of sub-targets across frames, and determine each sub-target's position in the current frame. Fig. 5 shows a schematic diagram of data association of sub-targets across frames using bipartite graph matching in one embodiment of the invention.
In some embodiments, bipartite graph matching is carried out as follows:
Input the adjacent frames $I_1$ and $I_2$ and construct graph models $G_1(V_1, E_1)$ and $G_2(V_2, E_2)$, where $|V_1| = n$ and $|V_2| = m$ are the numbers of sub-targets detected in the two frames and $E_1$, $E_2$ are the edge sets of the two graph models.
Define an indicator vector $v \in \{0,1\}^{nm \times 1}$ such that $v_{ia} = 1$ when node $i \in V_1$ (sub-target $i$) matches node $a \in V_2$ (sub-target $a$).
Build a symmetric positive matrix $M \in \mathbb{R}^{nm \times nm}$ whose entries $M_{ij,ab}$ measure, for each pair of sub-targets, the compatibility of the edge sets of the two graph models, i.e. of $(i, j) \in E_1$ with $(a, b) \in E_2$; here $\mathbb{R}^{nm \times nm}$ indicates that every element of the matrix is real, and $M_{ij,ab}$ denotes the corresponding entry of $M$.
For pairs that do not form edges, the corresponding entries of the matrix are set to 0. The optimal assignment $v^*$ can then be expressed as
$v^* = \arg\max_v v^T M v$, subject to $Cv = 1$, $v \in \{0,1\}^{nm \times 1}$ (1)
where $\arg\max$ denotes the maximizing argument.
The matrix $C$ constrains the matching to be one-to-one, that is,
$\sum_a v_{ia} = 1$ for every $i$, and $\sum_i v_{ia} = 1$ for every $a$. (2)
Relaxing the binary constraint, (1) is converted into (3) and solved:
$v^* = \arg\max_v v^T M v$, subject to $\|v\|_2 = 1$ (3)
The optimal $v^*$ can then be obtained from the leading eigenvector of the matrix $M$; $v^*_{ia}$ is the confidence that node $i$ (sub-target $i$) matches node $a$ (sub-target $a$). From the computed matching confidences the final matching result is obtained, data association of the sub-targets across frames is realized, and the current position of each sub-target follows.
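The relaxed problem (3) can be solved in a few lines of NumPy. The sketch below approximates the leading eigenvector of the affinity matrix by power iteration and then discretises it greedily into a one-to-one matching; the greedy discretisation step is an illustrative assumption, since the patent states only that the final matching is obtained from the confidences:

```python
import numpy as np

def spectral_match(M, n, m, iters=50):
    """Approximate the leading eigenvector of the nm x nm affinity matrix
    M by power iteration, then greedily discretise it into a one-to-one
    assignment. Candidate (i, a) is stored at flat index i * m + a."""
    v = np.ones(n * m) / np.sqrt(n * m)
    for _ in range(iters):
        v = M @ v
        v /= np.linalg.norm(v)          # keep ||v||_2 = 1
    S = v.reshape(n, m)                 # S[i, a]: confidence i matches a
    match, used = {}, set()
    # Accept pairs in decreasing confidence, skipping already-used nodes.
    for i, a in sorted(np.ndindex(n, m), key=lambda ia: -S[ia]):
        if i not in match and a not in used:
            match[i] = a
            used.add(a)
    return match, S
```

For two frames with two sub-targets each, an affinity matrix that rewards the cross assignment produces the matching {0: 1, 1: 0}, i.e. sub-target 0 of the first frame is associated with sub-target 1 of the second.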
In other embodiments, bipartite graph matching can also be carried out as follows:
A. Deep feature extraction (Deep Feature Extractor):
For the adjacent frames $I_1$ and $I_2$, target features $U_1$, $U_2$, $F_1$, $F_2$ are extracted from different layers of a feature-extraction network.
B. Affinity matrix calculation (Affinity Matrix Calculation):
From the known connectivity matrices $A_1$, $A_2$ of the graphs, the matrices $G_1$, $G_2$, $H_1$, $H_2$ are obtained by decomposition, with $A_1 = G_1 H_1^T$ and $A_2 = G_2 H_2^T$.
The features of the corresponding edges are defined as $X$, $Y$, obtained by concatenation through these matrices.
$M_p$ denotes the affinity between the nodes of the two graphs, i.e. the similarity between two points, and is obtained from the inner product of the point features:
$M_p = U_1 U_2^T$
$M_e$ denotes the affinity of two edges appearing simultaneously in the two graphs and is therefore obtained from the inner product of the edge features, where $\Lambda$ is a parameter to be learned:
$M_e = X \Lambda Y^T$
The matrix $M$ can then be assembled as
$M = \mathrm{diag}(\mathrm{vec}(M_p)) + (G_2 \otimes G_1)\,\mathrm{diag}(\mathrm{vec}(M_e))\,(H_2 \otimes H_1)^T$
where $\mathrm{vec}(\cdot)$ stacks a matrix into a vector column by column and $\mathrm{diag}(\cdot)$ turns a vector into a diagonal matrix.
C. Power iteration (Power Iteration):
The leading eigenvector of the matrix can be approximated by power iteration, with the iteration
$v_{k+1} = M v_k / \|M v_k\|_2$
After $v^*$ is obtained, its rows are $\ell_1$-normalized to give the initial matrix $S_0 = (v^*)_{n \times m}$.
D. Loss function (Loss Function):
The gap between the predicted position and the true position is used as the loss, defined as
$L = \sum_i \| d_i - d_i^{gt} \|$
where the predicted displacement $d_i$ is a weighted average of the candidate offsets, with the weights normalized by a softmax, and $d_i^{gt}$ denotes the offset from start point to end point in the ground-truth match.
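The assembled affinity matrix of step B can be sketched as follows. Since the patent text omits the assembled formula itself, the factorisation below follows Zanfir and Sminchisescu's "Deep Learning of Graph Matching" (CVPR 2018), which the surrounding notation matches, and should be read as an assumption rather than the patent's own statement:

```python
import numpy as np

def build_affinity(Mp, Me, G1, G2, H1, H2):
    """Factorised graph-matching affinity
        M = diag(vec(Mp)) + (G2 kron G1) diag(vec(Me)) (H2 kron H1)^T
    with vec(.) taken column-wise. G, H are node-edge incidence matrices
    of the two graphs, so that A = G @ H.T recovers each adjacency."""
    node_part = np.diag(Mp.flatten(order="F"))      # vec() is column-major
    edge_part = (np.kron(G2, G1)
                 @ np.diag(Me.flatten(order="F"))
                 @ np.kron(H2, H1).T)
    return node_part + edge_part

# Two tiny graphs, each with two nodes and one directed edge 0 -> 1.
G1 = np.array([[1.0], [0.0]])   # edge starts at node 0
H1 = np.array([[0.0], [1.0]])   # edge ends at node 1
M = build_affinity(0.5 * np.eye(2), np.array([[1.0]]),
                   G1, G1.copy(), H1, H1.copy())
```

In this toy example the single off-diagonal entry of M rewards exactly the assignment that matches node 0 with node 0 and node 1 with node 1, the only assignment under which the two edges coincide.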
The present invention also provides a multi-object tracking system based on semantic segmentation. Fig. 6 shows a structural block diagram of such a system; according to an embodiment of the invention, a multi-object tracking system based on semantic segmentation comprises:
Target detection module 100: reads a video or image sequence and localizes the target in each frame with a bounding box.
In some embodiments, each frame of the video or image sequence is passed through a detector, and the detection results are output as the bounding boxes that localize the targets. Each frame of the video or image sequence is detected as follows:
the video or image sequence, in tensor form, is processed by a convolutional neural network model to obtain a convolutional feature map;
based on region-of-interest pooling, first candidate regions are generated on the convolutional feature map, and bounding-box regression is applied to the first candidate regions to obtain second candidate regions;
based on region-of-interest pooling, the feature maps of the second candidate regions are extracted and third candidate regions are generated;
a fully connected layer classifies the third candidate regions and applies bounding-box regression again, yielding the target positions defined by the bounding boxes.
Semantic segmentation module 200: performs pixel-level segmentation on each bounding box by means of semantic segmentation, dividing the bounding box into background and the different parts of the target, classifying the parts, and obtaining the location of each sub-target class.
According to an embodiment of the invention, pixel-level segmentation of the bounding box is performed with an Inception V3 model.
Feature-matching network 300: for the location of each sub-target class, the background class is rejected and the two adjacent frames are input into the feature-matching network.
The matching degree of sub-target features between frames is computed, data association of sub-targets across frames is performed, each sub-target's position in the current frame is determined, and finally the sub-target's motion trajectory is output.
According to an embodiment of the invention, bipartite graph matching is used to compute the matching degree of sub-target features between frames, perform data association of sub-targets across frames, and determine each sub-target's position in the current frame.
The multi-object tracking method and system based on semantic segmentation provided by the invention use semantic segmentation to distinguish the tracked target from the background at pixel level, so that background-free target features are fed into the network, effectively reducing the influence of multi-target interaction and improving the precision and performance of multi-object tracking.
Other embodiments of the invention will be readily apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed here. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention being defined by the claims.
Claims (10)
1. A multi-object tracking method based on semantic segmentation, characterized in that the method comprises:
reading a video or image sequence and localizing the target in each frame with a bounding box;
performing pixel-level segmentation on the bounding box by means of semantic segmentation, dividing the bounding box into background and the different parts of the target, classifying the parts, and obtaining the location of each sub-target class;
for the location of each sub-target class, rejecting the background class and inputting the two adjacent frames into a feature-matching network;
computing the matching degree of sub-target features between frames, performing data association of sub-targets across frames, determining each sub-target's position in the current frame, and outputting the target's motion trajectory.
2. The method according to claim 1, characterized in that each frame of the video or image sequence is passed through a detector, and the detection results are output as the bounding boxes that localize the targets.
3. The method according to claim 2, characterized in that each frame of the video or image sequence is detected as follows:
the video or image sequence, in tensor form, is processed by a convolutional neural network model to obtain a convolutional feature map;
based on region-of-interest pooling, first candidate regions are generated on the convolutional feature map, and bounding-box regression is applied to the first candidate regions to obtain second candidate regions;
based on region-of-interest pooling, the feature maps of the second candidate regions are extracted and third candidate regions are generated;
a fully connected layer classifies the third candidate regions and applies bounding-box regression again, yielding the target positions defined by the bounding boxes.
4. The method according to claim 1, characterized in that pixel-level segmentation of the bounding box is performed with an Inception V3 model.
5. The method according to claim 1, characterized in that bipartite graph matching is used to compute the matching degree of sub-target features between frames, perform data association of sub-targets across frames, and determine each sub-target's position in the current frame.
6. A multi-object tracking system based on semantic segmentation, characterized in that the system comprises:
a target detection module for reading a video or image sequence and localizing the target in each frame with a bounding box;
a semantic segmentation module for performing pixel-level segmentation on the bounding box by means of semantic segmentation, dividing the bounding box into background and the different parts of the target, classifying the parts, and obtaining the location of each sub-target class;
a feature-matching network into which, for the location of each sub-target class and after the background class is rejected, the two adjacent frames are input;
wherein the matching degree of sub-target features between frames is computed, data association of sub-targets across frames is performed, each sub-target's position in the current frame is determined, and the target's motion trajectory is output.
7. The system according to claim 6, characterized in that each frame of the video or image sequence is passed through a detector, and the detection results are output as the bounding boxes that localize the targets.
8. The system according to claim 6, characterized in that each frame of the video or image sequence is detected as follows:
the video or image sequence, in tensor form, is processed by a convolutional neural network model to obtain a convolutional feature map;
based on region-of-interest pooling, first candidate regions are generated on the convolutional feature map, and bounding-box regression is applied to the first candidate regions to obtain second candidate regions;
based on region-of-interest pooling, the feature maps of the second candidate regions are extracted and third candidate regions are generated;
a fully connected layer classifies the third candidate regions and applies bounding-box regression again, yielding the target positions defined by the bounding boxes.
9. The system according to claim 6, characterized in that pixel-level segmentation of the bounding box is performed with an Inception V3 model.
10. The system according to claim 6, characterized in that bipartite graph matching is used to compute the matching degree of sub-target features between frames, perform data association of sub-targets across frames, and determine each sub-target's position in the current frame.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910444189.6A CN110298248A (en) | 2019-05-27 | 2019-05-27 | A kind of multi-object tracking method and system based on semantic segmentation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110298248A (en) | 2019-10-01 |
Family
ID=68027226
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910444189.6A Pending CN110298248A (en) | 2019-05-27 | 2019-05-27 | A kind of multi-object tracking method and system based on semantic segmentation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110298248A (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101464946A (en) * | 2009-01-08 | 2009-06-24 | 上海交通大学 | Detection method based on head identification and tracking characteristics |
CN105745687A (en) * | 2012-01-06 | 2016-07-06 | 派尔高公司 | Context aware moving object detection |
CN106688011A (en) * | 2014-09-10 | 2017-05-17 | 北京市商汤科技开发有限公司 | Method and system for multi-class object detection |
CN106845373A (en) * | 2017-01-04 | 2017-06-13 | 天津大学 | Towards pedestrian's attribute forecast method of monitor video |
CN107341446A (en) * | 2017-06-07 | 2017-11-10 | 武汉大千信息技术有限公司 | Specific pedestrian's method for tracing and system based on inquiry self-adaptive component combinations of features |
CN107766791A (en) * | 2017-09-06 | 2018-03-06 | 北京大学 | A kind of pedestrian based on global characteristics and coarseness local feature recognition methods and device again |
CN109035293A (en) * | 2018-05-22 | 2018-12-18 | 安徽大学 | Method suitable for segmenting remarkable human body example in video image |
CN109145713A (en) * | 2018-07-02 | 2019-01-04 | 南京师范大学 | A kind of Small object semantic segmentation method of combining target detection |
WO2019007524A1 (en) * | 2017-07-06 | 2019-01-10 | Toyota Motor Europe | Tracking objects in sequences of digital images |
CN109214346A (en) * | 2018-09-18 | 2019-01-15 | 中山大学 | Picture human motion recognition method based on hierarchical information transmitting |
CN109255351A (en) * | 2018-09-05 | 2019-01-22 | 华南理工大学 | Bounding box homing method, system, equipment and medium based on Three dimensional convolution neural network |
CN109460702A (en) * | 2018-09-14 | 2019-03-12 | 华南理工大学 | Passenger's abnormal behaviour recognition methods based on human skeleton sequence |
CN109740537A (en) * | 2019-01-03 | 2019-05-10 | 广州广电银通金融电子科技有限公司 | The accurate mask method and system of pedestrian image attribute in crowd's video image |
Non-Patent Citations (4)
Title |
---|
A. ZANFIR et al.: "Deep Learning of Graph Matching", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition * |
M. M. KALAYEH et al.: "Human Semantic Parsing for Person Re-identification", DOI: 10.1109/CVPR.2018.00117 * |
ZHONG, JINQIN et al.: "Multi-Targets Tracking Based On Bipartite Graph Matching", Cybernetics and Information Technologies * |
ZHOU Liang et al.: "Research on a Multi-Object Tracking Method Based on a Tracking-Association Module", Journal of Southwest University (Natural Science Edition) * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111028154A (en) * | 2019-11-18 | 2020-04-17 | 哈尔滨工程大学 | Rough-terrain seabed side-scan sonar image matching and splicing method |
CN111028154B (en) * | 2019-11-18 | 2023-05-09 | 哈尔滨工程大学 | Side-scan sonar image matching and stitching method for rugged seafloor |
CN111179311A (en) * | 2019-12-23 | 2020-05-19 | 全球能源互联网研究院有限公司 | Multi-target tracking method and device and electronic equipment |
CN111428566A (en) * | 2020-02-26 | 2020-07-17 | 沈阳大学 | Deformation target tracking system and method |
CN111428566B (en) * | 2020-02-26 | 2023-09-01 | 沈阳大学 | Deformation target tracking system and method |
WO2022068522A1 (en) * | 2020-09-30 | 2022-04-07 | 华为技术有限公司 | Target tracking method and electronic device |
CN113643330A (en) * | 2021-10-19 | 2021-11-12 | 青岛根尖智能科技有限公司 | Target tracking method and system based on dynamic semantic features |
CN117876428A (en) * | 2024-03-12 | 2024-04-12 | 金锐同创(北京)科技股份有限公司 | Target tracking method, device, computer equipment and medium based on image processing |
CN117876428B (en) * | 2024-03-12 | 2024-05-17 | 金锐同创(北京)科技股份有限公司 | Target tracking method, device, computer equipment and medium based on image processing |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111259850B (en) | Pedestrian re-identification method integrating random batch mask and multi-scale representation learning | |
CN110298248A (en) | A kind of multi-object tracking method and system based on semantic segmentation | |
Hu et al. | Real-time video fire smoke detection by utilizing spatial-temporal ConvNet features | |
CN109165540B (en) | Pedestrian searching method and device based on prior candidate box selection strategy | |
CN108830188A (en) | Vehicle checking method based on deep learning | |
Xu et al. | Adversarial adaptation from synthesis to reality in fast detector for smoke detection | |
Castro et al. | Evaluation of CNN architectures for gait recognition based on optical flow maps | |
CN104063719A (en) | Method and device for pedestrian detection based on depth convolutional network | |
CN106803265A (en) | Multi-object tracking method based on optical flow method and Kalman filtering | |
CN113420640B (en) | Mangrove hyperspectral image classification method and device, electronic equipment and storage medium | |
CN116342894B (en) | GIS infrared feature recognition system and method based on improved YOLOv5 | |
US20170053172A1 (en) | Image processing apparatus, and image processing method | |
Ibrahem et al. | Real-time weakly supervised object detection using center-of-features localization | |
Abdullah et al. | Vehicle counting using deep learning models: a comparative study | |
Basavaiah et al. | Robust Feature Extraction and Classification Based Automated Human Action Recognition System for Multiple Datasets. | |
CN114283326A (en) | Underwater target re-identification method combining local perception and high-order feature reconstruction | |
Fang et al. | A real-time anti-distractor infrared UAV tracker with channel feature refinement module | |
Wang et al. | A dense-aware cross-splitnet for object detection and recognition | |
CN107886060A (en) | Pedestrian's automatic detection and tracking based on video | |
CN114187546B (en) | Combined action recognition method and system | |
CN113706815B (en) | Vehicle fire identification method combining YOLOv3 and optical flow method | |
Li et al. | Research on hybrid information recognition algorithm and quality of golf swing | |
Lassoued et al. | An efficient approach for video action classification based on 3D Zernike moments | |
Xu et al. | Multiscale edge-guided network for accurate cultivated land parcel boundary extraction from remote sensing images | |
Gao et al. | Research on edge detection and image segmentation of cabinet region based on edge computing joint image detection algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191001 |