CN114820699A - Multi-target tracking method, device, equipment and medium


Info

Publication number: CN114820699A (application CN202210325675.8A)
Authority: CN (China)
Prior art keywords: target, video frame, pose information, targets, tracking
Legal status: Granted
Application number: CN202210325675.8A
Other languages: Chinese (zh)
Other versions: CN114820699B (en)
Inventors: 刘洋 (Liu Yang), 赵雄 (Zhao Xiong)
Current Assignee: Xiaomi Automobile Technology Co Ltd
Original Assignee: Xiaomi Automobile Technology Co Ltd
Application filed by Xiaomi Automobile Technology Co Ltd
Priority: CN202210325675.8A
Publication of CN114820699A; application granted as CN114820699B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a multi-target tracking method that addresses the technical problems of the prior art: uncontrollable multi-model invocation overhead, low feature-extraction and recognition efficiency, frequent target loss, and low tracking accuracy. The method comprises the following steps: acquiring a first video frame and a second video frame, where the first video frame is the current video frame and the second video frame is the previous video frame adjacent to the first video frame; processing the first video frame and the second video frame through a target detection network, determining a plurality of first targets in the first video frame and a plurality of second targets in the second video frame, and obtaining the pose information corresponding to each first target and each second target; calculating, from the pose information corresponding to the second targets, the predicted pose information of each second target in the first video frame; and matching targets based on the predicted pose information and the pose information corresponding to the first targets, and tracking according to the target matching result.

Description

Multi-target tracking method, device, equipment and medium
Technical Field
The invention relates to the technical fields of video image processing, computer vision and the like, in particular to a multi-target tracking method, a multi-target tracking device, multi-target tracking equipment and a multi-target tracking medium.
Background
The Simple Online and Realtime Tracking algorithm (SORT) is a classical online real-time tracking algorithm that decomposes the multi-target tracking problem into a target detection part, a state prediction part, and a data association part. The detection part of SORT usually adopts the Faster R-CNN (Faster Region-based Convolutional Neural Network) target detection algorithm. The advantage of this approach is that the state prediction and data association parts run fast, so online tracking can be realized. However, it does not consider the appearance features of each target: matching relies only on Intersection over Union (IoU), which often causes id-label switching in practice, and a lost target can never be recovered.
In the prior art, in order to solve the above problems, there are two solutions as follows:
the method comprises the following two steps:
the two-stage tracking method based on deep learning introduces a convolutional neural network to extract apparent features of a target on the basis of a sort algorithm, and adds a cascade matching strategy. Although the two-stage method improves the target tracking accuracy, the running time of the method is limited because two networks are required to be calculated in series, the consumed time is the sum of the two network times plus the time of a tracking module, and the number of times of calling of the apparent feature extraction model is multiplied with the increase of the number of targets.
(II) Single-stage method:
the Joint Detection and Embedding method (JDE) integrates a target Detection algorithm and a target re-identification algorithm into one network, and uses the same backbone network to extract features, so that the network can simultaneously output position information and apparent feature vectors of a target. However, the same network is used for training the target detection and the pedestrian re-identification reid features at the same time, and the imbalance of tasks of the target detection and the pedestrian re-identification reid features causes the difficulty in achieving higher accuracy of the model.
Therefore, a multi-target tracking method is needed that tracks quickly and makes it easy to balance re-identification features against target detection, so as to solve the problems of low recognition efficiency and easy target loss.
Disclosure of Invention
The invention provides a multi-target tracking method, device, equipment, and medium to solve the technical problems of the prior art: uncontrollable multi-model invocation overhead, low feature-extraction and recognition efficiency, frequent target loss, and low tracking accuracy.
In a first aspect, an embodiment of the present invention provides a multi-target tracking method, which is applied to an automobile, and includes:
acquiring a first video frame and a second video frame, wherein the first video frame is a current video frame, and the second video frame is a previous video frame adjacent to the first video frame;
respectively processing the first video frame and the second video frame through a target detection network, determining a plurality of first targets in the first video frame and a plurality of second targets in the second video frame, and obtaining pose information corresponding to each first target and each second target;
respectively calculating the predicted pose information of each second target in the first video frame according to the pose information corresponding to the plurality of second targets;
and matching the targets based on the predicted pose information and pose information corresponding to the first targets, and tracking according to the target matching result.
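The four steps above can be sketched as a per-frame loop. The `Track` class, constant-position `predict`, and greedy nearest-pose `match` below are hypothetical stand-ins for the patent's trained detection network and unscented-Kalman predictor, and the matching here is distance-only; the patent additionally checks re-identification features.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the patent's components; a real system would
# use a trained 3D detector and an unscented Kalman filter for prediction.
@dataclass
class Track:
    track_id: int
    pose: tuple      # (x, y) position; the patent uses richer pose information
    feature: tuple   # re-identification feature vector

def predict(track):
    # Trivial constant-position prediction standing in for UKF prediction.
    return track.pose

def match(tracks, detections, max_dist=2.0):
    """Greedy nearest-pose matching between predicted tracks and detections."""
    matches, unmatched = [], list(range(len(detections)))
    for t in tracks:
        px, py = predict(t)
        best, best_d = None, max_dist
        for j in unmatched:
            dx, dy = detections[j][0]
            d = ((px - dx) ** 2 + (py - dy) ** 2) ** 0.5
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            matches.append((t.track_id, best))
            unmatched.remove(best)
    return matches, unmatched

# Second-frame tracks and first-frame detections as (pose, feature) pairs.
tracks = [Track(0, (0.0, 0.0), (1.0,)), Track(1, (5.0, 5.0), (0.5,))]
detections = [((5.1, 4.9), (0.5,)), ((0.2, 0.1), (1.0,)), ((9.0, 9.0), (0.2,))]
m, u = match(tracks, detections)   # unmatched detections start new tracks
```

Matched pairs update the corresponding tracks; unmatched detections would be initialized as new tracks, and unmatched tracks treated as lost, per the method above.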
In a possible implementation manner, in the method provided by the embodiment of the present invention, processing a first video frame and a second video frame through a target detection network, determining a plurality of first targets in the first video frame and a plurality of second targets in the second video frame, and obtaining pose information corresponding to each of the first targets and the second targets includes:
processing the second video frame through a target detection network to determine a plurality of second targets and obtain pose information and re-identification characteristics corresponding to each second target;
and processing the first video frame through a target detection network, determining a plurality of first targets, and obtaining pose information and re-identification characteristics corresponding to each first target.
In a possible implementation manner, in the method provided by the embodiment of the present invention, performing target matching based on the predicted pose information and pose information corresponding to a plurality of first targets includes:
determining a second target with a distance to any first target being smaller than a preset target threshold value as a third target based on the predicted pose information and pose information corresponding to the plurality of first targets;
and if the re-identification features of the third target are matched with the re-identification features of the first target, determining that the third target is matched with the first target.
In a possible implementation manner, in the method provided in an embodiment of the present invention, the method further includes:
and updating the pose information of a second target matched with the first target by using the pose information corresponding to the first target.
In a possible implementation manner, in the method provided in an embodiment of the present invention, the method further includes:
determining a second target that is not matched as a lost target;
and stopping tracking the lost target, and deleting the pose information corresponding to the lost target.
In a possible implementation manner, in the method provided in the embodiment of the present invention, the target detection network is trained by the following method:
obtaining pose information of the target from a pre-obtained video frame sample containing the target through marking;
and training the target detection network by using the pose information of the target.
In a possible implementation manner, in the method provided in an embodiment of the present invention, the training method further includes:
freezing the backbone of the target detection network, and performing scale fusion on the re-identification feature extraction branches of the target detection network.
In a second aspect, an embodiment of the present invention provides a multi-target tracking apparatus, including:
an acquisition unit, configured to acquire a first video frame and a second video frame, where the first video frame is the current video frame and the second video frame is the previous video frame adjacent to the first video frame;
the processing unit is used for respectively processing the first video frame and the second video frame through a target detection network, determining a plurality of first targets in the first video frame and a plurality of second targets in the second video frame, and obtaining pose information corresponding to each first target and each second target;
the calculation unit is used for respectively calculating the predicted pose information of each second target in the first video frame according to the pose information corresponding to the plurality of second targets;
and the matching unit is used for matching the targets based on the predicted pose information and the pose information corresponding to the first targets and tracking according to the target matching result.
In a possible implementation manner, in the apparatus provided in the embodiment of the present invention, the processing unit is specifically configured to:
processing the second video frame through a target detection network to determine a plurality of second targets and obtain pose information and re-identification characteristics corresponding to each second target;
and processing the first video frame through a target detection network, determining a plurality of first targets, and obtaining pose information and re-identification characteristics corresponding to each first target.
In a possible implementation manner, in the apparatus provided in the embodiment of the present invention, the matching unit is specifically configured to:
determining a second target with a distance to any first target being smaller than a preset target threshold value as a third target based on the predicted pose information and pose information corresponding to the plurality of first targets;
and if the re-identification features of the third target are matched with the re-identification features of the first target, determining that the third target is matched with the first target.
In a possible implementation manner, in the apparatus provided in the embodiment of the present invention, the matching unit is further configured to:
and updating the pose information of a second target matched with the first target by using the pose information corresponding to the first target.
In a possible implementation manner, in the apparatus provided in the embodiment of the present invention, the matching unit is specifically configured to:
determining a second target that is not matched as a lost target;
and stopping tracking the lost target, and deleting the pose information corresponding to the lost target.
In a possible implementation manner, in the apparatus provided in the embodiment of the present invention, the processing unit trains the target detection network by the following method:
obtaining pose information of the target from a pre-obtained video frame sample containing the target through marking;
and training the target detection network by using the pose information of the target.
In a possible implementation manner, in the apparatus provided in this embodiment of the present invention, the processing unit is further configured to:
freezing the backbone of the target detection network, and carrying out scale fusion on the re-identification feature extraction branches of the target detection network.
In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor, at least one memory, and computer program instructions stored in the memory, which when executed by the processor, implement a method as provided by the first aspect of an embodiment of the invention.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium, on which computer program instructions are stored, which, when executed by a processor, implement the method as provided by the first aspect of embodiments of the present invention.
In the embodiment of the invention, a first video frame and a second video frame are obtained. Both frames are then processed through a target detection network to determine a plurality of first targets in the first video frame and a plurality of second targets in the second video frame, and to obtain the pose information corresponding to each first target and each second target. The predicted pose information of each second target in the first video frame is calculated from the pose information corresponding to the second targets. Finally, target matching is performed based on the predicted pose information and the pose information corresponding to the first targets, and tracking proceeds according to the matching result. Compared with the prior art, the method solves the problems of uncontrollable multi-model invocation overhead, low feature-extraction and recognition efficiency, easy target loss, and low tracking accuracy. While the running speed of the whole algorithm is preserved, tracking remains stable and robust, and external targets can be effectively perceived during automatic driving, so that effective interactions are made and driving safety is ensured.
Drawings
Fig. 1 is a schematic flow chart of a multi-target tracking method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a training method for a target detection network in multi-target tracking according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of pre-scale fusion features provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a feature after scale fusion provided by an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a method for predicting a target location in multi-target tracking according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart of a matching tracking method in multi-target tracking according to an embodiment of the present invention;
fig. 7 is a schematic specific flowchart of a multi-target tracking method according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a multi-target tracking apparatus according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Some of the words that appear in the text are explained below:
1. the term "and/or" in the embodiments of the present invention describes an association relationship of associated objects, and indicates that three relationships may exist, for example, a and/or B may indicate: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
2. Pedestrian re-identification (reid) is a technique that uses computer vision techniques to determine whether a particular pedestrian is present in an image or video sequence.
3. Multi-Object Tracking (MOT) tracks multiple objects across continuous video frames; its essence is to associate the same object in consecutive frames and assign it a unique track id. The main task is: given an image sequence, find the moving objects in it, put the moving objects in different frames into one-to-one correspondence, and then produce the motion trajectories of the different objects. The objects may be anything: pedestrians, vehicles, various animals, and so on. In the three-layer structure of computer vision, target tracking belongs to the middle layer and is the basis of higher-level tasks (such as action recognition and behavior analysis). Target tracking includes single-target tracking and multi-target tracking. Besides the illumination, deformation, and occlusion problems met in single-target tracking, the multi-target tracking problem also requires association matching between targets. In addition, the multi-target tracking task frequently faces frequent occlusion of targets, unknown track start and end times, large target scale changes, similar appearances, interaction between targets, low frame rates, and the like.
4. The Simple Online and Realtime Tracking algorithm (SORT) is a classical online real-time tracking algorithm that decomposes the multi-target tracking problem into a target detection part, a state prediction part, and a data association part.
5. Faster R-CNN (Faster Region-based Convolutional Neural Network) is a two-stage target detector in the R-CNN family, the line of algorithms that first successfully applied deep learning to target detection. R-CNN realizes target detection on the basis of algorithms such as the Convolutional Neural Network (CNN), linear regression, and the Support Vector Machine (SVM).
6. The Intersection over Union (IoU) is an important metric of target detection performance; its value equals the ratio of the intersection to the union of the predicted box and the ground-truth box.
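The IoU definition above, for axis-aligned boxes given as (x1, y1, x2, y2), can be sketched as:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extents are clamped at zero when the boxes are disjoint.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

For two 2x2 boxes offset by (1, 1), the overlap is 1 and the union is 7, so the IoU is 1/7.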
7. The Joint Detection and Embedding method (JDE) is an anchor-based target detector characterized by integrating target detection and embedding learning in the same network, which makes it fast.
8. The Unscented Kalman Filter (UKF) combines the Unscented Transform (UT) with the standard Kalman filter; the unscented transform makes the nonlinear system equations fit the standard Kalman framework, which assumes linearity.
The Simple Online and Realtime Tracking algorithm (SORT) is a classical online real-time tracking algorithm that decomposes the multi-target tracking problem into a target detection part, a state prediction part, and a data association part. The detection part of SORT adopts the Faster R-CNN target detection algorithm, which outputs the positions and categories of targets in the input picture; the state of each detected target is then predicted and updated by Kalman filtering; finally, the Hungarian algorithm matches the predicted targets against the targets detected in the current frame, with IoU as the cost matrix. The advantage of SORT is that the state prediction and data association parts run fast, so online tracking can be realized. However, the method does not consider the appearance features of each target and matches only with IoU, so id-label switching often occurs in actual use, and a lost target can never be found again.
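The Kalman filtering step in the SORT pipeline above can be sketched for a 1D constant-velocity state [position, velocity]. This is a generic linear Kalman predict/update, not the patent's exact state parameterization; the matrices and noise values are illustrative assumptions.

```python
import numpy as np

def kf_predict(x, P, F, Q):
    # x' = F x ; P' = F P F^T + Q
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    # Standard Kalman gain and measurement update.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity motion model
H = np.array([[1.0, 0.0]])              # only position is observed
Q = 0.01 * np.eye(2)
R = np.array([[0.1]])

x, P = np.array([0.0, 1.0]), np.eye(2)  # start at 0 moving at 1 unit/frame
x, P = kf_predict(x, P, F, Q)           # predicted position is 1.0
x, P = kf_update(x, P, np.array([1.2]), H, R)  # detection at 1.2 pulls the estimate up
```

After the update the position estimate lies between the prediction (1.0) and the measurement (1.2), weighted by the Kalman gain.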
In the prior art, in order to solve the above problems, there are two solutions as follows:
the method comprises the following two steps:
the two-stage tracking method based on deep learning introduces a convolutional neural network to extract apparent features of a target on the basis of a sort algorithm, and adds a cascade matching strategy. The specific implementation is that a target re-recognition network is trained independently besides the target detection network. The cascade matching method effectively solves the problem that the target temporarily disappears due to occlusion and the like. And secondly, improving a target apparent feature extraction network on a model and a loss function so as to ensure that the distance in the same id is small enough. Although the two-stage method improves the target tracking accuracy, the running time of the method is limited because two networks are required to be calculated in series, the consumed time is the sum of the two network times plus the time of a tracking module, and the number of times of calling of the apparent feature extraction model is multiplied with the increase of the number of targets.
(II) Single-stage method:
the Joint Detection and Embedding method (JDE) integrates a target Detection algorithm and a target re-identification algorithm into one network, and uses the same backbone network to extract features, so that the network can simultaneously output position information and apparent feature vectors of a target. The method effectively improves the running speed of the algorithm. The FairMOT method is based on the idea of the JDE method, and is combined with a target detection algorithm CenterNet which does not need to set an anchor frame, so that the center position, the width and the height, the center point offset and the reid characteristics of a target are directly learned. However, the same network is used for training the target detection and the reid features at the same time, and the model is difficult to achieve high precision due to the imbalance of tasks of the target detection and the reid features.
Therefore, a multi-target tracking method is needed that tracks quickly and makes it easy to balance re-identification features against target detection, so as to solve the problems of low recognition efficiency and easy target loss.
In the technical scheme, the driving process involves video image processing, 3D target detection, reid feature extraction, unscented Kalman filter prediction, and multi-target tracking. The multi-target tracking method, device, equipment, and medium provided by the invention are mainly used for the automatic driving function; they are explained in more detail below with reference to the drawings and embodiments.
The embodiment of the invention provides a multi-target tracking method, as shown in fig. 1, comprising the following steps:
step 101, a first video frame and a second video frame are obtained.
During specific implementation, video frames are acquired in real time through a vehicle-mounted camera or other shooting equipment, the first video frame is a current video frame, and the second video frame is a previous video frame adjacent to the first video frame.
Step 102, the first video frame and the second video frame are processed through a target detection network respectively, a plurality of first targets in the first video frame and a plurality of second targets in the second video frame are determined, and pose information corresponding to each first target and each second target is obtained.
In a specific implementation, the second video frame is processed through the target detection network to determine a plurality of second targets and obtain the pose information and re-identification features corresponding to each second target; the first video frame is processed in the same way to determine a plurality of first targets and obtain the pose information and re-identification features corresponding to each first target.
In this step, pose information and re-identification features are obtained through the target detection network. The re-identification (reid) features can effectively solve the id-switching problem that occurs in multi-target tracking when a target is occluded, missed by the detector, or changes direction suddenly.
And 103, respectively calculating the predicted pose information of each second target in the first video frame according to the pose information corresponding to the plurality of second targets.
In a specific implementation, a target prediction method is applied to the pose information corresponding to the second targets to predict the pose information of each second target in the first video frame.
And 104, performing target matching based on the predicted pose information and the pose information corresponding to the plurality of first targets, and tracking according to a target matching result.
In a specific implementation, based on the predicted pose information and the pose information corresponding to the plurality of first targets, a second target whose distance to some first target is smaller than a preset target threshold is determined as a third target. If the re-identification features of the third target match the re-identification features of that first target, the third target is determined to match the first target. Tracking then follows the matching result: the pose information of a second target matched with a first target is updated with the pose information corresponding to that first target; any unmatched second target is determined to be a lost target, tracking of the lost target stops, and the pose information corresponding to the lost target is deleted.
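The two-stage check above, a distance gate followed by a re-identification feature comparison, can be sketched as follows. Cosine similarity and the threshold values are illustrative assumptions; the patent does not fix a particular feature metric.

```python
import numpy as np

def cosine_sim(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_target(first_pose, first_feat, candidates,
                 dist_thresh=2.0, sim_thresh=0.7):
    """Return the index of the matched candidate second target, or None.

    candidates: list of (predicted_pose, reid_feature) for second targets.
    A candidate must first pass the distance gate (predicted pose close to
    the first target), then its reid features must match; both thresholds
    here are illustrative.
    """
    for idx, (pose, feat) in enumerate(candidates):
        if np.linalg.norm(np.asarray(first_pose) - np.asarray(pose)) < dist_thresh:
            if cosine_sim(np.asarray(first_feat), np.asarray(feat)) > sim_thresh:
                return idx
    return None
```

A nearby candidate with a dissimilar feature vector is rejected just like a distant one, which is what prevents id switches between close targets.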
As shown in fig. 2, the training process of the target detection network in multi-target tracking provided by the embodiment of the present invention may include the following steps:
step 201, obtaining pose information of the target from a pre-obtained video frame sample containing the target through marking.
And 202, training the target detection network by using the pose information of the target.
And step 203, freezing the backbone of the target detection network, and performing scale fusion on the re-identification feature extraction branches of the target detection network.
In a specific implementation, when the reid branch is trained, the detection model trained in step 202 is used as the pre-trained model input, the backbone network is frozen, and only the parameters of the scale fusion and the reid-related parts are updated, so that discriminative reid features are extracted without affecting detection precision. Fig. 3 shows the features before scale fusion and fig. 4 the features after scale fusion; comparing them shows that feature scale fusion enhances the semantic features.
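The freezing strategy above can be sketched framework-agnostically: backbone parameters are excluded from the gradient step while the scale-fusion/reid parameters are trained. The parameter names and the plain-SGD update are illustrative assumptions; in a framework such as PyTorch this corresponds to setting `requires_grad=False` on the backbone and passing only the reid-branch parameters to the optimizer.

```python
import numpy as np

# Toy parameter store; names are hypothetical stand-ins for real layers.
params = {
    "backbone.conv1": np.ones(3),
    "backbone.conv2": np.ones(3),
    "reid.fusion":    np.ones(3),
    "reid.head":      np.ones(3),
}
FROZEN_PREFIX = "backbone."

def sgd_step(params, grads, lr=0.1):
    for name, g in grads.items():
        if name.startswith(FROZEN_PREFIX):
            continue  # frozen: pretrained detection weights stay intact
        params[name] = params[name] - lr * g

grads = {name: np.full(3, 2.0) for name in params}
sgd_step(params, grads)  # only the reid.* parameters move
```

After the step, the backbone weights are unchanged while the reid-branch weights have been updated, mirroring how detection precision is preserved while the reid branch learns.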
In the target prediction stage of tracking, the common method is Kalman filtering, but its linear derivation and computation do not apply well to a nonlinear system. Pedestrian motion is nonlinear in the automatic driving scene, and the nonlinearity is more obvious at low frame rates; to increase the accuracy of target position prediction, this scheme therefore adopts the Unscented Kalman Filter (UKF). As shown in fig. 5, the target position prediction method in multi-target tracking provided by the embodiment of the invention may comprise the following steps:
step 501, data is initialized.
In particular, the state vector $\hat{x}_0$ and the state covariance matrix $P_0$ are initialized as:

$$\hat{x}_0 = E[x_0], \qquad P_0 = E\big[(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T\big]$$
step 502, target prediction is performed.
The mean weights $W_i^m$ and covariance weights $W_i^c$ of the sigma points are obtained through the unscented transform:

$$W_0^m = \frac{\lambda}{n+\lambda}, \qquad W_0^c = \frac{\lambda}{n+\lambda} + 1 - \alpha^2 + \beta, \qquad W_i^m = W_i^c = \frac{1}{2(n+\lambda)}, \quad i = 1, \dots, 2n$$

where $n$ is the state dimension and $\lambda = \alpha^2(n+\kappa) - n$ is the scaling parameter.
Then the state is updated (time update), with the formulas:

$$\chi_{i,k|k-1} = f(\chi_{i,k-1}), \qquad \hat{x}_k^- = \sum_{i=0}^{2n} W_i^m\, \chi_{i,k|k-1}$$
$$P_k^- = \sum_{i=0}^{2n} W_i^c\, (\chi_{i,k|k-1} - \hat{x}_k^-)(\chi_{i,k|k-1} - \hat{x}_k^-)^T + Q$$
and the observation equation is updated, with the formulas:

$$Z_{i,k|k-1} = h(\chi_{i,k|k-1}), \qquad \hat{z}_k = \sum_{i=0}^{2n} W_i^m\, Z_{i,k|k-1}$$
$$P_{zz} = \sum_{i=0}^{2n} W_i^c\, (Z_{i,k|k-1} - \hat{z}_k)(Z_{i,k|k-1} - \hat{z}_k)^T + R, \qquad P_{xz} = \sum_{i=0}^{2n} W_i^c\, (\chi_{i,k|k-1} - \hat{x}_k^-)(Z_{i,k|k-1} - \hat{z}_k)^T$$
$$K_k = P_{xz} P_{zz}^{-1}, \qquad \hat{x}_k = \hat{x}_k^- + K_k (z_k - \hat{z}_k), \qquad P_k = P_k^- - K_k P_{zz} K_k^T$$

where $\hat{x}_k$ and $P_k$ are the latest filtered state estimate and filtering covariance, respectively. Performing unscented Kalman filtering with these formulas realizes pose prediction of the target.
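One predict/update cycle of the UKF described above can be sketched compactly in numpy. This is an illustrative sketch, not the patent's implementation: the function name, default scaling parameters, and the constant-velocity example models `f` and `h` are all assumptions.

```python
import numpy as np

def ukf_step(x, P, z, f, h, Q, R, alpha=1.0, beta=2.0, kappa=0.0):
    """One predict/update cycle of the unscented Kalman filter."""
    n = len(x)
    lam = alpha**2 * (n + kappa) - n
    Wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)
    Wc[0] = Wm[0] + 1.0 - alpha**2 + beta

    # Sigma points: the state plus/minus the columns of the Cholesky
    # factor of (n + lam) * P.
    L = np.linalg.cholesky((n + lam) * P)
    chi = np.column_stack([x] + [x + L[:, i] for i in range(n)]
                              + [x - L[:, i] for i in range(n)])

    # Time update: propagate sigma points through the motion model f.
    chi_f = np.column_stack([f(chi[:, i]) for i in range(2 * n + 1)])
    x_pred = chi_f @ Wm
    P_pred = Q + sum(Wc[i] * np.outer(chi_f[:, i] - x_pred, chi_f[:, i] - x_pred)
                     for i in range(2 * n + 1))

    # Observation update: propagate through h, then the Kalman correction.
    Zs = np.column_stack([h(chi_f[:, i]) for i in range(2 * n + 1)])
    z_pred = Zs @ Wm
    Pzz = R + sum(Wc[i] * np.outer(Zs[:, i] - z_pred, Zs[:, i] - z_pred)
                  for i in range(2 * n + 1))
    Pxz = sum(Wc[i] * np.outer(chi_f[:, i] - x_pred, Zs[:, i] - z_pred)
              for i in range(2 * n + 1))
    K = Pxz @ np.linalg.inv(Pzz)
    x_new = x_pred + K @ (z - z_pred)
    P_new = P_pred - K @ Pzz @ K.T
    return x_new, P_new

# Usage on a toy constant-velocity model (dt = 1). For a linear model the
# unscented transform is exact, so the prediction lands on [1, 1], and a
# measurement that agrees with it leaves the state unchanged.
f = lambda s: np.array([s[0] + s[1], s[1]])   # state: [position, velocity]
h = lambda s: np.array([s[0]])                # observe position only
x, P = ukf_step(np.array([0.0, 1.0]), np.eye(2), np.array([1.0]),
                f, h, Q=np.zeros((2, 2)), R=np.array([[0.1]]))
```

For a real pedestrian-tracking state, `f` would encode the chosen motion model over the actual frame interval and `h` would map the pose state to the detector's measurement space.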
After the pose prediction, target matching is performed based on the predicted pose information and pose information corresponding to a plurality of first targets, and tracking is performed according to a target matching result, as shown in fig. 6, a matching tracking process in multi-target tracking provided by an embodiment of the present invention may include the following steps:
step 601, initializing a tracker for each target.
In a specific implementation, the first frame is processed first: the target positions and apparent feature vectors are obtained from the detection network, and a tracker is then initialized for each target.
Step 602, predicting the position of the target in the current frame and matching.
In a specific implementation, UKF prediction is performed on each target in the tracker to predict its position in the current frame; the target position information and reid features of the current frame are obtained from the detection network; the reid cost matrix and the 3D distance cost matrix between the current targets and the targets in the tracker are computed, and for any pair whose distance exceeds the threshold T_d the appearance cost is set to infinity (inf); the current targets are then matched to the targets in the tracker with the Hungarian algorithm, and the position, state, and features of every successfully matched target in the tracker are updated.
Step 603, performing secondary matching.
In a specific implementation, the 3D distance cost matrix of the detections and tracked targets left unmatched in step 602 is computed, secondary matching is performed with the Hungarian algorithm, and for each matched pair the position, state, and feature matrix of the target in the tracker are updated.
Step 604, perform a third matching.
In a specific implementation, the 3D distance cost matrix of the detections and unconfirmed tracked targets left unmatched in step 603 is computed, matching is performed with the Hungarian algorithm, and the position, state, and feature matrix of each matched target in the tracker are updated.
Step 605, mark lost targets.
In a specific implementation, any target in the tracker that remains unmatched has its tracking state marked as lost; once the lost state persists for more than T_t frames, the target is considered to have disappeared. Any detected target that remains unmatched is added to the tracker and set to the to-be-confirmed state.
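The gating-and-assignment core of steps 602–604 can be sketched as follows. This is an illustrative stand-in, not the patent's code: exhaustive search over permutations replaces the Hungarian algorithm (acceptable for the handful of targets per frame), and the gate value and coordinates are invented for the example.

```python
from itertools import permutations
import math

def gated_cost(tracks, detections, t_d=2.0):
    """3-D distance cost matrix; pairs farther apart than the gate T_d
    receive an infinite cost, as in step 602."""
    return [[math.dist(t, d) if math.dist(t, d) <= t_d else math.inf
             for d in detections] for t in tracks]

def assign(cost):
    """Optimal assignment by exhaustive search -- a stand-in for the
    Hungarian algorithm; assumes len(tracks) <= len(detections).
    Prefers more matches, then lower total cost."""
    n_t, n_d = len(cost), len(cost[0])
    best_key, best = (1, math.inf), []
    for perm in permutations(range(n_d), n_t):
        pairs = [(t, d) for t, d in enumerate(perm) if cost[t][d] < math.inf]
        key = (-len(pairs), sum(cost[t][d] for t, d in pairs))
        if key < best_key:
            best_key, best = key, pairs
    return best

# Two predicted track positions and two detections; gating rules out the
# distant pairings, so track 0 matches detection 1 and track 1 detection 0.
tracks = [(0.0, 0.0, 0.0), (5.0, 5.0, 0.0)]
detections = [(5.2, 5.1, 0.0), (0.3, 0.1, 0.0)]
matches = assign(gated_cost(tracks, detections))
```

In practice the Hungarian step would use an O(n³) solver such as `scipy.optimize.linear_sum_assignment`, and step 602 would additionally fuse the reid cost matrix with the gated 3D distance cost.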
As shown in fig. 7, the multi-target tracking method provided in the embodiment of the present invention is explained in detail.
Step 701, a first video frame and a second video frame are obtained.
During specific implementation, video frames are acquired in real time through a vehicle-mounted camera or other shooting equipment, the first video frame is a current video frame, and the second video frame is a previous video frame adjacent to the first video frame.
Step 702, processing the second video frame through the target detection network to determine a plurality of second targets, and obtaining pose information and re-identification characteristics corresponding to each second target.
Step 703, processing the first video frame through the target detection network, determining a plurality of first targets, and obtaining pose information and re-identification features corresponding to each first target.
In a specific implementation, in steps 702 and 703 the pose information and re-identification features are obtained through the target detection network; the re-identification (reid) features effectively solve the problem of identity (id) switches in multi-target tracking when a target is occluded, missed by the detector, changes direction suddenly, and the like.
In this step, an anchor-free object detection network is adopted, converting the detection problem into key-point regression: the target is represented by the center point of its box, the center offset (offset) and box size (size) are regressed to recover the actual box, and the classification information is represented by a heatmap. The network applies to 2D target detection and can be extended to 3D detection tasks, outputting the depth, length, width, height, and angle of the target, i.e., the pose information. The same network is also used to obtain the re-identification (reid) features, which are trained in a targeted manner with the target detection network training method shown in fig. 2; the specific process is not repeated here.
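The center-point decoding described above can be sketched in a few lines. This is a generic illustration of anchor-free (center-point) box decoding, not the patent's network: the function name, the feature-map stride of 4, and the example values are all assumptions.

```python
def decode_center(peak, offset, size, stride=4):
    """Recover a 2-D box from a heatmap peak plus the regressed center
    offset and box size. `peak` is the (x, y) cell index of a heatmap
    maximum, `offset` the sub-cell shift, `size` the box (w, h) in
    pixels, and `stride` the downsampling factor of the output map."""
    cx = (peak[0] + offset[0]) * stride
    cy = (peak[1] + offset[1]) * stride
    w, h = size
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

# A peak at feature cell (10, 6) with sub-cell offset (0.5, 0.25) and a
# regressed 32x48 box decodes to pixel coordinates.
box = decode_center((10, 6), (0.5, 0.25), (32, 48))
```

The 3D extension regresses additional heads (depth, dimensions, orientation) from the same center location instead of just the 2D size.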
And 704, respectively calculating the predicted pose information of each second target in the first video frame according to the pose information corresponding to the plurality of second targets.
In a specific implementation, the predicted pose information of each second target in the first video frame is computed from the pose information corresponding to that second target using a target prediction method. In this step, the target prediction method shown in fig. 5 is used, and it is not described again here.
Step 705, performing target matching based on the predicted pose information and pose information corresponding to the plurality of first targets.
In specific implementation, based on the predicted pose information and pose information corresponding to the multiple first targets, a second target with a distance to any one of the first targets being smaller than a preset target threshold is determined as a third target, and if the re-identification features of the third target are matched with those of the first targets, the third target is determined to be matched with the first targets.
And step 706, tracking according to the target matching result.
In a specific implementation, tracking is performed according to the target matching result: the pose information of each second target matched with a first target is updated with the pose information of that first target, and every unmatched second target is determined to be a lost target.
And step 707, stopping tracking the lost target, and deleting the pose information corresponding to the lost target.
As shown in fig. 8, based on the same inventive concept of the multi-target tracking method, the present invention further provides a multi-target tracking apparatus, including:
an obtaining unit 801, configured to obtain a first video frame and a second video frame, where the first video frame is a current video frame, and the second video frame is a previous video frame adjacent to the first video frame;
a processing unit 802, configured to process the first video frame and the second video frame through a target detection network, determine multiple first targets in the first video frame and multiple second targets in the second video frame, and obtain pose information corresponding to each first target and each second target;
a calculating unit 803, configured to calculate predicted pose information of each second target in the first video frame according to the pose information corresponding to the plurality of second targets;
and the matching unit 804 is used for matching the targets based on the predicted pose information and the pose information corresponding to the plurality of first targets and tracking according to the target matching result.
In a possible implementation manner, in the apparatus provided in the embodiment of the present invention, the processing unit 802 is specifically configured to:
processing the second video frame through a target detection network to determine a plurality of second targets and obtain pose information and re-identification characteristics corresponding to each second target;
and processing the first video frame through a target detection network, determining a plurality of first targets, and obtaining pose information and re-identification characteristics corresponding to each first target.
In a possible implementation manner, in the apparatus provided in the embodiment of the present invention, the matching unit 804 is specifically configured to:
determining a second target with a distance to any first target being smaller than a preset target threshold value as a third target based on the predicted pose information and pose information corresponding to the plurality of first targets;
and if the re-identification features of the third target are matched with the re-identification features of the first target, determining that the third target is matched with the first target.
In a possible implementation manner, in the apparatus provided in the embodiment of the present invention, the matching unit 804 is further configured to:
and updating the pose information of a second target matched with the first target by using the pose information corresponding to the first target.
In a possible implementation manner, in the apparatus provided in the embodiment of the present invention, the matching unit 804 is specifically configured to:
determining a second target that is not matched as a lost target;
and stopping tracking the lost target, and deleting the pose information corresponding to the lost target.
In a possible implementation manner, in the apparatus provided in this embodiment of the present invention, the processing unit 802 trains the target detection network by the following method:
obtaining pose information of the target from a pre-obtained video frame sample containing the target through marking;
and training the target detection network by using the pose information of the target.
In a possible implementation manner, in the apparatus provided in this embodiment of the present invention, the processing unit 802 is further configured to:
freezing the backbone of the target detection network, and carrying out scale fusion on the re-identification feature extraction branches of the target detection network.
In addition, the multi-target tracking method and apparatus of the embodiments of the invention described in conjunction with fig. 2-8 may be implemented by an electronic device. Fig. 9 is a schematic diagram illustrating a hardware structure of an electronic device according to an embodiment of the present invention.
The electronic device may comprise a processor 901 and a memory 902 storing computer program instructions.
Specifically, the processor 901 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present invention.
Memory 902 may include mass storage for data or instructions. By way of example, and not limitation, memory 902 may include a Hard Disk Drive (HDD), floppy Disk Drive, flash memory, optical Disk, magneto-optical Disk, tape, or Universal Serial Bus (USB) Drive or a combination of two or more of these. Memory 902 may include removable or non-removable (or fixed) media, where appropriate. The memory 902 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 902 is a non-volatile solid-state memory. In a particular embodiment, the memory 902 includes Read Only Memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory or a combination of two or more of these.
The processor 901 realizes any one of the multi-target tracking methods in the above embodiments by reading and executing computer program instructions stored in the memory 902.
In one example, the electronic device can also include a communication interface 903 and a bus 910. As shown in fig. 9, the processor 901, the memory 902, and the communication interface 903 are connected via a bus 910 to complete communication with each other.
The communication interface 903 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiments of the present invention.
The bus 910 includes hardware, software, or both to couple the components of the electronic device to each other. By way of example, and not limitation, a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a Hypertransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an infiniband interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a video electronics standards association local (VLB) bus, or other suitable bus or a combination of two or more of these. Bus 910 can include one or more buses, where appropriate. Although specific buses have been described and shown in the embodiments of the invention, any suitable buses or interconnects are contemplated by the invention.
The electronic device may execute the multi-target tracking method in the embodiment of the present invention based on the received video frame, thereby implementing the multi-target tracking method and apparatus described with reference to fig. 2 to 8.
In addition, in combination with the electronic device in the above embodiments, the embodiments of the present invention may be implemented by providing a computer-readable storage medium. The computer readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement any of the multi-target tracking methods of the embodiments described above.
In the embodiment of the invention, a first video frame and a second video frame are obtained; the two frames are processed through a target detection network to determine a plurality of first targets in the first video frame and a plurality of second targets in the second video frame and to obtain the pose information corresponding to each first target and each second target; the predicted pose information of each second target in the first video frame is calculated from the pose information corresponding to the second targets; finally, target matching is performed based on the predicted pose information and the pose information corresponding to the first targets, and tracking is carried out according to the target matching result. Compared with the prior art, this solves the problems of uncontrollable multi-model invocation efficiency, inefficient feature extraction and recognition, easily lost targets, and low tracking accuracy: while the running speed of the overall algorithm is maintained, tracking remains stable and robust, and external targets can be effectively perceived during automated driving, enabling effective interaction and ensuring driving safety.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all changes and modifications that fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (16)

1. A multi-target tracking method is characterized by comprising the following steps:
acquiring a first video frame and a second video frame, wherein the first video frame is a current video frame, and the second video frame is a previous video frame adjacent to the first video frame;
respectively processing the first video frame and the second video frame through a target detection network, determining a plurality of first targets in the first video frame and a plurality of second targets in the second video frame, and obtaining pose information corresponding to each first target and each second target;
respectively calculating the predicted pose information of each second target in the first video frame according to the pose information corresponding to the plurality of second targets;
and matching the targets based on the predicted pose information and the pose information corresponding to the first targets, and tracking according to the target matching result.
2. The multi-target tracking method according to claim 1, wherein the processing the first video frame and the second video frame through a target detection network to determine a plurality of first targets in the first video frame and a plurality of second targets in the second video frame and obtain pose information corresponding to each of the first targets and the second targets comprises:
processing the second video frame through the target detection network to determine a plurality of second targets and obtain pose information and re-identification characteristics corresponding to each second target;
and processing the first video frame through a target detection network, determining a plurality of first targets, and obtaining pose information and re-identification characteristics corresponding to each first target.
3. The multi-target tracking method according to claim 2, wherein the performing target matching based on the predicted pose information and pose information corresponding to the plurality of first targets comprises:
determining a second target with a distance to any first target smaller than a preset target threshold value as a third target based on the predicted pose information and pose information corresponding to the plurality of first targets;
and if the re-identification feature of the third target is matched with the re-identification feature of the first target, determining that the third target is matched with the first target.
4. The multi-target tracking method of claim 3, further comprising:
and updating the pose information of a second target matched with the first target by using the pose information corresponding to the first target.
5. The multi-target tracking method of claim 4, further comprising:
determining a second target that is not matched as a lost target;
stopping tracking the lost target, and deleting the pose information corresponding to the lost target.
6. The multi-target tracking method according to any one of claims 1-5, wherein the target detection network is trained by:
obtaining pose information of the target from a pre-obtained video frame sample containing the target through marking;
and training the target detection network by using the pose information of the target.
7. The multi-target tracking method of claim 6, wherein the training method further comprises:
and freezing the backbone of the target detection network, and carrying out scale fusion on the re-identification feature extraction branches of the target detection network.
8. A multi-target tracking apparatus, comprising:
the device comprises an acquisition unit, a processing unit and a display unit, wherein the acquisition unit is used for acquiring a first video frame and a second video frame, the first video frame is a current video frame, and the second video frame is a previous video frame adjacent to the first video frame;
the processing unit is used for respectively processing the first video frame and the second video frame through a target detection network, determining a plurality of first targets in the first video frame and a plurality of second targets in the second video frame, and obtaining pose information corresponding to each first target and each second target;
the calculation unit is used for calculating the predicted pose information of each second target in the first video frame according to the pose information corresponding to the plurality of second targets;
and the matching unit is used for matching targets based on the predicted pose information and the pose information corresponding to the plurality of first targets and tracking according to the target matching result.
9. The multi-target tracking device of claim 8, wherein the processing unit is specifically configured to:
processing the second video frame through the target detection network to determine a plurality of second targets and obtain pose information and re-identification characteristics corresponding to each second target;
and processing the first video frame through a target detection network, determining a plurality of first targets, and obtaining pose information and re-identification characteristics corresponding to each first target.
10. The multi-target tracking device of claim 9, wherein the matching unit is specifically configured to:
determining a second target with a distance to any first target smaller than a preset target threshold value as a third target based on the predicted pose information and pose information corresponding to the plurality of first targets;
and if the re-identification feature of the third target is matched with the re-identification feature of the first target, determining that the third target is matched with the first target.
11. The multi-target tracking device of claim 10, wherein the matching unit is further configured to:
and updating the pose information of a second target matched with the first target by using the pose information corresponding to the first target.
12. The multi-target tracking device of claim 11, wherein the matching unit is specifically configured to:
determining a second target that is not matched as a lost target;
stopping tracking the lost target, and deleting the pose information corresponding to the lost target.
13. The multi-target tracking device of any one of claims 8-12, wherein the processing unit trains the target detection network by:
obtaining pose information of the target from a pre-obtained video frame sample containing the target through marking;
and training the target detection network by using the pose information of the target.
14. The multi-target tracking device of claim 13, wherein the processing unit is further configured to:
and freezing the backbone of the target detection network, and carrying out scale fusion on the re-identification feature extraction branches of the target detection network.
15. An electronic device, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory that, when executed by the processor, implement the method of any of claims 1-7.
16. A computer-readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1-7.
CN202210325675.8A 2022-03-29 2022-03-29 Multi-target tracking method, device, equipment and medium Active CN114820699B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210325675.8A CN114820699B (en) 2022-03-29 2022-03-29 Multi-target tracking method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210325675.8A CN114820699B (en) 2022-03-29 2022-03-29 Multi-target tracking method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN114820699A true CN114820699A (en) 2022-07-29
CN114820699B CN114820699B (en) 2023-07-18

Family

ID=82532726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210325675.8A Active CN114820699B (en) 2022-03-29 2022-03-29 Multi-target tracking method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114820699B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908498A (en) * 2022-12-27 2023-04-04 清华大学 Multi-target tracking method and device based on category optimal matching

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170286774A1 (en) * 2016-04-04 2017-10-05 Xerox Corporation Deep data association for online multi-class multi-object tracking
CN110197502A (en) * 2019-06-06 2019-09-03 山东工商学院 A kind of multi-object tracking method that identity-based identifies again and system
CN110276783A (en) * 2019-04-23 2019-09-24 上海高重信息科技有限公司 A kind of multi-object tracking method, device and computer system
CN110399808A (en) * 2019-07-05 2019-11-01 桂林安维科技有限公司 A kind of Human bodys' response method and system based on multiple target tracking
CN111914664A (en) * 2020-07-06 2020-11-10 同济大学 Vehicle multi-target detection and track tracking method based on re-identification
WO2021017291A1 (en) * 2019-07-31 2021-02-04 平安科技(深圳)有限公司 Darkflow-deepsort-based multi-target tracking detection method, device, and storage medium
CN112419368A (en) * 2020-12-03 2021-02-26 腾讯科技(深圳)有限公司 Method, device and equipment for tracking track of moving target and storage medium
CN112767443A (en) * 2021-01-18 2021-05-07 深圳市华尊科技股份有限公司 Target tracking method, electronic equipment and related product
CN113139620A (en) * 2021-05-14 2021-07-20 重庆理工大学 End-to-end multi-target detection and tracking joint method based on target association learning
CN113313736A (en) * 2021-06-10 2021-08-27 厦门大学 Online multi-target tracking method for unified target motion perception and re-identification network
CN113343985A (en) * 2021-06-28 2021-09-03 展讯通信(上海)有限公司 License plate recognition method and device
CN113724293A (en) * 2021-08-23 2021-11-30 上海电科智能***股份有限公司 Vision-based intelligent internet public transport scene target tracking method and system
CN113807187A (en) * 2021-08-20 2021-12-17 北京工业大学 Unmanned aerial vehicle video multi-target tracking method based on attention feature fusion
CN114067270A (en) * 2021-11-18 2022-02-18 华南理工大学 Vehicle tracking method and device, computer equipment and storage medium


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Wen Yuanqiao et al.: "Architecture and Motion Control of Unmanned Surface Vehicles", Wuhan: Wuhan University of Technology Press, 31 July 2019, pages 137-142 *
Zhai Guang: "Relative Navigation and Filtering Technology for Space Targets", Beijing: Beijing Institute of Technology Press, 29 February 2020, pages 88-92 *
Chen Shixuan: "Research on Vehicle Multi-Object Tracking and Detection Based on an Anchor-Free Model and an Attention Mechanism", China Master's Theses Full-text Database, Engineering Science and Technology II (monthly), no. 12, 15 December 2021 (2021-12-15), pages 1-81 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908498A (en) * 2022-12-27 2023-04-04 清华大学 Multi-target tracking method and device based on category optimal matching
CN115908498B (en) * 2022-12-27 2024-01-02 清华大学 Multi-target tracking method and device based on category optimal matching

Also Published As

Publication number Publication date
CN114820699B (en) 2023-07-18

Similar Documents

Publication Publication Date Title
Hassaballah et al. Vehicle detection and tracking in adverse weather using a deep learning framework
CN112669349B (en) Passenger flow statistics method, electronic equipment and storage medium
Chan et al. Vehicle detection and tracking under various lighting conditions using a particle filter
KR101912914B1 (en) Method and system for recognition of speed limit sign using front camera
CN109754009B (en) Article identification method, article identification device, vending system and storage medium
CN112966697A (en) Target detection method, device and equipment based on scene semantics and storage medium
CN112529934B (en) Multi-target tracking method, device, electronic equipment and storage medium
Jadhav et al. Aerial multi-object tracking by detection using deep association networks
CN112434566A (en) Passenger flow statistical method and device, electronic equipment and storage medium
Liu et al. Multi-type road marking recognition using adaboost detection and extreme learning machine classification
Wang et al. Deep learning-based raindrop quantity detection for real-time vehicle-safety application
CN111062971A (en) Cross-camera mud head vehicle tracking method based on deep learning multi-mode
CN114972410A (en) Multi-level matching video racing car tracking method and system
CN114820699B (en) Multi-target tracking method, device, equipment and medium
CN115375736A (en) Image-based pedestrian trajectory tracking method and device
Li et al. Time-spatial multiscale net for vehicle counting and traffic volume estimation
CN111382606A (en) Tumble detection method, tumble detection device and electronic equipment
Tsai et al. Joint detection, re-identification, and LSTM in multi-object tracking
CN107256382A (en) Virtual bumper control method and system based on image recognition
Dai et al. A driving assistance system with vision based vehicle detection techniques
Hsu et al. Developing an on-road obstacle detection system using monovision
CN114359572A (en) Training method and device of multi-task detection model and terminal equipment
Nguyen et al. An algorithm using YOLOv4 and DeepSORT for tracking vehicle speed on highway
CN113221604A (en) Target identification method and device, storage medium and electronic equipment
CN113496188B (en) Apparatus and method for processing video content analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant