CN112927267A - Target tracking method under multi-camera scene - Google Patents

Target tracking method under multi-camera scene

Info

Publication number
CN112927267A
CN112927267A (application CN202110275199.9A)
Authority
CN
China
Prior art keywords
target
tracking
data set
target object
pictures
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110275199.9A
Other languages
Chinese (zh)
Inventor
卢新彪
杭帆
唐紫婷
刘雅童
李芳
李亦勤
张弛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN202110275199.9A priority Critical patent/CN112927267A/en
Publication of CN112927267A publication Critical patent/CN112927267A/en
Withdrawn legal-status Critical Current

Links

Images

Classifications

    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06T 7/292 Multi-camera tracking
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06T 2207/10016 Video; Image sequence
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target tracking method in a multi-camera scene. Pictures from different cameras are stitched with an image stitching algorithm, and multi-target tracking is then performed in the stitched video using YOLO-V4 combined with an improved Deepsort algorithm. For the data sets, a self-made intelligent-car data set and a self-made vehicle re-identification data set containing the intelligent cars are adopted. By constructing rich data sets, improving the models, and stitching and fusing the pictures, the invention realizes multi-target tracking in a multi-camera scene and improves the accuracy of vehicle re-identification.

Description

Target tracking method under multi-camera scene
Technical Field
The invention relates to a target tracking method, in particular to a target tracking method in a multi-camera scene.
Background
Target detection and tracking is currently a research hotspot in computer vision, with wide application in video surveillance, automatic driving, human-computer interaction, smart homes and other fields. Moving-target tracking belongs to video analysis, which combines the mid-level and high-level processing stages of computer vision: image sequences are processed to study the behaviour of moving targets, or to provide semantic and non-semantic information support, including motion detection, target classification, target tracking and event detection, for the decision and alarm functions of a system. Research on and application of video target tracking is an important branch of computer vision and is increasingly applied in science and technology, national defense, aerospace, medicine and health, and the national economy, so target tracking technology has great practical value and broad development prospects.
With the development of neural networks, networks for target detection and tracking have progressed from machine learning to deep learning. Current target detection algorithms fall broadly into two categories. Two-stage detection algorithms split detection into two stages, first generating candidate regions (region proposals) and then classifying them; typical representatives are the R-CNN, Fast R-CNN and Faster R-CNN family. Their recognition error rate and missed-detection rate are low, but they are slow and cannot meet real-time detection scenarios. The other category, one-stage detection algorithms, needs no candidate-region stage: class probabilities and position coordinates are produced directly, so the final detection result is obtained in a single pass and detection is faster; typical algorithms include YOLO, SSD, YOLOv3, YOLO-V4 and CenterNet. Multiple Object Tracking (MOT) takes an image sequence, finds the moving targets in it, and associates the moving targets across frames, i.e. assigns each target a definite, accurate id; the targets may be arbitrary, such as pedestrians, vehicles or various animals. The mainstream tracking strategy currently studied in academia and industry is TBD (Tracking-by-Detection): target detection is performed in every frame and tracking is then carried out on the detection results; classical algorithms include SORT and Deepsort. Image stitching is a technique that spatially aligns a group of image sequences with mutually overlapping parts and, after resampling and synthesis, forms a complete new wide-view image containing the information of each sequence; it compensates for the limited content captured by a single camera and extends the viewing range of the equipment. Representative algorithms are SIFT, SURF and ORB.
Target tracking has been applied in many scenes and many good research results have been obtained, but for multi-target cross-camera tracking most academic research focuses on finding the overlapping part between the pictures captured by the cameras and using it as the basis for handing tracking over between cameras. In such approaches, an optimized SURF algorithm is used to match the overlapping parts of the pictures from two cameras, complete the target hand-over between the cameras and realize cross-camera tracking. However, when multiple targets appear in the pictures of different cameras, only cross-camera tracking of the overlapping parts can be realized and a definite, accurate id assigned there; no suitable solution is given for how the ids of targets in the non-overlapping parts should be assigned. For target detection, the prior art uses the frame-difference method, in which a difference operation on two consecutive frames of a video image sequence yields the contour of the moving target. This algorithm is simple to implement, has low programming complexity and runs fast, but it depends heavily on the chosen inter-frame interval and segmentation threshold, generalizes poorly and is easily constrained by the scene.
To realize multi-target tracking across multiple cameras and to improve the effect of target detection and tracking, the invention adopts YOLO-V4 combined with Deepsort, improves the target appearance feature extraction network in Deepsort, and introduces an attention mechanism to obtain better matching features.
Disclosure of Invention
Purpose of the invention: the invention aims to provide a target tracking method in a multi-camera scene that tracks multiple targets with high identification accuracy.
The technical scheme is as follows: the invention discloses a target tracking method under a multi-camera scene, which comprises the following steps:
s1: shooting a picture of a target object, and labeling the picture to obtain a first target object data set;
s2: the first target object data set and the collected target object associated data set are shuffled and mixed to obtain a total data set, and a YOLO-V4 model is trained on the total data set;
s3: shooting each target object at multiple angles to obtain pictures of each target object at different angles and obtain a second target object data set;
s4: an attention mechanism is introduced to improve a target appearance characteristic extraction network in a Deepsort algorithm;
s5: training the improved target appearance characteristic extraction network by using a second target object data set;
s6: combining the trained YOLO-V4 model with an improved Deepsort algorithm, obtaining a detection frame of a target object by using the YOLO-V4 model, and tracking the detected target object by using the improved Deepsort algorithm to obtain a target object tracking model;
s7: arranging a plurality of cameras at different positions, and tracking the target object to be tracked by applying the target object tracking model.
Beneficial effects: compared with the prior art, the invention has the following remarkable advantages:
the invention obtains the video spliced by multiple cameras by applying the SURF image splicing algorithm, and finally realizes multi-target tracking by applying the YOLO-V4 model in combination with the improved Deepsort algorithm in the video. The target appearance information extraction network in the Deepsort algorithm is optimized, a channel attention mechanism is introduced, and the vehicle weight identification accuracy is improved by 1.1% compared with the prior art. In conclusion, the multi-target tracking method and the multi-target tracking system realize multi-target tracking in a multi-camera scene, and the accuracy rate of vehicle weight identification is improved well.
Drawings
FIG. 1 compares the effect of the original target appearance information extraction network in the Deepsort algorithm with that of the improved network, both trained on the self-made vehicle re-identification data set of the present invention.
FIG. 2 shows the tracking effect on multiple intelligent cars in a single-camera scene using the YOLO-V4 model combined with the improved Deepsort algorithm.
FIG. 3 shows the tracking effect on multiple intelligent cars in a scene fused from two cameras using the YOLO-V4 model combined with the improved Deepsort algorithm.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings.
The method stitches the pictures from different cameras with an image stitching algorithm and then realizes multi-target tracking in the stitched video using YOLO-V4 combined with an improved Deepsort algorithm. For the data sets, a self-made intelligent-car data set and a self-made vehicle re-identification data set containing the intelligent cars are adopted. The specific steps are as follows:
S1: shooting photos of the intelligent cars and labeling them to obtain an intelligent-car data set;
S2: combining the intelligent-car data set with a collected vehicle data set to obtain a total data set, and training a YOLO-V4 model with this data set;
S3: shooting each intelligent car from multiple angles to obtain pictures of each car at different angles, cropping out the parts of the pictures containing the cars, and combining them with a collected vehicle re-identification data set to obtain a vehicle re-identification data set containing the intelligent cars;
S4: improving the target appearance feature extraction network in the Deepsort algorithm and introducing an attention mechanism;
S5: training the improved target appearance feature extraction network in the Deepsort algorithm with the vehicle re-identification data set;
S6: combining the trained YOLO-V4 model with the improved Deepsort algorithm to obtain a model capable of tracking the intelligent cars;
S7: stitching the videos shot by the multiple cameras with the SURF algorithm to obtain a stitched video, and tracking the intelligent cars in that video with the YOLO-V4 model combined with the improved Deepsort algorithm.
In step S1, the target detection effect of the YOLO-V4 model is closely related to the data set, so the data set must be sufficient. When making the data set, all situations in which the intelligent cars may appear in the scene need to be considered. Pictures of the intelligent cars were taken from different angles, at different shooting distances and in different scenes, giving 560 pictures containing the cars; the number, size and angle of the cars differ from picture to picture. The cars in the pictures were then labeled with data-set annotation software to obtain a label file for each picture. The collected partial car data set was combined with the self-made intelligent-car data set to obtain the final data set, of which 80% is used as the training set, 10% as the validation set and the remaining 10% as the test set.
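For illustration only, the 80%/10%/10% division described above could be produced by a short script like the following sketch; the directory name, file extension and function name are assumptions for this example and do not come from the patent:

import random
from pathlib import Path

def split_dataset(image_dir="dataset/images", seed=0):
    # gather the labeled pictures and shuffle them reproducibly
    images = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)
    n = len(images)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    train = images[:n_train]
    val = images[n_train:n_train + n_val]
    test = images[n_train + n_val:]
    return train, val, test

if __name__ == "__main__":
    train, val, test = split_dataset()
    print(len(train), len(val), len(test))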
In step S2, the target detection network used is YOLO-V4, a one-stage detection network of the YOLO series and an improved version of YOLOv3. It makes many small improvements on the basis of YOLOv3 and greatly raises the target detection accuracy without lowering the recognition rate. The main improvements of YOLO-V4 are as follows: 1. The YOLOv3 backbone feature extraction network Darknet53 is improved, and the activation function of DarknetConv2D is changed from LeakyReLU to Mish, where the Mish function is:
Mish(x) = x * tanh(ln(1 + e^x))
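For illustration only, the Mish activation defined above may be written as a small PyTorch module like the following sketch (the class name is arbitrary, and recent PyTorch releases also ship a built-in torch.nn.Mish):

import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    # Mish(x) = x * tanh(ln(1 + e^x)) = x * tanh(softplus(x))
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.tanh(F.softplus(x))

if __name__ == "__main__":
    x = torch.linspace(-3, 3, 7)
    print(Mish()(x))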
The network structure of Darknet53 is then modified to use the CSPnet structure, turning Darknet53 into CSPDarknet53. 2. The SPP and PANet structures are used. The SPP structure is connected after the last feature layer of CSPDarknet53: after three convolutions, max pooling at four different scales is applied, with pooling kernel sizes of 13x13, 9x9, 5x5 and 1x1, and the up-sampling and down-sampling network of PANet realizes repeated feature extraction. 3. The training part adopts the Mosaic data enhancement method: four pictures are read each time, each is flipped, scaled and shifted in color gamut, and the four pictures are arranged in four directions to form a new picture. 4. CIOU is used as the regression optimization LOSS. CIOU takes into account the distance between the target and the prior frame, the overlap rate, the scale and a penalty term, making the regression of the target frame more stable. It is calculated as:
CIOU = IOU - ρ²(b, b^gt)/c² - αv
where IOU is the intersection over union of the areas of the prediction frame and the real frame, ρ²(b, b^gt) is the squared distance between the center points of the prediction frame and the real frame, and c is the diagonal length of the smallest closed area that can contain both the prediction frame and the real frame. α and v are calculated as follows:
α = v / ((1 - IOU) + v)
v = (4/π²) * (arctan(w^gt/h^gt) - arctan(w/h))²
where w, h are the width and height of the prediction frame and w^gt, h^gt are those of the real frame. The corresponding LOSS is obtained from 1 - CIOU and is calculated as:
LOSS_CIOU = 1 - IOU + ρ²(b, b^gt)/c² + αv
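As a hedged illustration of the formulas above, the following PyTorch sketch computes the CIOU loss for boxes given in (cx, cy, w, h) form; it is a re-derivation written for this description, not code taken from the patent, and the function name and epsilon handling are assumptions:

import math
import torch

def ciou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    # pred, target: (N, 4) boxes in (cx, cy, w, h) format
    p_x1, p_y1 = pred[:, 0] - pred[:, 2] / 2, pred[:, 1] - pred[:, 3] / 2
    p_x2, p_y2 = pred[:, 0] + pred[:, 2] / 2, pred[:, 1] + pred[:, 3] / 2
    t_x1, t_y1 = target[:, 0] - target[:, 2] / 2, target[:, 1] - target[:, 3] / 2
    t_x2, t_y2 = target[:, 0] + target[:, 2] / 2, target[:, 1] + target[:, 3] / 2

    # IOU of the prediction frame and the real frame
    inter_w = (torch.min(p_x2, t_x2) - torch.max(p_x1, t_x1)).clamp(min=0)
    inter_h = (torch.min(p_y2, t_y2) - torch.max(p_y1, t_y1)).clamp(min=0)
    inter = inter_w * inter_h
    union = pred[:, 2] * pred[:, 3] + target[:, 2] * target[:, 3] - inter + eps
    iou = inter / union

    # squared center distance rho^2 and squared diagonal c^2 of the enclosing box
    rho2 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    c_w = torch.max(p_x2, t_x2) - torch.min(p_x1, t_x1)
    c_h = torch.max(p_y2, t_y2) - torch.min(p_y1, t_y1)
    c2 = c_w ** 2 + c_h ** 2 + eps

    # aspect-ratio consistency term v and trade-off weight alpha
    v = (4 / math.pi ** 2) * (torch.atan(target[:, 2] / (target[:, 3] + eps))
                              - torch.atan(pred[:, 2] / (pred[:, 3] + eps))) ** 2
    alpha = v / ((1 - iou) + v + eps)

    # LOSS_CIOU = 1 - IOU + rho^2/c^2 + alpha * v, averaged over the batch
    return (1 - iou + rho2 / c2 + alpha * v).mean()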
in step S3, the quality of the extraction capability of the target appearance feature extraction network in deep sort is closely related to the data set used for training the network, and therefore, a vehicle re-identification data set needs to be created. Every dolly all need take the picture of different angles separately, draws the position of the intelligent vehicle in the picture alone, and every dolly is about taking 40 pictures, then combines together the vehicle heavy identification data set that collects with the intelligent car heavy identification data set of self-control, obtains the vehicle heavy identification data set that contains the intelligent vehicle, and the data set contains 585 different vehicles, and every kind of vehicle possess about 40 pictures. Taking 90% as training set and 10% as testing set.
In steps S4-S5, the Deepsort target tracking algorithm is used and improved. Deepsort is an improvement of the Sort algorithm. Sort feeds the IOU between detection frames and tracking frames into the Hungarian algorithm for linear assignment to associate ids between frames; although this tracking is precise and accurate, it easily causes id switching. Deepsort therefore adds the appearance information of the target to the matching computation, so that an id can still be matched correctly after the target is occluded and reappears, effectively reducing frequent id switches. The 128-dimensional feature vector corresponding to a detection frame is computed by a convolutional neural network that extracts the target appearance information, and how well this network extracts appearance information directly affects the tracking effect. In this patent the main target to be identified is the intelligent car, so a suitable convolutional neural network needs to be trained to extract its appearance information. To strengthen the feature extraction capability of the network, the invention improves the Deepsort feature extraction network by introducing the channel attention network ECA-Net after the original residual blocks. ECA-Net provides a local cross-channel interaction strategy without dimensionality reduction and a method for adaptively selecting the size of a one-dimensional convolution kernel, thereby improving performance.
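The channel attention block mentioned above could take roughly the following form; this is a minimal PyTorch sketch of an ECA layer in which the commonly used ECA-Net defaults gamma = 2 and b = 1 for the adaptive kernel size are assumed rather than taken from the patent:

import math
import torch
import torch.nn as nn

class ECALayer(nn.Module):
    # Efficient Channel Attention: global average pooling, a 1D convolution
    # across channels (no dimensionality reduction), then a sigmoid gate.
    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # adaptively choose an odd 1D kernel size from the channel count
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.pool(x)                          # (N, C, 1, 1)
        y = y.squeeze(-1).transpose(-1, -2)       # (N, 1, C)
        y = self.conv(y)                          # local cross-channel interaction
        y = self.sigmoid(y).transpose(-1, -2).unsqueeze(-1)  # (N, C, 1, 1)
        return x * y                              # rescale each channel

if __name__ == "__main__":
    feat = torch.randn(2, 128, 16, 8)
    print(ECALayer(128)(feat).shape)              # torch.Size([2, 128, 16, 8])

Per the description above, such a layer is inserted after the original residual blocks of the appearance feature extraction network.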
In step S6, the YOLO-V4 model is combined with the improved Deepsort algorithm, and multi-target tracking is carried out in a single-camera scene.
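Purely as glue-code illustration, single-camera tracking as in step S6 might be organized like the sketch below; detect_fn and tracker are hypothetical wrappers standing in for the trained YOLO-V4 detector and the improved Deepsort tracker, and their interfaces are assumptions, not APIs defined by the patent:

import cv2

def track_video(video_path, detect_fn, tracker, out_path="tracked.avi"):
    # detect_fn(frame) is assumed to return [(x, y, w, h, score, class_id), ...]
    # tracker.update(detections, frame) is assumed to return [(track_id, x, y, w, h), ...]
    cap = cv2.VideoCapture(video_path)
    writer = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        detections = detect_fn(frame)
        tracks = tracker.update(detections, frame)
        for tid, x, y, w, h in tracks:
            cv2.rectangle(frame, (int(x), int(y)), (int(x + w), int(y + h)), (0, 255, 0), 2)
            cv2.putText(frame, f"id {tid}", (int(x), int(y) - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        if writer is None:
            fourcc = cv2.VideoWriter_fourcc(*"XVID")
            writer = cv2.VideoWriter(out_path, fourcc, 25.0,
                                     (frame.shape[1], frame.shape[0]))
        writer.write(frame)
    cap.release()
    if writer is not None:
        writer.release()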
In step S7, to make the stitching accurate, robust and real-time, the SURF algorithm is used to extract the feature points of the image sequences. Among the current mainstream image stitching algorithms, SURF has the advantages of high speed and high matching quality, hence its name, speeded-up robust features. The idea of multi-camera video stitching is as follows: first read the pictures captured by each camera, then stitch the captured pictures with the SURF algorithm to obtain stitched pictures, and finally fuse all the stitched pictures to obtain the final multi-camera fused video. The intelligent cars in this video are then tracked with the YOLO-V4 model combined with the improved Deepsort algorithm.
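A minimal OpenCV sketch of stitching one pair of overlapping camera frames with SURF follows; it assumes an opencv-contrib build (SURF lives in cv2.xfeatures2d and may be unavailable or patent-restricted in default builds), and the ratio test and RANSAC thresholds are illustrative choices:

import cv2
import numpy as np

def stitch_pair(img_left, img_right, ratio=0.75, min_matches=10):
    # detect SURF keypoints and descriptors in both frames
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # requires opencv-contrib
    kp1, des1 = surf.detectAndCompute(img_left, None)
    kp2, des2 = surf.detectAndCompute(img_right, None)

    # match the right frame against the left frame and keep good matches
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des2, des1, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    if len(good) < min_matches:
        raise RuntimeError("not enough feature matches to stitch")

    # estimate the homography mapping the right frame onto the left frame
    src = np.float32([kp2[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # warp the right frame and overlay the left frame to form the stitched picture
    h, w = img_left.shape[:2]
    pano = cv2.warpPerspective(img_right, H, (w + img_right.shape[1], h))
    pano[0:h, 0:w] = img_left
    return pano

Stitching pictures from more than two cameras can be done by applying such a pairwise stitch repeatedly and then fusing the results, as described above.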
To better demonstrate the technical effect of the invention, the classification performance of the trained YOLO-V4 network was evaluated; the results are shown in Tables 1 and 2. In Table 1, AP is the average precision, reflecting the recognition accuracy of the YOLO-V4 network for a single category; mAP is the mean of the AP values over all categories, reflecting its accuracy over all categories; and F1 is a comprehensive evaluation index of the model, reflecting the classification effect of the YOLO-V4 network.
TABLE 1 Classification Performance index of trained YOLO-V4 network
TABLE 2 Deepsort target appearance information extraction network comparison before and after improvement
Accuracy in Table 2 is the classification accuracy: the larger it is, the stronger the extraction capability of the target appearance information extraction network. Loss is the value of the loss function: the smaller it is, the stronger the extraction capability of the network.
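For reference, the precision, recall and F1 mentioned above can be computed from true-positive, false-positive and false-negative counts as in this small sketch; the counts themselves would come from matching detections to ground truth at a chosen IOU threshold, and the example numbers are illustrative only:

def precision_recall_f1(tp: int, fp: int, fn: int):
    # precision, recall and F1 from detection match counts
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

if __name__ == "__main__":
    print(precision_recall_f1(tp=90, fp=10, fn=15))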

Claims (5)

1. A target tracking method under a multi-camera scene is characterized by comprising the following steps:
s1: shooting a picture of a target object, and labeling the picture to obtain a first target object data set;
s2: the first target object data set and the collected target object associated data set are shuffled and mixed to obtain a total data set, and a YOLO-V4 model is trained on the total data set;
s3: shooting each target object at multiple angles to obtain pictures of each target object at different angles and obtain a second target object data set;
s4: an attention mechanism is introduced to improve a target appearance characteristic extraction network in a Deepsort algorithm;
s5: training the improved target appearance characteristic extraction network by using a second target object data set;
s6: combining the trained YOLO-V4 model with an improved Deepsort algorithm, obtaining a detection frame of a target object by using the YOLO-V4 model, and tracking the detected target object by using the improved Deepsort algorithm to obtain a target object tracking model;
s7: arranging a plurality of cameras at different positions, and tracking the target object to be tracked by applying the target object tracking model.
2. The method for tracking the target under the multi-camera scene according to claim 1, wherein the step S2 of building the YOLO-V4 model comprises:
(1) improving the YOLOv3 backbone feature extraction network Darknet53 and changing the activation function of DarknetConv2D from LeakyReLU to Mish, where the Mish function is:
Mish(x) = x * tanh(ln(1 + e^x))
then modifying the network structure of Darknet53 to use the CSPnet structure, turning Darknet53 into CSPDarknet53;
(2) connecting the SPP structure after the last feature layer of CSPDarknet53: after three convolutions, applying max pooling at four different scales with pooling kernel sizes of 13x13, 9x9, 5x5 and 1x1, and using the up-sampling and down-sampling network of PANet to realize repeated feature extraction;
(3) adopting the Mosaic data enhancement method in the training part: reading several pictures each time, flipping, scaling and color-gamut shifting each of them, and arranging them in different directions to form new pictures;
(4) CIOU is used as regression optimization LOSS, and the calculation formula is as follows:
CIOU = IOU - ρ²(b, b^gt)/c² - αv
where IOU is the intersection over union of the areas of the prediction frame and the real frame, ρ²(b, b^gt) is the squared distance between the center points of the prediction frame and the real frame, and c is the diagonal length of the smallest closed area that can simultaneously contain the prediction frame and the real frame; α and v are calculated as follows:
α = v / ((1 - IOU) + v)
v = (4/π²) * (arctan(w^gt/h^gt) - arctan(w/h))²
where w, h are the width and height of the prediction frame and w^gt, h^gt are those of the real frame; the corresponding LOSS is obtained from 1 - CIOU and is calculated as:
LOSS_CIOU = 1 - IOU + ρ²(b, b^gt)/c² + αv
3. the method for tracking the target under the multi-camera scenario as claimed in claim 1, wherein the attention mechanism of step S4 is a channel attention mechanism network ECA-Net.
4. The method for tracking the target under the multi-camera scene according to claim 1, wherein step S7 further comprises: first reading the pictures captured by each camera, stitching the captured pictures with a SURF algorithm, and then fusing all the stitched pictures to obtain a final multi-camera fused video.
5. The method for tracking the target under the multi-camera scene according to claim 1, wherein the target object is an intelligent car and the target object associated data are data of other objects similar in shape to the intelligent car.
CN202110275199.9A 2021-03-15 2021-03-15 Target tracking method under multi-camera scene Withdrawn CN112927267A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110275199.9A CN112927267A (en) 2021-03-15 2021-03-15 Target tracking method under multi-camera scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110275199.9A CN112927267A (en) 2021-03-15 2021-03-15 Target tracking method under multi-camera scene

Publications (1)

Publication Number Publication Date
CN112927267A (en) 2021-06-08

Family

ID=76174965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110275199.9A Withdrawn CN112927267A (en) 2021-03-15 2021-03-15 Target tracking method under multi-camera scene

Country Status (1)

Country Link
CN (1) CN112927267A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114882393A (en) * 2022-03-29 2022-08-09 华南理工大学 Road reverse running and traffic accident event detection method based on target detection
CN114882351A (en) * 2022-03-31 2022-08-09 河海大学 Multi-target detection and tracking method based on improved YOLO-V5s
CN114882351B (en) * 2022-03-31 2024-04-26 河海大学 Multi-target detection and tracking method based on improved YOLO-V5s
CN115035251A (en) * 2022-06-16 2022-09-09 中交第二航务工程局有限公司 Bridge deck vehicle real-time tracking method based on domain-enhanced synthetic data set
CN115035251B (en) * 2022-06-16 2024-04-09 中交第二航务工程局有限公司 Bridge deck vehicle real-time tracking method based on field enhanced synthetic data set
CN116993779A (en) * 2023-08-03 2023-11-03 重庆大学 Vehicle target tracking method suitable for monitoring video
CN116993779B (en) * 2023-08-03 2024-05-14 重庆大学 Vehicle target tracking method suitable for monitoring video


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210608