CN113077496A - Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium


Info

Publication number
CN113077496A
CN113077496A (application CN202110413744.6A)
Authority
CN
China
Prior art keywords
tracking
algorithm
vehicle
vehicles
lightweight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110413744.6A
Other languages
Chinese (zh)
Inventor
李智军
程琦云
李国欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202110413744.6A priority Critical patent/CN113077496A/en
Publication of CN113077496A publication Critical patent/CN113077496A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 Detecting or categorising vehicles


Abstract

The invention provides a real-time vehicle detection and tracking method based on lightweight YOLOv3, which comprises the following steps. Step 1: detect the vehicles in a traffic video with a lightweight YOLOv3-based algorithm, and mark the positions of the vehicles in the video with prior boxes. Step 2: track the vehicle positions at the next moment with a Kalman filtering algorithm. Step 3: on the basis of the Kalman filter tracking, determine a unique label ID for each detected target with the Hungarian matching algorithm, achieving accurate localization and tracking of multiple targets. The overall scheme of the invention is strongly robust, has a low miss rate, extends easily to multiple vehicle categories, and meets the requirements of vehicle detection and continuous tracking in surveillance video.

Description

Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium
Technical Field
The invention relates to the technical field of deep learning, in particular to a real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and a medium.
Background
With the rapid growth in the number of automobiles, many researchers have studied advanced driver assistance systems (ADAS) intensively, and vehicle detection has become a focus of ADAS research. Effective detection and tracking of vehicles on the road ahead is an important component of the judgment and early-warning functions of a safe driver assistance system, and miniaturization of the detection model is the prerequisite for fast, real-time operation on vehicle-mounted embedded devices.
In target detection algorithms based on traditional methods, the classifier built on hand-designed feature extractors generalizes poorly: different features must be designed and selected for different scenes, designing reasonable features is difficult, the computational complexity is high, and practical application is therefore limited.
Detection algorithms based on deep learning fall into two classes: region-based methods and regression-based methods. Region-based methods generate candidate regions with a selective search algorithm and then classify them with a convolutional neural network; the main representatives are R-CNN, Fast R-CNN and the like. These methods detect in two steps and achieve high detection accuracy, but their networks are complex and their detection speed is low. Regression-based methods such as SSD and YOLO treat target detection as a regression problem and directly regress the object class probabilities and coordinate positions. However, their network structures are still large, so porting and deploying them to in-vehicle embedded devices suffers from low running speed, high deployment cost and similar drawbacks.
Disclosure of Invention
In view of the defects in the prior art, the present invention aims to provide a real-time vehicle detection and tracking method, system and medium based on lightweight YOLOv3.
The invention provides a real-time vehicle detection and tracking method based on lightweight YOLOv3, which comprises the following steps:
step 1: detecting the vehicles in the traffic video by adopting a lightweight YOLOv3-based algorithm, and marking the positions of the vehicles in the video with prior boxes;
step 2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm;
and step 3: on the basis of the Kalman filtering algorithm tracking, the unique label ID of the detected target is determined by using the Hungarian matching algorithm, and accurate positioning and tracking of a plurality of targets are realized.
Preferably, the backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and the residual network structure attached to each convolutional layer employs fewer repeated residual units.
Preferably, the step 1 comprises the steps of:
step 1.1: performing K-means++ clustering on the vehicle boxes in the training set, selecting three size classes of vehicle boxes, selecting three differently shaped boxes within each size class, and taking the nine resulting box shapes as prior boxes;
step 1.2: acquiring a single-frame video image in a traffic video;
step 1.3: predicting the coordinates of the candidate vehicles with the target detection algorithm based on lightweight YOLOv3, and framing all candidate vehicles with prior boxes that fit the vehicle sizes.
Preferably, in step 2, a Kalman filtering algorithm is used to predict the position of the box at the next moment, and the state of the Kalman filter is updated.
The invention also provides a real-time vehicle detection and tracking system based on the lightweight YOLOv3, which comprises the following modules:
module M1: detecting vehicles in the traffic video by adopting a lightweight YOLOv3-based algorithm, and marking the positions of the vehicles with prior boxes;
module M2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm;
module M3: on the basis of the Kalman filtering algorithm tracking, the unique label ID of the detected target is determined by using the Hungarian matching algorithm, and accurate positioning and tracking of a plurality of targets are realized.
Preferably, the backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and the residual network structure attached to each convolutional layer employs fewer repeated residual units.
Preferably, the module M1 includes the following modules:
module M1.1: performing K-means++ clustering on the vehicle boxes in the training set, selecting three size classes of vehicle boxes, selecting three differently shaped boxes within each size class, and taking the nine resulting box shapes as prior boxes;
module M1.2: acquiring a single-frame video image in a traffic video;
module M1.3: predicting the coordinates of the candidate vehicles with the target detection algorithm based on lightweight YOLOv3, and framing all candidate vehicles with prior boxes that fit the vehicle sizes.
Preferably, the module M2 uses a Kalman filtering algorithm to predict the position of the box at the next moment, and updates the state of the Kalman filter.
The invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method set forth above.
Compared with the prior art, the invention has the following beneficial effects:
1. the invention uses the lightweight YOLOv3 algorithm for vehicle detection; vehicles of different sizes can be detected and their positions framed without manually designed features, so the feature-selection process is omitted, the extracted features are of higher quality, and the vehicle detection is more robust in complex scenes with a low miss rate;
2. the invention improves the traditional YOLOv3 algorithm: with the average precision essentially unchanged, the network parameters are reduced, the network model shrinks to 1/4 of the original YOLOv3, and the detection speed doubles;
3. the method adopts a Kalman filtering algorithm to track the vehicle position at the next moment after target detection;
4. on the basis of the Kalman filtering algorithm, the Hungarian matching algorithm associates and matches vehicles in adjacent video frames and determines the unique label ID of each detected target, achieving accurate localization and tracking of multiple targets and mitigating unstable detection such as discontinuous detection, missed detection and target occlusion;
5. the detection and tracking algorithm is highly robust in complex road environments and can meet the accuracy and speed requirements of vehicle detection and tracking in actual intelligent driving.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of a neural network architecture of the present invention;
FIG. 3 is a diagram of the residual network architecture of the present invention;
FIG. 4 is a diagram illustrating the effect of the present invention on the KITTI training set;
FIG. 5 is a diagram of the effect of the model of the present invention on the validation set.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that various changes and modifications can be made by those skilled in the art without departing from the spirit of the invention, all of which fall within the scope of the present invention.
Referring to fig. 1, a real-time vehicle detection and tracking method in a traffic video is divided into a vehicle detection module and a vehicle tracking module.
Referring to fig. 3, for vehicle detection the invention provides a target detection algorithm based on a lightweight YOLOv3. The algorithm improves on the traditional YOLOv3 network structure: instead of taking darknet53 as the backbone network, it borrows from the YOLOv3-tiny structure, which is similar to darknet19. The backbone network uses 7 convolutional layers, and the residual network structure attached to each convolutional layer uses fewer repeated residual units, so the network depth is reduced by cutting convolutional layers. For the prior box sizes, a modified K-means++ algorithm selects suitable sizes. On the final network output, logistic regression is used so that the output prediction boxes better cover the vehicles to be detected.
The improved lightweight YOLOv3 network structure is shown in fig. 2 and can be divided into a backbone network and a detection head. First, the backbone network extracts features from the image. The input size of the backbone network is 416 × 416 × 3, and a convolutional layer with 16 3 × 3 convolution kernels performs the initial feature extraction. The convolution formula for feature extraction is:
a_{i,j} = Σ_c Σ_m Σ_n w_{c,m,n} · x_{c,i+m,j+n} + w_b,
where a_{i,j} is the value at coordinate (i, j) in the feature map; w_{c,m,n} is the value of the convolution kernel at coordinate (m, n) of channel c; x_{c,i+m,j+n} is the input value at coordinate (i + m, j + n) of channel c; and w_b is the bias term of the convolution kernel.
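As an illustration only, a minimal NumPy sketch of the convolution sum above (single output channel, no padding or stride; the function name is hypothetical):

```python
import numpy as np

def conv2d_single(x, w, w_b):
    """Direct evaluation of a_{i,j} = sum_{c,m,n} w_{c,m,n} * x_{c,i+m,j+n} + w_b.

    x:   input of shape (C, H, W)
    w:   one convolution kernel of shape (C, K, K)
    w_b: scalar bias term of the kernel
    Returns the feature map of shape (H - K + 1, W - K + 1).
    """
    C, H, W = x.shape
    _, K, _ = w.shape
    a = np.zeros((H - K + 1, W - K + 1))
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            # sum over channels c and kernel offsets (m, n)
            a[i, j] = np.sum(w * x[:, i:i + K, j:j + K]) + w_b
    return a
```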
The feature-layer output of the previous layer serves as the input of the next layer. Filtering then continues with 32 3 × 3 convolution kernels of stride 2, and a residual block is adopted to strengthen the feature extraction; as shown in fig. 3, the residual block is formed by connecting 16 1 × 1 convolution kernels and 32 3 × 3 convolution kernels, both of stride 1. The residual network formula is:
out = f_2[f_1(in)] + in,
where in is the input, f_1 is the 1 × 1 convolutional layer, f_2 is the 3 × 3 convolutional layer, and out is the output.
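As a minimal PyTorch sketch of one such residual block, assuming 16 1 × 1 kernels followed by 32 3 × 3 kernels with a skip connection (the normalization and activation are illustrative assumptions, not the patent's exact configuration):

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """out = f2(f1(in)) + in, as in the residual formula above."""
    def __init__(self, channels=32, hidden=16):
        super().__init__()
        self.f1 = nn.Sequential(  # 1x1 bottleneck
            nn.Conv2d(channels, hidden, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.LeakyReLU(0.1),
        )
        self.f2 = nn.Sequential(  # 3x3 convolution back to the input width
            nn.Conv2d(hidden, channels, kernel_size=3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(0.1),
        )

    def forward(self, x):
        return self.f2(self.f1(x)) + x  # skip connection
```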
Following the same principle, the subsequent network stages are filtered in turn with 64, 128, 256, 512 and 1024 3 × 3 convolution kernels of stride 2, so that the width and height of the feature-extraction layers shrink while the depth grows, allowing deeper features to be extracted. Meanwhile, 2, 4 and 4 residual blocks are adopted in sequence to connect the first 4 subsequent convolution stages, strengthening feature extraction. Compared with YOLOv3, a large number of network layers is removed, the network structure is miniaturized, and both learning and forward-inference speed are increased.
Referring to fig. 4 and 5, next comes the detection head, which performs target detection on the feature maps produced by the backbone network and outputs the positions of the detection boxes.
The detection head separately taps 3 feature-extraction convolutional layers, of sizes 52 × 52, 26 × 26 and 13 × 13 and depths 128, 256 and 1024 respectively. The 13 × 13 feature layer is convolved in turn with 512 1 × 1 convolution kernels, 1024 3 × 3 convolution kernels and 18 1 × 1 convolution kernels, so the output of this branch is 13 × 13 × 18. Each 1 × 1 × 18 vector is responsible for judging whether the object in its receptive field is a car and, if so, simultaneously gives the center coordinates and size of the prediction box.
The 13 × 13 feature-extraction layer does not only output its own result: its features are also enlarged to 26 × 26 by upsampling, concatenated with the original 26 × 26 feature layer, and output simultaneously after the combined 26 × 26 layer is convolved. In the same way the original 26 × 26 layer is connected with the 52 × 52 layer for output, yielding multi-scale target detection. A sketch of this wiring follows.
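The following PyTorch sketch illustrates the upsample-and-concatenate wiring for the 13 × 13 → 26 × 26 branch only (the 26 × 26 → 52 × 52 branch works the same way; the lateral channel count of 128 and the module names are assumptions, not values given in the patent):

```python
import torch
import torch.nn as nn

class DetectionBranch(nn.Module):
    """13x13 head plus upsampled connection into the 26x26 feature layer."""
    def __init__(self):
        super().__init__()
        # 13x13 branch: 512 1x1 -> 1024 3x3 -> 18 1x1, giving a 13x13x18 output
        self.head13 = nn.Sequential(
            nn.Conv2d(1024, 512, 1), nn.LeakyReLU(0.1),
            nn.Conv2d(512, 1024, 3, padding=1), nn.LeakyReLU(0.1),
            nn.Conv2d(1024, 18, 1),
        )
        self.lateral = nn.Conv2d(1024, 128, 1)     # reduce depth before upsampling
        self.up = nn.Upsample(scale_factor=2)      # 13x13 -> 26x26
        self.head26 = nn.Conv2d(256 + 128, 18, 1)  # applied after concatenation

    def forward(self, feat26, feat13):
        out13 = self.head13(feat13)                # 13x13x18 predictions
        up = self.up(self.lateral(feat13))         # upsampled 13x13 features
        merged = torch.cat([feat26, up], dim=1)    # concat with 26x26 layer
        out26 = self.head26(merged)                # 26x26x18 predictions
        return out13, out26
```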
As for the size of the output prediction boxes, a clustering algorithm selects suitable prior box sizes from the training set so that cars of different scales in different videos can be predicted. The invention computes the clusters with the K-means++ algorithm, which improves the selection of the initial points of the K-means algorithm so that the cluster centers are as far apart as possible. The key steps of the K-means++ algorithm are as follows (a code sketch is given after the steps):
step S1: randomly take one sample from the data set as the initial cluster center u_1;
step S2: for each sample x_i of the data X, compute the distance D(x_i) to the nearest cluster center,
D(x_i) = min_r |x_i − u_r|;
step S3: compute the probability of each sample being selected as the next cluster center,
P(x_i) = D(x_i)^2 / Σ_{x∈X} D(x)^2,
and select the next cluster center accordingly;
step S4: repeat steps S2 and S3 until k cluster centers have been selected.
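A minimal NumPy sketch of steps S1-S4, assuming each sample is a (width, height) pair of a training box (the function name and seeding are hypothetical):

```python
import numpy as np

def kmeans_pp_init(X, k, seed=0):
    """K-means++ initialization following steps S1-S4 above.

    X: array of shape (N, 2), e.g. (width, height) of training boxes.
    Returns k initial cluster centers of shape (k, 2).
    """
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]   # S1: random first center
    while len(centers) < k:               # S4: repeat until k centers
        # S2: distance of every sample to its nearest chosen center
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        # S3: pick the next center with probability D(x)^2 / sum D(x)^2
        p = d ** 2 / np.sum(d ** 2)
        centers.append(X[rng.choice(len(X), p=p)])
    return np.array(centers)
```

The nine prior boxes would then come from running standard K-means from these initial centers.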
After the 9 prior boxes are obtained from the cluster centers, the larger 52 × 52 feature-extraction convolutional layer, which has the smallest receptive field, adopts the 3 smaller prior boxes; the middle 26 × 26 feature-extraction convolutional layer adopts the 3 middle prior boxes; and the smaller 13 × 13 feature-extraction convolutional layer, which has the largest receptive field, adopts the 3 larger prior boxes.
After the neural network outputs the prediction boxes, a logistic regression function performs confidence regression on every prior box at the different scales so that the prediction boxes better cover the detected targets: the box values of the object are predicted and the most suitable target-category region is selected according to confidence. The logistic regression prediction formulas are:
b_x = σ(t_x) + c_x,  b_y = σ(t_y) + c_y,  b_w = p_w · e^{t_w},  b_h = p_h · e^{t_h},
where c_x, c_y are the coordinate offsets of the grid cell relative to the image origin; p_w, p_h are the side lengths (width and height) of the prior box; t_x, t_y, t_w, t_h are the target values learned by the deep network; and b_x, b_y, b_w, b_h are the coordinate values of the final predicted box computed by the formulas.
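For illustration, a NumPy sketch of this decoding for one grid cell (σ is the logistic function; the function and argument names are hypothetical):

```python
import numpy as np

def decode_box(t, c_xy, p_wh):
    """Decode network outputs (t_x, t_y, t_w, t_h) into a predicted box.

    t:    array (t_x, t_y, t_w, t_h) learned by the network
    c_xy: offset (c_x, c_y) of the grid cell
    p_wh: side lengths (p_w, p_h) of the matching prior box
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    b_x = sigmoid(t[0]) + c_xy[0]   # b_x = sigma(t_x) + c_x
    b_y = sigmoid(t[1]) + c_xy[1]   # b_y = sigma(t_y) + c_y
    b_w = p_wh[0] * np.exp(t[2])    # b_w = p_w * e^{t_w}
    b_h = p_wh[1] * np.exp(t[3])    # b_h = p_h * e^{t_h}
    return b_x, b_y, b_w, b_h
```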
In the vehicle tracking module, the invention uses the Kalman filtering algorithm and the Hungarian algorithm to achieve accurate localization and tracking of multiple targets.
From the prior box detected by the lightweight YOLOv3 target detection algorithm, the centroid coordinates of the box are computed; the Kalman filtering algorithm then predicts the position of the box at the next moment, and the state of the Kalman filter is updated.
The Kalman filtering algorithm predicts the coordinate position of the target at the current moment from the coordinate position of the vehicle detected at the previous moment. First, the centroid coordinates (x, y) of the detected object are computed from the vehicle box given by the deep-learning detection algorithm and expressed as the current target state X_{t|t}; X_{t-1|t-1} is the target state at the previous moment; X_{t|t-1} is the predicted target state at the current moment; the observation Z_t is the centroid coordinate of the actually detected object; P_{t|t} is the estimation-error covariance at the current moment; P_{t|t-1} is the estimation-error covariance at the current moment as predicted from the previous moment; A is the state-transition matrix; H is the observation matrix; K_t is the Kalman gain matrix; W_{t-1|t-1} is the excitation (process) noise at the previous moment; and Q, R are the covariance matrices of the excitation noise and the observation noise respectively. The Kalman filtering and tracking formulas are:
X_{t|t-1} = A X_{t-1|t-1} + W_{t-1|t-1},  (1)
P_{t|t-1} = A P_{t-1|t-1} A^T + Q,  (2)
X_{t|t} = X_{t|t-1} + K_t (Z_t − H X_{t|t-1}),  (3)
P_{t|t} = P_{t|t-1} − K_t H P_{t|t-1},  (4)
K_t = P_{t|t-1} H^T (H P_{t|t-1} H^T + R)^{−1}.  (5)
The position at the current moment of the vehicle detected at the previous moment is predicted with equations (1) and (2), and the state of the Kalman filter is updated with equations (3), (4) and (5).
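A compact NumPy sketch of equations (1)-(5) for a constant-velocity centroid state; the concrete matrices A, H, Q, R below are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

class CentroidKalman:
    """Kalman tracker over state X = (x, y, vx, vy), observing (x, y)."""
    def __init__(self, x, y, dt=1.0):
        self.X = np.array([x, y, 0.0, 0.0])
        self.P = np.eye(4)
        self.A = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], float)  # state transition
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)  # observation
        self.Q = 0.01 * np.eye(4)                                # process noise cov
        self.R = 1.0 * np.eye(2)                                 # observation noise cov

    def predict(self):
        self.X = self.A @ self.X                                 # eq. (1)
        self.P = self.A @ self.P @ self.A.T + self.Q             # eq. (2)
        return self.X[:2]                                        # predicted centroid

    def update(self, z):
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)                 # eq. (5)
        self.X = self.X + K @ (z - self.H @ self.X)              # eq. (3)
        self.P = self.P - K @ self.H @ self.P                    # eq. (4)
```

Each tracked vehicle would hold one such filter: predict() gives the centroid expected in the current frame, and update() is called with the centroid actually detected there.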
On the basis of using the Kalman filtering algorithm to track the vehicle position at the next moment after target detection, vehicles in adjacent video frames are associated and matched with the Hungarian matching algorithm, the unique label ID of each detected target is determined, and accurate localization and tracking of multiple targets is achieved.
When performing association matching, the Hungarian optimal matching algorithm takes the Euclidean distances between the two different coordinate sets as the cost matrix and then performs feature association with the Hungarian algorithm: the minimum value d_min of the Euclidean distance between each centroid coordinate predicted at the previous moment and the detection coordinates at the current moment is solved, and the predicted coordinates of the previous moment are assigned to and associated with the detection coordinates of the current moment. The Euclidean distance is computed as
d = √((x_p − x_d)^2 + (y_p − y_d)^2),
where {(x_p, y_p)} is the set of prediction-box centroid coordinates predicted from the previous moments and {(x_d, y_d)} is the set of prediction-box centroid coordinates detected at the current moment.
When a detection value at the current moment is not assigned to any predicted value from the previous moment, i.e. the number of predicted coordinates at the previous moment is smaller than the number of detected coordinates at the current moment, the detection is tracked as a new target. The specific formula is:
n_{t-1} < n_t,
where n_{t-1} is the number of predicted coordinates at time t−1 and n_t is the number of actually detected coordinates at time t.
In actual tracking, considering missed detections and tracking failure, a track is judged lost when the computed Euclidean distance exceeds a set threshold or the vehicle object is not detected for several consecutive frames. The specific formula is:
f > f_max ∨ d > d_max,
where f is the number of consecutive frames in which the target is not detected; f_max is the maximum number of lost frames; d is the Euclidean distance; and d_max is the maximum distance threshold.
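A sketch of the association and track-management logic using SciPy's Hungarian solver linear_sum_assignment (the threshold value and all helper names are hypothetical):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(pred, det, d_max=50.0):
    """Match predicted centroids to detected centroids.

    pred: array (n_{t-1}, 2) of centroids predicted by the Kalman filters
    det:  array (n_t, 2) of centroids detected in the current frame
    Returns (matches, new_targets, lost_tracks) as index lists.
    """
    if len(pred) == 0:
        return [], list(range(len(det))), []
    # Cost matrix of pairwise Euclidean distances
    cost = np.linalg.norm(pred[:, None, :] - det[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)   # Hungarian algorithm
    matches = [(r, c) for r, c in zip(rows, cols)
               if cost[r, c] <= d_max]         # reject pairs with d > d_max
    matched_r = {r for r, _ in matches}
    matched_c = {c for _, c in matches}
    new_targets = [c for c in range(len(det)) if c not in matched_c]
    lost_tracks = [r for r in range(len(pred)) if r not in matched_r]
    return matches, new_targets, lost_tracks
```

Unmatched detections correspond to the n_{t-1} < n_t case and start new tracks; each track in lost_tracks increments its missed-frame counter f and is dropped once f > f_max.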
The invention also provides a real-time vehicle detection and tracking system based on the lightweight YOLOv3, which comprises the following modules. Module M1: detecting vehicles in the traffic video by adopting a lightweight YOLOv3-based algorithm, and marking the positions of the vehicles with prior boxes. Module M2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm. Module M3: on the basis of the Kalman filter tracking, determining the unique label ID of each detected target with the Hungarian matching algorithm, achieving accurate localization and tracking of multiple targets.
The backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and the residual network structure attached to each convolutional layer employs fewer repeated residual units.
Module M1 includes the following modules. Module M1.1: performing K-means++ clustering on the vehicle boxes in the training set, selecting three size classes of vehicle boxes, selecting three differently shaped boxes within each size class, and taking the nine resulting box shapes as prior boxes. Module M1.2: acquiring single-frame video images from the traffic video. Module M1.3: predicting the coordinates of the candidate vehicles with the target detection algorithm based on lightweight YOLOv3, and framing all candidate vehicles with prior boxes that fit the vehicle sizes.
Module M2 uses the Kalman filtering algorithm to predict the position of the box at the next moment while updating the state of the Kalman filter.
The invention also provides a computer-readable storage medium having a computer program stored thereon which, when executed by a processor, carries out the steps of the method described above.
Those skilled in the art will appreciate that, in addition to being implemented as pure computer-readable program code, the system and its various devices, modules and units provided by the invention can be fully implemented by logically programming the method steps, in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and its various devices, modules and units can be regarded as a hardware component; the devices, modules and units included in it for realizing various functions can be regarded as structures within the hardware component; and the devices, modules and units for performing the various functions can also be regarded as both software modules implementing the method and structures within the hardware component.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (9)

1. A real-time vehicle detection and tracking method based on lightweight YOLOv3 is characterized by comprising the following steps:
step 1: detecting the vehicles in the traffic video by adopting a lightweight YOLOv3-based algorithm, and marking the positions of the vehicles in the video with prior boxes;
step 2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm;
and step 3: on the basis of the Kalman filtering algorithm tracking, the unique label ID of the detected target is determined by using the Hungarian matching algorithm, and accurate positioning and tracking of a plurality of targets are realized.
2. The method as claimed in claim 1, wherein the backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and in the residual network structure of each convolutional layer, fewer repeated residual units are employed.
3. The method for detecting and tracking the vehicle in real time based on the lightweight YOLOv3 as claimed in claim 1, wherein the step 1 comprises the following steps:
step 1.1: performing K-means++ clustering on the vehicle boxes in the training set, selecting three size classes of vehicle boxes, selecting three differently shaped boxes within each size class, and taking the nine resulting box shapes as prior boxes;
step 1.2: acquiring a single-frame video image in a traffic video;
step 1.3: predicting the coordinates of the candidate vehicles with the target detection algorithm based on lightweight YOLOv3, and framing all candidate vehicles with prior boxes that fit the vehicle sizes.
4. The method as claimed in claim 1, wherein step 2 uses a Kalman filtering algorithm to predict the position of the box at the next moment, and updates the state of the Kalman filter.
5. A real-time vehicle detection and tracking system based on lightweight YOLOv3 is characterized by comprising the following modules:
module M1: detecting vehicles in the traffic video by adopting a lightweight YOLOv3-based algorithm, and marking the positions of the vehicles with prior boxes;
module M2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm;
module M3: on the basis of the Kalman filtering algorithm tracking, the unique label ID of the detected target is determined by using the Hungarian matching algorithm, and accurate positioning and tracking of a plurality of targets are realized.
6. The system of claim 5, wherein the backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and fewer repeating residual units are employed in the residual network structure of each convolutional layer.
7. The system for real-time vehicle detection and tracking based on the lightweight YOLOv3 of claim 5, wherein the module M1 comprises the following modules:
module M1.1: performing K-means++ clustering on the vehicle boxes in the training set, selecting three size classes of vehicle boxes, selecting three differently shaped boxes within each size class, and taking the nine resulting box shapes as prior boxes;
module M1.2: acquiring a single-frame video image in a traffic video;
module M1.3: predicting the coordinates of the candidate vehicles with the target detection algorithm based on lightweight YOLOv3, and framing all candidate vehicles with prior boxes that fit the vehicle sizes.
8. The system for real-time vehicle detection and tracking based on the lightweight YOLOv3 of claim 5, wherein the module M2 utilizes a Kalman filtering algorithm to predict the position of the box at the next moment and update the state of the Kalman filter.
9. A computer-readable storage medium, in which a computer program is stored which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.
CN202110413744.6A 2021-04-16 2021-04-16 Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium Pending CN113077496A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110413744.6A CN113077496A (en) 2021-04-16 2021-04-16 Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium


Publications (1)

Publication Number Publication Date
CN113077496A true CN113077496A (en) 2021-07-06

Family

ID=76618134

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110413744.6A Pending CN113077496A (en) 2021-04-16 2021-04-16 Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium

Country Status (1)

Country Link
CN (1) CN113077496A (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472467A (en) * 2019-04-08 2019-11-19 江西理工大学 The detection method for transport hub critical object based on YOLO v3
CN111126152A (en) * 2019-11-25 2020-05-08 国网信通亿力科技有限责任公司 Video-based multi-target pedestrian detection and tracking method
CN111476826A (en) * 2020-04-10 2020-07-31 电子科技大学 Multi-target vehicle tracking method based on SSD target detection
CN112241969A (en) * 2020-04-28 2021-01-19 北京新能源汽车技术创新中心有限公司 Target detection tracking method and device based on traffic monitoring video and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HE, DANNI: "Research on Multi-Vehicle Detection and Tracking Algorithms Based on Deep Learning", China Master's Theses Full-text Database (Electronic Journal) *
LIU, JUN et al.: "Real-time Vehicle Detection and Tracking Based on an Enhanced Tiny YOLOV3 Algorithm", Transactions of the Chinese Society of Agricultural Engineering *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113129336A (en) * 2021-03-31 2021-07-16 同济大学 End-to-end multi-vehicle tracking method, system and computer readable medium
CN116778224A (en) * 2023-05-09 2023-09-19 广州华南路桥实业有限公司 Vehicle tracking method based on video stream deep learning

Similar Documents

Publication Publication Date Title
US10733755B2 (en) Learning geometric differentials for matching 3D models to objects in a 2D image
CN108805083B (en) Single-stage video behavior detection method
CN110033002B (en) License plate detection method based on multitask cascade convolution neural network
CN108509859B (en) Non-overlapping area pedestrian tracking method based on deep neural network
Dewangan et al. RCNet: road classification convolutional neural networks for intelligent vehicle system
CN112101221B (en) Method for real-time detection and identification of traffic signal lamp
CN108197326B (en) Vehicle retrieval method and device, electronic equipment and storage medium
CN110175649B (en) Rapid multi-scale estimation target tracking method for re-detection
CN109800692B (en) Visual SLAM loop detection method based on pre-training convolutional neural network
CN108171112A (en) Vehicle identification and tracking based on convolutional neural networks
CN111667512B (en) Multi-target vehicle track prediction method based on improved Kalman filtering
CN113378890B (en) Lightweight pedestrian vehicle detection method based on improved YOLO v4
US11741368B2 (en) Image segmentation
CN103984948B (en) A kind of soft double-deck age estimation method based on facial image fusion feature
CN112052802B (en) Machine vision-based front vehicle behavior recognition method
CN111640136B (en) Depth target tracking method in complex environment
CN113077496A (en) Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium
CN113762209A (en) Multi-scale parallel feature fusion road sign detection method based on YOLO
CN112738470B (en) Method for detecting parking in highway tunnel
CN111626120B (en) Target detection method based on improved YOLO-6D algorithm in industrial environment
CN112990065A (en) Optimized YOLOv5 model-based vehicle classification detection method
Mayr et al. Self-supervised learning of the drivable area for autonomous vehicles
CN113963333B (en) Traffic sign board detection method based on improved YOLOF model
CN113792631B (en) Aircraft detection and tracking method based on multi-scale self-adaption and side-domain attention
CN116630932A (en) Road shielding target detection method based on improved YOLOV5

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210706)