CN113076898A - Traffic vehicle target detection method, device, equipment and readable storage medium - Google Patents


Info

Publication number
CN113076898A
CN113076898A (application CN202110387355.0A)
Authority
CN
China
Prior art keywords
traffic
detection
clustering
layer
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110387355.0A
Other languages
Chinese (zh)
Other versions
CN113076898B (en
Inventor
陈婷
姚大春
高涛
王松涛
刘占文
李永会
陈友静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changan University
Original Assignee
Changan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changan University filed Critical Changan University
Priority to CN202110387355.0A priority Critical patent/CN113076898B/en
Publication of CN113076898A publication Critical patent/CN113076898A/en
Application granted granted Critical
Publication of CN113076898B publication Critical patent/CN113076898B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V 20/52 — Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06F 18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/23211 — Non-hierarchical clustering using statistics or function optimisation, with adaptive number of clusters
    • G06N 3/045 — Combinations of neural networks
    • G06V 10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 2201/08 — Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a traffic vehicle target detection method, device, equipment and readable storage medium. Vehicles are labeled in preprocessed traffic images; based on the feature-extraction backbone network, a network layer is added to the shallow part to extract shallow features, improving the processing of detail features. While keeping high detection precision, CSPDarknet26 is adopted for multi-layer residual-network feature extraction, which avoids redundant convolutional layers, improves memory utilization, and speeds up training. Multi-scale detection is then performed on the features extracted by the multi-layer residual network, so rich detail information is captured, the composite residual block (CRB) avoids degradation of small-target detection, and the applicability and accuracy of single-class vehicle detection in heavy-rain-density traffic environments are improved.

Description

Traffic vehicle target detection method, device, equipment and readable storage medium
Technical Field
The invention belongs to the technical field of traffic vehicle detection, and particularly relates to a traffic vehicle target detection method, a traffic vehicle target detection device, traffic vehicle target detection equipment and a readable storage medium.
Background
In recent years the automobile industry has developed rapidly, and the automobile has become an indispensable means of transportation. As the number of motor vehicles grows worldwide, traffic congestion and traffic accidents become ever more serious, causing environmental pollution for countries and losses for families. Vehicle detection technology based on computer vision lets the relevant traffic departments grasp traffic flow in real time, so that traffic-guidance policies can be formulated to alleviate these problems.
Vehicle detection algorithms are mainly divided into traditional detection algorithms and deep-learning-based detection algorithms. A traditional vehicle detection algorithm first selects a region of interest and extracts candidate regions through a sliding window, then manually extracts features from the candidate regions, and finally classifies them with a classifier. For example, some scholars adopt the classical Haar features and detect through a sliding-window search strategy, effectively reducing the false-alarm rate; to obtain fine-grained features, others extract features with the Histogram of Oriented Gradients (HOG). Some researchers have further combined HOG with a Support Vector Machine (SVM) to propose the Deformable Part Model (DPM). However, these image features are essentially hand-crafted, so such target detection suffers from low efficiency, low precision, poor generalization and complex pipelines.
With the rapid development of deep learning in the field of target detection, researchers began replacing the tedious, low-precision traditional techniques with deep learning. According to the relative importance of real-time performance and accuracy, deep-learning target detection algorithms fall into two categories. One is the two-stage target detector represented by R-CNN (Region-based Convolutional Neural Networks), also called the region-based detection method, mainly including R-CNN, Fast R-CNN, R-FCN (Region-based Fully Convolutional Network) and the like. Although the two-stage method has high detection precision, it is too time-consuming to achieve real-time detection. To balance detection speed and precision, the single-stage (one-stage) detection model was proposed. The single-stage detection method, also called the regression-based method, obtains prediction results directly from the image without a region-proposal stage, realizing end-to-end target detection; it mainly includes YOLO (You Only Look Once), SSD (Single Shot Detector), YOLOv2, YOLOv3, YOLOv4 and other methods.
The YOLOv4 target detection algorithm best balances precision and speed. However, for single-class target detection the original network has redundant convolutional layers, so memory utilization is low, and when detecting a small target whose color is similar to the background, the detection effect is poor.
Disclosure of Invention
The invention aims to provide a method, a device and equipment for detecting a traffic vehicle target and a readable storage medium, so as to overcome the defects of the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for detecting a traffic vehicle target comprises the following steps:
s1, preprocessing the traffic image to obtain traffic images of different scenes;
s2, carrying out vehicle labeling on the preprocessed traffic images, carrying out dimension clustering on the labeled traffic images, setting a threshold value for clustering Anchor Box, and selecting a relatively far point as a next initial clustering center;
s3, adding a network layer to a shallow part of the main network for shallow feature extraction based on CSPDarknet26 feature extraction, and extracting features of a deep part of the main network by adopting a multilayer residual error network;
and S4, performing multi-scale detection after performing feature extraction through a multi-layer residual error network, and realizing detection of different traffic vehicle targets in the traffic image.
Further, the traffic data set pictures come from web crawlers and field shooting, and the traffic vehicle data set is divided into a training set, a test set and a validation set; image inversion and symmetry processing are applied in turn to the traffic vehicle data set to expand it.
Further, the IOU is used for target cluster analysis, with 1 − IOU serving as the spatial distance calculation; the clustering distance is:

D(x_j, c_i) = 1 − IOU(x_j, c_i)  (1)

where x_j ∈ X = {x_1, x_2, …, x_n} denotes the ground-truth samples, c_i ∈ {c_1, c_2, …, c_k} denotes the cluster centers, and k is the number of anchor boxes. The clustering objective minimizes the sum of the distances from each sample to its cluster center:

J = min Σ_{i=1}^{k} Σ_{j=1}^{n} [1 − IOU(x_j, c_i)]  (2)
further, the clustering objective is analyzed with the silhouette coefficient method to select the optimal cluster number K; K = 9, and the heights and widths are (13, 11), (23, 17), (31, 28), (41, 20), (51, 33), (63, 51), (102, 61), (166, 116) and (388, 244), respectively.
Furthermore, the first convolutional layer filters the 416 × 416 input image with 32 convolution kernels of size 3 × 3; the output of each convolutional layer is then taken as the input of the next, and a convolution with 64 kernels of size 3 × 3 pixels and stride 2 pixels realizes 2× downsampling, giving a 208 × 208 feature map. Five groups of 2 × Resblock_body are then executed in the network, and after four further downsampling steps, feature maps of sizes 104 × 104, 52 × 52, 26 × 26 and 13 × 13 are obtained.
Furthermore, the size of the input image is adjusted to 416 × 416, and the input image is subjected to 5 times of downsampling to obtain feature maps of 52 × 52, 26 × 26 and 13 × 13 in three different scales for multi-scale detection.
Further, all cars are labeled "Car" using the labelImg annotation software.
A traffic vehicle target detection device, comprising:
a preprocessing module, used to preprocess traffic images to obtain traffic images of different scenes and to label vehicles in the preprocessed traffic images;
a clustering module, used to perform dimension clustering of anchor boxes on the vehicle-labeled traffic images with a set threshold, selecting a relatively distant point as the next initial clustering center;
a feature extraction and detection module, used to add a network layer to the shallow part for shallow feature extraction, extract features of the deep network with a multi-layer residual network, and perform multi-scale detection after the multi-layer residual-network feature extraction, thereby realizing detection of different traffic vehicle targets in the traffic image.
A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of a method of detecting a traffic vehicle object as described above when executing the computer program.
A computer-readable storage medium, in which a computer program is stored which, when executed by a processor, carries out the steps of the traffic vehicle target detection method described above.
Compared with the prior art, the invention has the following beneficial technical effects:
the invention relates to a traffic vehicle target detection method, which comprises the steps of preprocessing traffic vehicle images to obtain vehicle images of different scenes; then, carrying out vehicle labeling on the preprocessed traffic vehicle image, carrying out dimension clustering on the traffic vehicle image subjected to vehicle labeling, setting a threshold value to carry out clustering Anchor Box, and selecting a relatively far point as a next initial clustering center to improve the clustering precision; based on CSPDarknet26 feature extraction, a trunk network is added with a network layer at a shallow part for shallow feature extraction, so that the processing capacity of detail features is improved, on the premise of keeping higher detection precision, CSPDarknet26 is adopted for carrying out multi-layer residual error network feature extraction, the redundancy of a convolutional layer can be avoided, the utilization rate of a memory is improved, the training speed is improved, then multi-scale detection is carried out after the multi-layer residual error network is used for carrying out feature extraction, rich detail information is extracted, the problem of small target detection effect degradation is avoided, and the applicability and the accuracy of vehicle single-class target detection of an algorithm in a strong rain density traffic environment are improved.
Furthermore, using 1 − IOU as the spatial distance meets the target-detection requirement when vehicles are sparsely distributed.
Further, after processing by the composite residual block, features are sent to the large-scale detection convolutional layer to detect large targets; multi-scale detection balances well the detection of vehicles at different scales.
The traffic vehicle target detection device can realize vehicle detection quickly and provide better service for the traffic industry.
Drawings
Fig. 1 is a schematic diagram of a network detection process in an embodiment of the present invention.
Fig. 2 is a graph showing a variation in loss value in the embodiment of the present invention.
FIG. 3 is a comparison graph of predictions using a prior art method and a method of the present invention in an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings:
a method for detecting a traffic vehicle target comprises the following steps:
s1, preprocessing the traffic image to obtain traffic images of different scenes, and expanding a traffic image data set;
the traffic vehicle data set picture is from web crawlers and field shooting, and the traffic vehicle data set is divided into a training set, a testing set and a verification set;
specifically, the traffic vehicle data set is subjected to image inversion, symmetry, data enhancement, cutting and angle transformation in sequence to obtain traffic vehicle images in different scenes, so that the data set is expanded, and the robustness of subsequent network training is improved.
S2, labeling vehicles in the preprocessed traffic images, performing dimension clustering of anchor boxes on the labeled traffic images with a set threshold, and selecting a relatively distant point, i.e. a point far from the previous cluster centers and not in the previous dimension clusters, as the next initial clustering center;
the invention adopts IOU as target clustering analysis, and uses IOU as space Distance (Spatial Distance Calculation) Calculation, thereby reducing the error generated by initial anchor frames with different sizes, and the clustering formula is as follows:
Di(xj)=1-IOU(xj,ci) (1)
xj∈X={x1,x2,...,cndenotes the group Truth sample; c. Ci∈{c1,c2,...,cnRepresents the cluster center; k represents the number of anchor point frames, the clustering objective function represents the minimum value of the sum of the distances from each sample to the clustering center, and the calculation formula is as follows:
Figure BDA0003014533500000061
the Method is adopted to carry out dimension clustering on the Anchor Box, a Contour Coefficient Method (Contour Coefficient Method) is used for analyzing a clustering target to select the optimal clustering number K, and when the K is less than a true value of 3, J (K) is greatly reduced; when K reaches a true value of 3, J (K) is stopped quickly, and the clustering effect is reduced; as K increases, J (K) becomes more and more stable. Thus, the initial optimal anchor box cluster number for the vehicle data set is 9, and the height and width are (13, 11), (23, 17), (31, 28), (41, 20), (51, 33), (63, 51), (102, 61), (166, 116), (388, 244).
Specifically, all cars are labeled "Car" using the labelImg annotation software.
S3, adding a network layer to the shallow part based on a feature extraction backbone network (CSPDarknet26) to perform shallow feature extraction on the traffic images after dimension clustering, and performing feature extraction on the deep part by adopting a multilayer residual error network;
specifically, as shown in table 1; compressing the feature extraction depth, and adding a network layer in the shallow layer part, wherein the shallow layer convolution feature has small receptive field and small background noise and is suitable for extracting features with small resolution; secondly, on the premise of keeping higher detection precision, the deep network is properly cut down; the idea of a multi-layer residual network; specifically, the first convolutional layer filters an input image with 416 × 416 resolution by using 32 convolutional kernels with the size of 3 × 3, then performs convolution operation by using 64 convolutional kernels with the size of 3 × 3 pixels and the step size of 2 pixels by using the output of the previous convolutional layer as input, so as to realize down-sampling by 2 times and obtain a feature map with the resolution of 208 × 208; then, 5 sets of 2 × Resblock _ bodies are added and executed in the network, and after 4 times of downsampling, feature maps with the sizes of 104 × 104, 52 × 52, 26 × 26 and 13 × 13 are obtained respectively.
TABLE 1 CSPDarknet26 network architecture
[Table 1 content provided as an image in the original publication.]
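The downsampling chain described above (416 → 208 → … → 13) can be verified with the standard convolution output-size formula; the sketch below assumes 3 × 3 stride-2 convolutions with padding 1, which reproduces the feature-map sizes reported for the backbone.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard convolution output size: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

size = 416                      # input resolution
trace = [size]
for _ in range(5):              # five stride-2 3x3 downsampling convolutions
    size = conv_out(size, kernel=3, stride=2, padding=1)
    trace.append(size)
print(trace)  # [416, 208, 104, 52, 26, 13]
```

The last three entries are exactly the three detection scales used below.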
And S4, performing multi-scale detection after performing feature extraction through a multi-layer residual error network, and realizing detection of the vehicle target in different traffic scenes in the traffic image.
Specifically, the input image is resized to 416 × 416 and downsampled 5 times, giving feature maps at three scales, 52 × 52, 26 × 26 and 13 × 13, for multi-scale detection. As shown in fig. 1, after the 52 × 52 feature map of the 92-layer network is processed, smaller prior boxes are allocated to detect small targets. Meanwhile, the 52 × 52 feature map at layer 97 is downsampled and fused with the feature map at layer 87, processed by the Composite Residual Block (CRB), and sent to the medium-scale detection convolutional layer, where medium-sized anchor boxes are allocated to detect medium targets. Finally, the 26 × 26 feature map at layer 108 is downsampled by a 3 × 3 convolution, fused by a Concat operation with the 13 × 13 feature map at layer 77, processed by the CRB, and sent to the large-scale detection convolutional layer to detect large targets. Multi-scale detection balances well the accuracy and speed of vehicle detection at different scales.
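The split of the nine clustered anchors across the three detection scales can be illustrated as follows. `assign_scale` is a hypothetical helper showing, under the shared-corner IOU convention, which feature-map scale would handle a given ground-truth box size.

```python
# The nine clustered (height, width) anchors from the dimension-clustering step.
ANCHORS = [(13, 11), (23, 17), (31, 28), (41, 20), (51, 33),
           (63, 51), (102, 61), (166, 116), (388, 244)]
# Smallest anchors go to the 52x52 map, largest to the 13x13 map.
SCALE_OF_ANCHOR = [52] * 3 + [26] * 3 + [13] * 3

def shared_corner_iou(wa, ha, wb, hb):
    """IOU of two boxes assumed to share a top-left corner."""
    inter = min(wa, wb) * min(ha, hb)
    return inter / (wa * ha + wb * hb - inter)

def assign_scale(w, h):
    """Feature-map size (52 / 26 / 13) whose anchor best matches a box."""
    best = max(range(len(ANCHORS)),
               key=lambda i: shared_corner_iou(w, h, *ANCHORS[i]))
    return SCALE_OF_ANCHOR[best]
```

A small vehicle of roughly 15 × 12 pixels would thus be handled by the 52 × 52 branch, while a near-field vehicle of 400 × 250 pixels falls to the 13 × 13 branch.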
The method is implemented with PyTorch in a Linux environment; the operating system is Ubuntu 16.04, the CPU is an Intel Xeon E3-1225 v6, and the GPU is an Nvidia Quadro P4000 with 8 GB of video memory. The batch size is set to 64 and, considering the machine's configuration, the subdivision count is set to 32; the maximum number of iterations is 50,200, the momentum parameter is 0.949, the initial learning rate is 0.001 and the decay coefficient is 0.0005. The step mode is selected to update the learning rate: when the iteration count reaches 40,160 and 45,180, the learning rate is each time reduced to 10% of its current value.
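Assuming the step policy decays multiplicatively (each milestone scales the current learning rate by 0.1, as in the darknet `steps`/`scales` convention; the original wording is ambiguous on this point), the schedule can be sketched as:

```python
def learning_rate(step: int, base_lr: float = 0.001,
                  milestones=(40160, 45180), gamma: float = 0.1) -> float:
    """Step-decay schedule from the reported settings: the learning rate
    is scaled by `gamma` each time a milestone iteration is passed."""
    lr = base_lr
    for m in milestones:
        if step >= m:
            lr *= gamma
    return lr
```

So training runs at 0.001 until iteration 40,160, at 0.0001 until 45,180, and at 0.00001 until the final iteration 50,200.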
In one embodiment of the present invention, a terminal device is provided that includes a processor and a memory, the memory storing a computer program comprising program instructions that the processor executes. The processor may be a Central Processing Unit (CPU) or another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.; it is the computing and control core of the terminal, adapted to load and execute one or more instructions to implement the corresponding method flow or function. The processor of this embodiment can be used to run the traffic vehicle target detection method.
Example (b): a traffic vehicle target detection device that can be used to realize the traffic vehicle target detection method of the above embodiment, specifically comprising a preprocessing module, a clustering module and a feature extraction and detection module;
the preprocessing module is used to preprocess the traffic images to obtain clear traffic images and to label vehicles in the preprocessed traffic images;
the clustering module is used to perform dimension clustering on the vehicle-labeled traffic images with a set threshold, selecting a relatively distant point as the next initial clustering center;
the feature extraction and detection module is used to add a network layer to the shallow part for shallow feature extraction, extract features of the deep network with a multi-layer residual network, and perform multi-scale detection after the multi-layer residual-network feature extraction, thereby realizing detection of different traffic vehicle targets in the traffic image.
In still another embodiment of the present invention, the present invention further provides a storage medium, specifically a computer-readable storage medium (Memory), which is a Memory device in the terminal device and is used for storing programs and data. The computer-readable storage medium includes a built-in storage medium in the terminal device, provides a storage space, stores an operating system of the terminal, and may also include an extended storage medium supported by the terminal device. Also, one or more instructions, which may be one or more computer programs (including program code), are stored in the memory space and are adapted to be loaded and executed by the processor. It should be noted that the computer-readable storage medium may be a high-speed RAM memory, or may be a Non-volatile memory (Non-volatile memory), such as at least one disk memory. One or more instructions stored in the computer-readable storage medium may be loaded and executed by a processor to implement the corresponding steps of the method for detecting a target of a transportation vehicle in the above-described embodiments.
Since the research content involves multi-target detection, in addition to precision (Precision), recall (Recall), the F1 value (F1-measure) and detection speed as evaluation criteria, the Average Precision (AP) is also used for overall comparison:
Precision = T_P / (T_P + F_P)  (3)

Recall = T_P / (T_P + F_N)  (4)

F1 = 2 × Precision × Recall / (Precision + Recall)  (5)

AP = (1/11) Σ_{r ∈ {0, 0.1, …, 1}} max_{r̃ ≥ r} Precision(r̃)  (6)
wherein: t isPIndicating that the Positive class is correctly predicted as the Positive class (True Positive), FPIndicating that the negative class error is predicted as a Positive example (False Positive), FNIndicating that the positive class was mispredicted as a Negative class (False Negative). AP is the average accuracy, using the standards in VOC2007, setting a set of thresholds, [0, 0.1, 0.2]Then, for each threshold value for which Recall is greater than, the corresponding maximum Precision is obtained, and the AP is the average value of these maximum precisions.
The method is trained with the optimal parameters set above and compared with the processing effect of the existing YOLOv4 algorithm. The training-loss curve of fig. 2 shows that the method of the invention significantly improves training speed: within the first 500 steps the training loss falls almost linearly from about 1900 to about 10, and after 50,200 steps the loss drops to 1.69. The method therefore avoids redundant convolutional layers and wasted memory while achieving a good detection effect, and clearly accelerates training. In addition, the method was compared on the test set with YOLOv3, YOLOv4 and YOLOv4-C (YOLOv4 with CSPDarknet26). As is apparent from fig. 3, YOLOv3 suffers serious missed and false detections; YOLOv4 has low detection precision and also shows false and missed detections; YOLOv4-C detects more targets than YOLOv3 but still misses detections and has lower accuracy and precision. Compared with these algorithms, the method of the invention clearly improves accuracy and precision, detects vehicle targets with high precision, and avoids degradation of small-target detection.
TABLE 2 comparison of test results for different algorithms
[Table 2 content provided as an image in the original publication.]
Table 2 quantitatively compares the method of the invention with YOLOv3, YOLOv4 and YOLOv4-C. Images were scaled to 416 × 416 before testing. The method is verified against the other algorithms on the vehicle data set, with five measured quantities selected for comparison: the precision P, the recall R, the F1 value, the AP value and the speed. Compared with YOLOv3 and YOLOv4, the speed increases to 13.46 f/s and 15.63 f/s respectively, so the improved algorithm clearly improves training speed and memory utilization; compared with YOLOv4-C, the precision improves by about 10%, solving the degradation of small-target detection. Meanwhile, as Table 2 shows, the AP of the method reaches 93.61%, the precision 0.96, the recall 0.91 and the average accuracy 0.857. All indexes of the method are superior to those of the other algorithms, fully demonstrating its high detection precision and detection speed.

Claims (10)

1. A method for detecting a traffic vehicle target is characterized by comprising the following steps:
s1, preprocessing the traffic image to obtain traffic images of different scenes;
s2, marking vehicles on the preprocessed traffic images in different scenes, performing dimension clustering on the marked traffic images, setting a threshold value to perform clustering Anchor Box, and selecting a relatively far point as a next initial clustering center;
s3, based on the feature extraction backbone network, adding a network layer to the shallow part to perform shallow feature extraction on the traffic images after dimension clustering, and performing feature extraction on the deep part by adopting a multilayer residual error network;
and S4, performing multi-scale detection after performing feature extraction through a multi-layer residual error network, and realizing detection of vehicle targets of different traffic scenes in the traffic image.
2. The method of claim 1, wherein the traffic images are obtained by web crawlers and field shooting, and the traffic image set is divided into a training set, a test set and a validation set; image inversion and symmetry processing are applied in turn to the traffic vehicle data set to expand it.
3. The method of claim 1, wherein the IOU is used for target cluster analysis, with 1 − IOU as the spatial distance calculation; the clustering distance is:

D(x_j, c_i) = 1 − IOU(x_j, c_i)  (1)

where x_j ∈ X = {x_1, x_2, …, x_n} denotes the Ground Truth samples, c_i ∈ {c_1, c_2, …, c_k} denotes the cluster centers, and k is the number of anchor boxes; the clustering objective minimizes the sum of the distances from each sample to its cluster center:

J = min Σ_{i=1}^{k} Σ_{j=1}^{n} [1 − IOU(x_j, c_i)]  (2)
4. The method of claim 3, wherein the clustering objects are analyzed by the silhouette coefficient method to select the optimal number of clusters K; K is 9, and the heights and widths are (13, 11), (23, 17), (31, 28), (41, 20), (51, 33), (63, 51), (102, 61), (166, 116) and (388, 244), respectively.
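The clustering of claims 3 and 4 can be sketched in NumPy: k-means over box widths and heights under the distance D = 1 − IOU of formula (1), seeded by picking a relatively distant (low-IOU) box as each next initial center, as in step S2. This is a minimal sketch under those assumptions, not the patent's actual implementation; the function names are illustrative.

```python
import numpy as np

def iou_wh(boxes, centers):
    """IOU between (n, 2) boxes and (m, 2) centers using widths/heights only,
    treating all boxes as co-centred, as is usual for anchor clustering."""
    inter = (np.minimum(boxes[:, None, 0], centers[None, :, 0])
             * np.minimum(boxes[:, None, 1], centers[None, :, 1]))
    union = ((boxes[:, 0] * boxes[:, 1])[:, None]
             + (centers[:, 0] * centers[:, 1])[None, :] - inter)
    return inter / union

def seed_centers(boxes, k, rng):
    # Farthest-point seeding: the next initial center is the box
    # most distant (in 1 - IOU) from the centers chosen so far.
    centers = [boxes[rng.integers(len(boxes))]]
    while len(centers) < k:
        d = (1 - iou_wh(boxes, np.array(centers))).min(axis=1)
        centers.append(boxes[np.argmax(d)])
    return np.array(centers)

def kmeans_iou(boxes, k=9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = seed_centers(boxes, k, rng)
    for _ in range(iters):
        # Assign each sample to its nearest center under D = 1 - IOU (formula (1)).
        assign = np.argmin(1 - iou_wh(boxes, centers), axis=1)
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else centers[i] for i in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers
```

Run on the labeled boxes' (width, height) pairs, the k = 9 returned centers play the role of the anchor sizes listed in claim 4.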
5. The method of claim 1, wherein the first convolutional layer filters the input image of 416 × 416 resolution with 32 convolution kernels of size 3 × 3; the output of each convolutional layer is then taken as the input of the next, and a convolution with 64 kernels of size 3 × 3 pixels and a stride of 2 pixels performs 2× downsampling, yielding a feature map of 208 × 208 resolution; 5 groups of 2 × Resblock_body are then executed in the network, and after 4 further downsampling steps, feature maps of sizes 104 × 104, 52 × 52, 26 × 26 and 13 × 13 are obtained respectively.
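The resolution arithmetic in claim 5 can be checked directly: every stride-2 convolution halves the spatial size, so the first downsampling takes 416 to 208 and the four further steps produce 104, 52, 26 and 13. A small illustrative helper (the function name is hypothetical):

```python
def feature_map_sizes(input_size=416, downsamples=5, stride=2):
    """Spatial sizes after successive stride-2 downsampling convolutions."""
    sizes = [input_size]
    for _ in range(downsamples):
        sizes.append(sizes[-1] // stride)
    return sizes

print(feature_map_sizes())  # [416, 208, 104, 52, 26, 13]
```

The last three sizes, 52 × 52, 26 × 26 and 13 × 13, are exactly the scales used for multi-scale detection in claim 6.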
6. The method of claim 1, wherein the input image is resized to 416 × 416 and downsampled 5 times to obtain feature maps of different scales, namely 52 × 52, 26 × 26 and 13 × 13, for multi-scale detection.
7. The method of claim 1, wherein all cars are labeled Car using Labeling software.
8. A traffic vehicle object detecting device, comprising:
the preprocessing module is used for preprocessing the traffic images to obtain traffic images of different scenes and marking vehicles on the preprocessed traffic images;
the clustering module is used for carrying out dimension clustering on the traffic images marked by the vehicles, setting a threshold value for clustering Anchor Box, and selecting a relatively far point as a next initial clustering center;
and the feature extraction and detection module is used for adding a network layer to the shallow part for shallow feature extraction, adopting a multilayer residual network in the deep part for feature extraction, and performing multi-scale detection after feature extraction through the multilayer residual network, thereby detecting different traffic vehicle targets in the traffic image.
9. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202110387355.0A 2021-04-09 2021-04-09 Traffic vehicle target detection method, device, equipment and readable storage medium Active CN113076898B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110387355.0A CN113076898B (en) 2021-04-09 2021-04-09 Traffic vehicle target detection method, device, equipment and readable storage medium


Publications (2)

Publication Number Publication Date
CN113076898A true CN113076898A (en) 2021-07-06
CN113076898B CN113076898B (en) 2023-09-15

Family

ID=76617243

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110387355.0A Active CN113076898B (en) 2021-04-09 2021-04-09 Traffic vehicle target detection method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN113076898B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110766098A (en) * 2019-11-07 2020-02-07 中国石油大学(华东) Traffic scene small target detection method based on improved YOLOv3
WO2020140371A1 (en) * 2019-01-04 2020-07-09 平安科技(深圳)有限公司 Deep learning-based vehicle damage identification method and related device
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
CN111695514A (en) * 2020-06-12 2020-09-22 长安大学 Vehicle detection method in foggy days based on deep learning


Non-Patent Citations (2)

Title
胡臣辰; 陈贤富: "Vehicle detection method based on a YOLO-improved residual network structure", Information Technology and Network Security, no. 09
许小伟; 陈乾坤; 钱枫; 李浩东; 唐志鹏: "Real-time vehicle detection and tracking algorithm based on miniaturized YOLOv3", Journal of Highway and Transportation Research and Development, no. 08


Similar Documents

Publication Publication Date Title
CN109816024B (en) Real-time vehicle logo detection method based on multi-scale feature fusion and DCNN
CN111461083A (en) Rapid vehicle detection method based on deep learning
Cepni et al. Vehicle detection using different deep learning algorithms from image sequence
CN111695514A (en) Vehicle detection method in foggy days based on deep learning
CN111079604A (en) Method for quickly detecting tiny target facing large-scale remote sensing image
CN108960074B (en) Small-size pedestrian target detection method based on deep learning
CN104537359A (en) Vehicle object detection method and device
CN117496384B (en) Unmanned aerial vehicle image object detection method
CN111915583A (en) Vehicle and pedestrian detection method based on vehicle-mounted thermal infrared imager in complex scene
CN114049572A (en) Detection method for identifying small target
CN116824543A (en) Automatic driving target detection method based on OD-YOLO
Gao et al. Traffic signal image detection technology based on YOLO
Li et al. Method research on ship detection in remote sensing image based on Yolo algorithm
CN115937736A (en) Small target detection method based on attention and context awareness
Uzar et al. Performance analysis of YOLO versions for automatic vehicle detection from UAV images
CN117593623A (en) Lightweight vehicle detection method based on improved YOLOv8n model
CN114943903B (en) Self-adaptive clustering target detection method for aerial image of unmanned aerial vehicle
CN113076898B (en) Traffic vehicle target detection method, device, equipment and readable storage medium
Wang et al. YOLO-ERF: lightweight object detector for UAV aerial images
CN113947723B (en) High-resolution remote sensing scene target detection method based on size balance FCOS
CN115761667A (en) Unmanned vehicle carried camera target detection method based on improved FCOS algorithm
Zhou et al. Research on Vehicle Tracking Algorithm Based on Deep Learning
Shao et al. Research on yolov5 vehicle object detection algorithm based on attention mechanism
Liao Road Damage Intelligent Detection with Deep Learning Techniques
Zhang et al. Research on traffic target detection method based on improved yolov3

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant