CN112052826A - Intelligent law enforcement multi-scale target detection method, device and system based on YOLOv4 algorithm and storage medium

Intelligent law enforcement multi-scale target detection method, device and system based on YOLOv4 algorithm and storage medium

Info

Publication number
CN112052826A
CN112052826A
Authority
CN
China
Prior art keywords
data
scale
model
yolov4
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010989852.3A
Other languages
Chinese (zh)
Inventor
练镜锋
孙少峰
赵文超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Hantele Communication Co ltd
Original Assignee
Guangzhou Hantele Communication Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Hantele Communication Co ltd filed Critical Guangzhou Hantele Communication Co ltd
Priority to CN202010989852.3A priority Critical patent/CN112052826A/en
Publication of CN112052826A publication Critical patent/CN112052826A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 - Target detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 - Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an intelligent law enforcement multi-scale target detection method, detection device, detection system and storage medium based on the YOLOv4 algorithm, which detect multi-scale target objects for intelligent law enforcement and raise alarms. The method comprises a data collection step, a data integration step, a data annotation step, a data division step, a multi-scale feature map allocation step, a YOLOv4 model training step, a YOLOv4 model verification step and a target detection step. The method provided by the invention is fast and effective, can process multi-scale and large-scale data, supports multiple languages and user-defined loss functions, and is therefore a strong alternative for monitoring multi-scale targets in intelligent law enforcement.

Description

Intelligent law enforcement multi-scale target detection method, device and system based on YOLOv4 algorithm and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a multi-scale object target detection method, detection device, detection system and storage medium.
Background
With economic development, the number of mobile, non-fixed illegal vendor vehicles has grown rapidly. To manage these unlicensed, mobile and non-fixed vendor vehicles, video monitoring systems need to be built at key points within a jurisdiction for real-time image monitoring. At the same time, urban management law enforcement vehicles are equipped with mobile video monitoring systems, so that a command center can track the positions of law enforcement vehicles through GPS (Global Positioning System), monitor the states inside and outside the vehicles in real time through video, and thereby achieve mobile monitoring, monitoring of non-permanent locations, and management of law enforcement teams. Intelligent early warning for the fortified area means placing key defenses on the intelligent law enforcement management area and issuing intelligent early warnings for various abnormal conditions through the video monitoring system, guaranteeing the safety and stability of the law enforcement management process.
On one hand, existing multi-scale object target detection methods are mainly traditional image processing methods and deep learning methods. Target detection based on traditional image processing mainly uses techniques such as HOG, HOG+SVM, SURF and SIFT. With these methods, in a long-focal-length monitoring scene the near-end and far-end targets differ greatly in size, so targets of multiple scales coexist in the same scene; when candidate target regions are selected with a sliding window, the window size and aspect ratio cannot be set effectively, so the exhaustive sliding-window search is time-consuming and highly redundant. Target detection methods based on deep learning mainly include R-CNN, Fast R-CNN, Faster R-CNN and the like. Most of them regress from fixed anchors, and fixed anchors cannot accommodate large differences in target size across scales, so the detection network may fail to converge or may train poorly, easily causing missed and false detections.
On the other hand, in the multi-scale target detection scenarios of intelligent law enforcement monitoring and management, recognition still relies largely on human operators. Because the intelligent law enforcement monitoring environment involves urban road traffic conditions, manual judgment requires long periods of observation, which makes monitoring a fortified area labor-intensive, and visual fatigue during manual work easily leads to erroneous and missed judgments.
Disclosure of Invention
In order to overcome the defects of the prior art, one objective of the present invention is to provide an intelligent law enforcement multi-scale target detection method based on the YOLOv4 algorithm, which is fast and effective, can process multi-scale and large-scale data, supports multiple languages and custom loss functions, and is a strong alternative for monitoring multi-scale targets in intelligent law enforcement.
The second objective of the present invention is to provide an intelligent law enforcement multi-scale target detection device based on the YOLOv4 algorithm.
The invention further aims to provide an intelligent law enforcement multi-scale target detection system based on the YOLOv4 algorithm.
It is a further object of the present invention to provide a computer readable storage medium.
One of the purposes of the invention is realized by adopting the following technical scheme:
An intelligent law enforcement multi-scale target detection method based on the YOLOv4 algorithm comprises the following steps:
a data collection step: collecting video image data of different time points and different angles of a multi-scale object target scene;
a data integration step: integrating the collected video image data;
data labeling: marking the integrated video image data and forming source data;
a data dividing step: dividing the source data into a training data set and a verification data set according to a preset proportion;
a multi-scale feature map allocation step: for the picture data set, prior box sizes are obtained with a K-means clustering algorithm, whose flow is as follows: (1) randomly select 9 prior-box center points from the data set as centroids; (2) calculate the Euclidean distance between each prior-box center point and each centroid, and assign each point to the set of its nearest centroid; (3) after grouping there are 3 sets (large, medium and small), and the centroid of each set is recalculated; (4) set thresholds of different sizes for the large, medium and small resolutions; if the distance between the new centroids and the original centroids is smaller than the set threshold, the algorithm terminates, otherwise steps 2-4 are iterated; finally, prior boxes of 9 sizes are clustered according to the different scales;
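As an illustration of this clustering flow, the following Python sketch clusters annotated box sizes into 9 prior boxes using plain K-means with the Euclidean distance described above. The box list and function name are hypothetical, and production YOLO implementations often replace the Euclidean distance with an IoU-based one; this is a sketch of the described flow, not the patent's actual code.

```python
import random

def kmeans_priors(boxes, k=9, tol=1e-4, max_iter=100):
    """Cluster (w, h) box sizes into k prior boxes with Euclidean K-means.

    boxes: list of (width, height) tuples taken from the annotations.
    Returns the k (w, h) centroids sorted by area (small -> large).
    """
    centroids = random.sample(boxes, k)              # step 1: random initial centroids
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:                           # step 2: assign to nearest centroid
            dists = [(w - cw) ** 2 + (h - ch) ** 2 for cw, ch in centroids]
            clusters[dists.index(min(dists))].append((w, h))
        new_centroids = []
        for ci, cluster in enumerate(clusters):      # step 3: recompute each centroid
            if not cluster:                          # keep an empty cluster's centroid
                new_centroids.append(centroids[ci])
                continue
            new_centroids.append((sum(w for w, _ in cluster) / len(cluster),
                                  sum(h for _, h in cluster) / len(cluster)))
        shift = max(abs(a - c) + abs(b - d)
                    for (a, b), (c, d) in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:                              # step 4: stop once centroids settle
            break
    return sorted(centroids, key=lambda wh: wh[0] * wh[1])
```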
YOLOv4 model training step: YOLOv4 is used for training and learning on the training data set, operating as follows: (1) the extracted multi-size feature maps are input into the CSPDarknet53 backbone network, where CSPDarknet53 is the Darknet53 backbone of YOLOv3 with a CSPNet (Cross Stage Partial network) added; by integrating the gradient changes into the feature maps from end to end, it reduces the amount of computation while preserving accuracy, yielding a feature pyramid at multiple scales; (2) the multi-scale features are input into the SPPNet (Spatial Pyramid Pooling network), whose purpose is to enlarge the receptive field of the network (i.e., the recognition area of the target); feature dimensionality reduction is achieved by alternately connecting convolution and pooling layers, yielding dimension-reduced features; (3) information fusion between shallow and deep features is accelerated through the PANet path aggregation network, yielding fusion features at different scales; (4) the training result is finally output through the fully connected layer and comprises bounding-box regression coordinates, the target classification result and confidence scores; (5) the loss function value is calculated from the corresponding results; the loss function comprises three parts: bounding-box regression loss, classification loss and confidence loss. The parameters are set to a maximum of 50000 iterations, an initial learning rate of 0.01, a batch size of 32, a weight decay of 0.0005 and a momentum of 0.9; the learning rate and batch size are adjusted appropriately according to the downward trend of the loss value, training stops when the loss function value output on the training data set is less than or equal to the threshold or the set maximum number of iterations is reached, and the trained network model is obtained and recorded as the prediction model;
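For concreteness, the hyperparameters named above and the stopping rule can be written down as follows; this is a minimal sketch, and the loss_threshold value and the should_stop() helper are assumptions rather than the patent's actual code.

```python
# Hyperparameters stated in the training step; the wrapper below is hypothetical.
train_config = {
    "max_iterations": 50000,   # stop at 50000 iterations at the latest
    "learning_rate": 0.01,     # initial learning rate, tuned as the loss falls
    "batch_size": 32,          # tuned together with the learning rate
    "weight_decay": 0.0005,
    "momentum": 0.9,
    "loss_threshold": 0.5,     # assumed value; the patent only says "threshold"
}

def should_stop(iteration, loss, cfg=train_config):
    """Stopping rule from the training step: stop when the loss is at or below
    the threshold, or when the maximum iteration count is reached."""
    return loss <= cfg["loss_threshold"] or iteration >= cfg["max_iterations"]
```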
YOLOv4 model verification step: verifying the prediction model through the verification data set to obtain a model score, evaluating the model, screening out the model with the optimal prediction performance through model evaluation, and marking the model as a final model;
and a target detection step: and monitoring the multi-scale object target scene by using the final model, and generating alarm information when a specific target object is monitored.
Further, in the data collection step, the multi-scale object target scene is a traffic law enforcement monitoring and management area; the different time points comprise at least 6 scenes: morning congested, morning clear, afternoon congested, afternoon clear, evening congested and evening clear. Congested and clear scenes are divided according to the number of vehicle targets on the road: a road section is counted as congested when the monitoring camera captures more than 50 vehicles in one scene, which in a city center typically happens during commuting peaks such as 8:00-10:00, 16:00-18:00 and 18:00-20:00, while at other times, or near the suburbs, traffic is generally clear. The different time points preferably also cover different weather and different seasons at the location of the multi-scale object target scene. The objects in the video image data include any combination of: car, police car, taxi, van, bus, minibus, coach, single-person electric vehicle, express delivery electric vehicle, truck, sanitation truck, tank truck, engineering vehicle, fire truck, ambulance, police motorcycle, other non-motor vehicles and pedestrians.
Further, in the data integration step, the collected video image data are placed in the same folder; in the YOLOv4 model verification step, model evaluation is performed with three indexes: recall, precision and mean average precision.
Further, in the data annotation step, the integrated video image data is annotated for deep model training to form source data, and the annotation scope includes: the image path, the image name, the image width and height, the image depth, the annotated object name and the xy coordinate values of the bbox; the annotated object names include any combination of: car, police car, taxi, van, bus, minibus, coach, single-person electric vehicle, express delivery electric vehicle, truck, sanitation truck, tank truck, engineering vehicle, fire truck, ambulance, police motorcycle, other non-motor vehicles and pedestrians. Preferably, the annotation tool LabelImg can be used for data annotation, since its output essentially contains all the information of an object detection scene, including the picture name, picture size, picture storage path, target position coordinates and target category name. Of course, other annotation tools, such as Labelme, may also be used.
Further, in the data partitioning step, the ratio of the training data set to the validation data set is 3:1, 7:3, 8:2 or 98:2. For a small sample set (e.g., 10000 samples), the training and validation data sets are typically split 7:3 or 8:2; for a large sample set (e.g., 1000000 samples), the proportion of the validation data set can be correspondingly smaller, e.g., 98:2.
Further, in the data dividing step, frames are randomly sampled from each video, with the same number of samples drawn per scene to keep the data distribution uniform, and the training-to-test sampling ratio for each video is 3:1.
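A minimal sketch of this per-video split, assuming the frames are already grouped by their source video; the data layout and function name are illustrative, not the patent's code.

```python
import random

def split_per_video(frames_by_video, train_ratio=0.75, seed=0):
    """Randomly split each video's frames 3:1 into training and validation
    sets, so every scene contributes the same proportion of samples."""
    rng = random.Random(seed)
    train, val = [], []
    for video_id, frames in frames_by_video.items():
        frames = list(frames)
        rng.shuffle(frames)                  # random sampling within the video
        cut = int(len(frames) * train_ratio)
        train.extend(frames[:cut])
        val.extend(frames[cut:])
    return train, val
```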
Further, in the multi-scale feature map allocation step, large, medium and small multi-scale feature maps are dynamically allocated for different vehicle types.
Further, in the multi-scale feature map allocation step, prior box sizes are obtained with K-means clustering, 3 prior boxes are set for each downsampling scale, and 9 prior box sizes are clustered in total; on the COCO dataset these 9 prior boxes are: (10x13), (16x30), (33x23), (30x61), (62x45), (59x119), (116x90), (156x198) and (373x326); in dynamic assignment, the 13x13 feature map uses the prior boxes (116x90), (156x198), (373x326); the 26x26 feature map uses (30x61), (62x45), (59x119); and the 52x52 feature map uses (10x13), (16x30), (33x23).
Further, in the target detection step, the traffic law enforcement monitoring and management area is monitored with the final model, a camera serves as the model input, and alarm information is generated when target objects of different sizes are identified.
Further, in the YOLOv4 model prediction step, the model takes a picture, a video or a camera IP address as input, and the model output is the target detection result. Although the input forms differ, the principle is always to process and analyze pictures: if the input is a video or camera IP address, one frame is read from the video or camera and used as the model input to produce the target detection result, and once the analysis finishes the next frame is read and analyzed in the same way.
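A minimal sketch of this frame-by-frame loop using OpenCV; the detect() and alarm() callables are hypothetical stand-ins for the trained YOLOv4 model and the alarm pipeline, and the handling of a still picture relies on cv2.VideoCapture reading it as a one-frame stream.

```python
import cv2  # pip install opencv-python

def run_detection(source, detect, alarm):
    """Read frames from a picture, video file or camera IP/RTSP address and
    feed each frame to the detector, raising an alarm on any detection.

    source: path or URL accepted by cv2.VideoCapture.
    detect: callable(frame) -> list of (class_name, confidence, bbox).
    alarm:  callable(detections, frame) invoked when detections exist.
    """
    cap = cv2.VideoCapture(source)
    while cap.isOpened():
        ok, frame = cap.read()          # read one frame
        if not ok:                      # stream ended or read failed
            break
        detections = detect(frame)      # model inference on this frame
        if detections:
            alarm(detections, frame)    # push the alarm information
    cap.release()
```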
Further, in the YOLOv4 model verification step, model evaluation is performed with three indexes: recall, precision and mean average precision (mAP).
The second purpose of the invention is realized by adopting the following technical scheme:
an intelligent law enforcement multi-scale target detection device based on the YOLOv4 algorithm, comprising: one or more processors, and a memory for storing one or more computer programs which, when executed by the one or more processors, perform the target detection step of the first objective: monitoring the multi-scale object target scene with the final model, and generating alarm information when a specific target object is detected;
wherein the process of establishing the final model comprises the data collection step, data integration step, data annotation step, data division step, multi-scale feature map allocation step, YOLOv4 model training step and YOLOv4 model verification step of the first objective.
The third purpose of the invention is realized by adopting the following technical scheme:
an intelligent law enforcement multi-scale target detection system based on a YOLOv4 algorithm, comprising:
an image acquisition device for acquiring image data to be analyzed;
a computing device coupled with the image acquisition device and comprising: one or more processors, and a memory for storing one or more computer programs which, when executed by the one or more processors, perform the target detection step of the first objective: monitoring the multi-scale object target scene with the final model, and generating alarm information when a specific target object is detected;
an alarm device coupled with the computing device and configured to issue the alarm information;
wherein the process of establishing the final model comprises the data collection step, data integration step, data annotation step, data division step, multi-scale feature map allocation step, YOLOv4 model training step and YOLOv4 model verification step of the first objective.
The fourth purpose of the invention is realized by adopting the following technical scheme:
a computer readable storage medium having one or more computer programs stored thereon, wherein the one or more computer programs, when executed by one or more processors, perform the target detection step of the first objective: monitoring the multi-scale object target scene with the final model, and generating alarm information when a specific target object is detected;
wherein the process of establishing the final model comprises the data collection step, data integration step, data annotation step, data division step, multi-scale feature map allocation step, YOLOv4 model training step and YOLOv4 model verification step of the first objective.
Compared with the prior art, the invention has the beneficial effects that:
(1) The intelligent law enforcement multi-scale target detection method based on the YOLOv4 algorithm is suitable for complex scenes, can be applied to the field of intelligent law enforcement monitoring and management, and constitutes a multi-scale object target detection and alarm method for intelligent law enforcement monitoring in complex scenes. The present invention employs the prior-detection system of YOLOv4, reusing classifiers or localizers to perform the detection task and applying the model dynamically to multiple locations and scales of the image. In addition, a completely different approach is used relative to other object detection methods: a single neural network is applied to the entire image; the network divides the image into regions and predicts a bounding box and probability for each region, and the bounding boxes are weighted by the predicted probabilities. This model has advantages over classifier-based systems. Unlike R-CNN, which requires thousands of evaluations for a single image, YOLOv4 predicts with a single network evaluation, which makes it very fast: typically 1000 times faster than R-CNN and 100 times faster than Fast R-CNN.
(2) The intelligent law enforcement multi-scale target detection method based on the YOLOv4 algorithm is fast and effective, can process multi-scale and large-scale data, supports multiple languages and custom loss functions, and is a strong alternative for monitoring multi-scale targets in intelligent law enforcement. The specific analysis is as follows:
High speed: softmax is discarded and multi-scale prediction is performed with anchor bboxes;
Cross-platform: suitable for Windows, Linux, macOS and several cloud platforms;
Multi-language: supports C++, Python, R, Java, Scala, Julia, etc.;
Good results: has won many data science and machine learning challenges and is used in production by many companies.
(3) The intelligent law enforcement multi-scale target detection method based on the YOLOv4 algorithm further has the following advantages:
dynamically allocating anchors: and acquiring training data, performing data fitting on a training target, dynamically analyzing the characteristics of the anchor in different scales through big data fitting, and dynamically setting the value of the anchor.
Designed YOLOv4 network structure: the multi-scale detection branches in YOLOv4 are designed so that, through multi-scale feature input, the problems of missed and false detections are alleviated, the accuracy of target detection is effectively improved, and the overall recognition effect is improved. Relative to YOLOv3, its backbone network meets the following requirements: first, a higher input resolution, which improves the detection accuracy for small objects; second, more layers, which enlarge the receptive field to match the larger input; third, more parameters, which improve the ability to detect targets of multiple sizes in a single image. Overall, accuracy improves by nearly 10 points with a modest speed gain.
Drawings
Fig. 1 is a flowchart of the intelligent law enforcement multi-scale target detection method based on the YOLOv4 algorithm according to embodiment 1 of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and the detailed description. It should be noted that, provided there is no conflict, the embodiments or technical features described below can be combined arbitrarily to form new embodiments.
Example 1
As shown in fig. 1, an intelligent law enforcement oriented multi-scale object target detection method based on the YOLOv4 algorithm includes the following steps:
A data collection step: collecting video image data of the intelligent law enforcement monitoring scene at different time points and from different angles; the specific operations are as follows:
(1) Collection time points: collection is divided into 6 scenes: morning congested, morning clear, afternoon congested, afternoon clear, evening congested and evening clear. Congested and clear scenes are divided according to the number of vehicle targets on the road: a road section is counted as congested when the monitoring camera captures more than 50 vehicles in one scene, which in a city center typically happens during commuting peaks such as 8:00-10:00, 16:00-18:00 and 18:00-20:00, while at other times, or near the suburbs, traffic is generally clear.
(2) Collection locations: overpasses near construction areas, the city center, stations and other key sites (such as fire departments, environmental protection departments and areas near hospitals), because the shooting range there best matches the height of intelligent traffic law enforcement cameras.
(3) Collection mode and quantity: 30-second videos covering one-way, two-way and lateral road angles are shot from the overpasses at each location. Since video generally runs at about 30 frames per second, the frame-grabbing rate is set to 3 frames per second, and at least 100 videos are collected, so the data set contains at least about 10000 video frame pictures.
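A minimal OpenCV sketch of this frame extraction at 3 frames per second from roughly 30 fps video; the paths and function name are placeholders, not the patent's code.

```python
import cv2

def extract_frames(video_path, out_dir, sample_fps=3):
    """Grab sample_fps frames per second from a video (e.g., 3 of every 30)."""
    cap = cv2.VideoCapture(video_path)
    src_fps = cap.get(cv2.CAP_PROP_FPS) or 30.0    # fall back to 30 fps
    step = max(int(round(src_fps / sample_fps)), 1)
    index = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:                       # keep every step-th frame
            cv2.imwrite(f"{out_dir}/{saved:06d}.jpg", frame)
            saved += 1
        index += 1
    cap.release()
    return saved
```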
(4) Collected objects: the video image data include any combination of: car, police car, taxi, van, bus, minibus, coach, single-person electric vehicle, express delivery electric vehicle, truck, sanitation truck, tank truck, engineering vehicle, fire truck, ambulance, police motorcycle, other non-motor vehicles and pedestrians.
A data integration step: the collected video image data are integrated. A VOC2007 folder is newly created under the data directory, and three folders, Annotations, ImageSets and JPEGImages, are created under the VOC2007 folder; a Main folder is then created under ImageSets. The collected data set pictures are copied into the JPEGImages directory.
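A minimal sketch that creates this Pascal VOC directory layout and copies the collected pictures in; the root path is a placeholder.

```python
import os
import shutil

def build_voc_layout(root="VOCdevkit/VOC2007", pictures=()):
    """Create the VOC2007 folders described above and fill JPEGImages."""
    for sub in ("Annotations", "ImageSets/Main", "JPEGImages"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    for pic in pictures:  # copy the collected data set pictures
        shutil.copy(pic, os.path.join(root, "JPEGImages"))
```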
A data annotation step: the integrated video image data is annotated for deep model training to form source data. The specific operations are as follows:
(1) Tool: the annotation tool used is LabelImg, which generates xml annotation files;
(2) Data set numbering: to organize the data and reduce the chance of errors, the roughly 10000 video frame pictures are numbered randomly and given sensible sequence numbers, e.g., 000001-009999;
(3) Data annotation: data are annotated with the LabelImg software, and each picture name corresponds to an xml annotation file of the same name, e.g., picture 000001.jpg has the annotation file 000001.xml. The annotation scope includes: the image path, the image name (e.g., 000001.jpg), the image width and height, the image depth, the annotated object name and the xy coordinate values of the bbox. The annotated object names (class labels) are: 1. car; 2. police-car; 3. taxi; 4. van; 5. bus; 6. minibus; 7. coach (passenger coach); 8. electric-vehicle (single-person electric vehicle); 9. express-vehicle (express delivery electric vehicle); 10. truck; 11. sanitation-truck; 12. tanker-truck; 13. engineering-truck; 14. fire-truck; 15. ambulance; 16. police-motorcycle; 17. others (other non-motor vehicles); 18. pedestrian.
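A minimal sketch that reads one such xml file with Python's standard library and collects the annotated names and bbox coordinates; the field names follow the Pascal VOC format that LabelImg writes.

```python
import xml.etree.ElementTree as ET

def read_voc_annotation(xml_path):
    """Parse a LabelImg/VOC xml file into (width, height, objects)."""
    root = ET.parse(xml_path).getroot()
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "name": obj.find("name").text,    # e.g. "police-car"
            "xmin": int(box.find("xmin").text),
            "ymin": int(box.find("ymin").text),
            "xmax": int(box.find("xmax").text),
            "ymax": int(box.find("ymax").text),
        })
    return width, height, objects
```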
A data dividing step: dividing the source data according to a preset proportion, wherein a training data set accounts for 75%, and a verification data set accounts for 25%;
A multi-scale feature map allocation step: prior box sizes are obtained by K-means clustering on the vehicle picture data set, with 3 prior boxes per downsampling scale and 9 prior box sizes clustered in total. On the COCO dataset these 9 prior boxes are: (10x13), (16x30), (33x23), (30x61), (62x45), (59x119), (116x90), (156x198), (373x326). In dynamic assignment, the larger prior boxes (116x90), (156x198), (373x326) are applied on the smallest 13x13 feature map (largest receptive field), suitable for detecting larger objects; the medium prior boxes (30x61), (62x45), (59x119) are applied on the medium 26x26 feature map (medium receptive field), suitable for detecting medium-sized objects; and the smaller prior boxes (10x13), (16x30), (33x23) are applied on the larger 52x52 feature map (smaller receptive field), suitable for detecting smaller objects.
TABLE 1 Feature map multi-scale assignment

Feature map  Receptive field  Prior boxes                      Suitable targets
13x13        large            (116x90), (156x198), (373x326)   large objects
26x26        medium           (30x61), (62x45), (59x119)       medium objects
52x52        small            (10x13), (16x30), (33x23)        small objects
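Expressed as a lookup table for later per-scale decoding, the assignment in Table 1 could be held as follows; the structure is illustrative rather than the patent's code.

```python
# Prior boxes (w, h) per feature-map scale, as in Table 1.
PRIORS_BY_SCALE = {
    13: [(116, 90), (156, 198), (373, 326)],  # largest receptive field -> large objects
    26: [(30, 61), (62, 45), (59, 119)],      # medium receptive field -> medium objects
    52: [(10, 13), (16, 30), (33, 23)],       # smallest receptive field -> small objects
}
```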
YOLOv4 model training step: YOLOv4 is used for training and learning on the training data set, operating as follows: (1) the extracted multi-size feature maps are input into the CSPDarknet53 backbone network, where CSPDarknet53 is the Darknet53 backbone of YOLOv3 with a CSPNet (Cross Stage Partial network) added; by integrating the gradient changes into the feature maps from end to end, it reduces the amount of computation while preserving accuracy, yielding a feature pyramid at multiple scales; (2) the multi-scale features are input into the SPPNet (Spatial Pyramid Pooling network), whose purpose is to enlarge the receptive field of the network (i.e., the recognition area of the target); feature dimensionality reduction is achieved by alternately connecting convolution and pooling layers, yielding dimension-reduced features; (3) information fusion between shallow and deep features is accelerated through the PANet path aggregation network, yielding fusion features at different scales; (4) the training result is finally output through the fully connected layer and comprises bounding-box regression coordinates, the target classification result and confidence scores; (5) the loss function value is calculated from the corresponding results; the loss function comprises three parts: bounding-box regression loss, classification loss and confidence loss. The parameters are set to a maximum of 50000 iterations, an initial learning rate of 0.01, a batch size of 32, a weight decay of 0.0005 and a momentum of 0.9; the learning rate and batch size are adjusted appropriately according to the downward trend of the loss value, training stops when the loss function value output on the training data set is less than or equal to the threshold or the set maximum number of iterations is reached, and the trained network model is obtained and recorded as the prediction model;
YOLOv4 model verification step: the prediction model is verified on the validation data set to obtain a model score, the model is evaluated, and the model with the best prediction performance is selected through model evaluation. Model evaluation verifies the quality of the model with three indexes:
Recall (R): how many of the positive samples are predicted correctly;
Precision (P): how many of the samples predicted as positive are truly positive;
Mean average precision (mAP): the mAP is the mean of the APs (average precision) of all classes, and measures how good the model is on average across all classes.
Their calculation formulas are as follows:
R = TP/(TP+FN); P = TP/(TP+FP); AP = ∫P(R)dR, and mAP is the mean of the per-class APs
where:
TP (True Positives): true positive samples (positive samples correctly predicted as positive);
TN (True Negatives): true negative samples (negative samples correctly predicted as negative);
FP (False Positives): false positive samples (negative samples wrongly predicted as positive);
FN (False Negatives): false negative samples (positive samples wrongly predicted as negative);
P: the precision;
R: the recall;
AP: the area under the PR curve (PR curve: Precision-Recall curve), measuring how well one class is detected, while mAP measures how well multiple classes are detected.
The AP formula is as follows:
AP = ∫P(R)dR, with R integrated from 0 to 1 (the area under the PR curve)
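A minimal sketch of these metrics, computing P and R from counts and approximating AP as the area under a sampled PR curve; the inputs are hypothetical, and a real evaluator would build the PR curve by sweeping the detection confidence threshold.

```python
def precision_recall(tp, fp, fn):
    """P = TP/(TP+FP), R = TP/(TP+FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def average_precision(recalls, precisions):
    """Approximate AP = integral of P(R) dR over a sampled PR curve.

    recalls must be sorted ascending; precisions are the matching P values.
    """
    ap = 0.0
    for i in range(1, len(recalls)):
        ap += (recalls[i] - recalls[i - 1]) * precisions[i]  # rectangle rule
    return ap

def mean_average_precision(ap_per_class):
    """mAP is the mean of the per-class APs."""
    return sum(ap_per_class) / len(ap_per_class)
```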
in this example, the accuracy was 94.23%, the recall was 93.82%, and the average accuracy value was 89.35%.
A target detection step: the final model is used to monitor intelligent law enforcement multi-scale targets, with an intelligent law enforcement monitoring camera as the model input. When target objects of different scales (such as cars, police cars, taxis, vans, buses, minibuses, coaches, single-person electric vehicles, express delivery electric vehicles, trucks, sanitation trucks, tank trucks, engineering vehicles, fire trucks, ambulances, police motorcycles and other non-motor vehicles) are identified, alarm information is pushed, achieving the effect of monitoring multi-scale targets for intelligent law enforcement.
With the intelligent law enforcement oriented multi-scale object target detection method based on the YOLOv4 algorithm described above, when the multi-scale feature maps are allocated, the prior boxes are divided by size through unsupervised K-means clustering on vehicle pictures from the sample scenes: the target sizes are divided into large, medium and small, corresponding to the 3 groups of prior boxes respectively, so the method can detect targets at both near and far distances across multiple scenes. By contrast, a conventional target detection network generally only detects targets of similar sizes and easily overlooks small objects in a large scene.
Example 2
An intelligent law enforcement multi-scale target detection device based on the YOLOv4 algorithm, comprising: one or more processors, and a memory for storing one or more computer programs which, when executed by the one or more processors, perform the following target detection step: monitoring a multi-scale object target scene with the final model, and generating alarm information when a specific target object is detected;
wherein, the establishment process of the final model is as follows:
a data collection step: collecting video image data of different time points and different angles of a multi-scale object target scene;
a data integration step: integrating the collected video image data;
data labeling: marking the integrated video image data and forming source data;
a data dividing step: dividing the source data into a training data set and a verification data set according to a preset proportion;
a multi-scale feature map allocation step: for the picture data set, prior box sizes are obtained with a K-means clustering algorithm, whose flow is as follows: (1) randomly select 9 prior-box center points from the data set as centroids; (2) calculate the Euclidean distance between each prior-box center point and each centroid, and assign each point to the set of its nearest centroid; (3) after grouping there are 3 sets (large, medium and small), and the centroid of each set is recalculated; (4) set thresholds of different sizes for the large, medium and small resolutions; if the distance between the new centroids and the original centroids is smaller than the set threshold, the algorithm terminates, otherwise steps 2-4 are iterated; finally, prior boxes of 9 sizes are clustered according to the different scales.
YOLOv4 model training step: YOLOv4 is used for training and learning on the training data set, operating as follows: (1) the extracted multi-size feature maps are input into the CSPDarknet53 backbone network, where CSPDarknet53 is the Darknet53 backbone of YOLOv3 with a CSPNet (Cross Stage Partial network) added; by integrating the gradient changes into the feature maps from end to end, it reduces the amount of computation while preserving accuracy, yielding a feature pyramid at multiple scales; (2) the multi-scale features are input into the SPPNet (Spatial Pyramid Pooling network), whose purpose is to enlarge the receptive field of the network (i.e., the recognition area of the target); feature dimensionality reduction is achieved by alternately connecting convolution and pooling layers, yielding dimension-reduced features; (3) information fusion between shallow and deep features is accelerated through the PANet path aggregation network, yielding fusion features at different scales; (4) the training result is finally output through the fully connected layer and comprises bounding-box regression coordinates, the target classification result and confidence scores; (5) the loss function value is calculated from the corresponding results; the loss function comprises three parts: bounding-box regression loss, classification loss and confidence loss. The parameters are set to a maximum of 50000 iterations, an initial learning rate of 0.01, a batch size of 32, a weight decay of 0.0005 and a momentum of 0.9; the learning rate and batch size are adjusted appropriately according to the downward trend of the loss value, training stops when the loss function value output on the training data set is less than or equal to the threshold or the set maximum number of iterations is reached, and the trained network model is obtained and recorded as the prediction model;
YOLOv4 model verification step: the prediction model is verified on the validation data set, the model with the best prediction performance is selected through model evaluation, and it is recorded as the final model.
In some embodiments, the processor may include various processing circuitry, such as, but not limited to, one or more of a central processor or a communications processor. The processor may perform control of at least one other component of the multi-scale object target detection apparatus, and/or perform operations or data processing related to communications. The memory may include volatile and/or non-volatile memory. The multi-scale object detection device may include, for example, a smart phone, a tablet computer, a desktop computer, an e-book reader, an MP3 player, an electronic bracelet, a smart watch, and the like.
Example 3
An intelligent law enforcement multi-scale target detection system based on a YOLOv4 algorithm, comprising:
an image acquisition device for acquiring image data to be analyzed;
a computing device coupled with the image acquisition device and comprising: one or more processors, and a memory for storing one or more computer programs which, when executed by the one or more processors, perform the following target detection step: monitoring a multi-scale object target scene with the final model, and generating alarm information when a specific target object is detected;
an alarm device coupled to the computing device and configured to issue the alarm information.
Wherein, the establishment process of the final model is as follows:
a data collection step: collecting video image data of different time points and different angles of a multi-scale object target scene;
a data integration step: integrating the collected video image data;
data labeling: marking the integrated video image data and forming source data;
a data dividing step: dividing the source data into a training data set and a verification data set according to a preset proportion;
a multi-scale feature map allocation step: for the picture data set, prior box sizes are obtained with a K-means clustering algorithm, whose flow is as follows: (1) randomly select 9 prior-box center points from the data set as centroids; (2) calculate the Euclidean distance between each prior-box center point and each centroid, and assign each point to the set of its nearest centroid; (3) after grouping there are 3 sets (large, medium and small), and the centroid of each set is recalculated; (4) set thresholds of different sizes for the large, medium and small resolutions; if the distance between the new centroids and the original centroids is smaller than the set threshold, the algorithm terminates, otherwise steps 2-4 are iterated; finally, prior boxes of 9 sizes are clustered according to the different scales.
YOLOv4 model training step: YOLOv4 is used for training and learning on the training data set, operating as follows: (1) the extracted multi-size feature maps are input into the CSPDarknet53 backbone network, where CSPDarknet53 is the Darknet53 backbone of YOLOv3 with a CSPNet (Cross Stage Partial network) added; by integrating the gradient changes into the feature maps from end to end, it reduces the amount of computation while preserving accuracy, yielding a feature pyramid at multiple scales; (2) the multi-scale features are input into the SPPNet (Spatial Pyramid Pooling network), whose purpose is to enlarge the receptive field of the network (i.e., the recognition area of the target); feature dimensionality reduction is achieved by alternately connecting convolution and pooling layers, yielding dimension-reduced features; (3) information fusion between shallow and deep features is accelerated through the PANet path aggregation network, yielding fusion features at different scales; (4) the training result is finally output through the fully connected layer and comprises bounding-box regression coordinates, the target classification result and confidence scores; (5) the loss function value is calculated from the corresponding results; the loss function comprises three parts: bounding-box regression loss, classification loss and confidence loss. The parameters are set to a maximum of 50000 iterations, an initial learning rate of 0.01, a batch size of 32, a weight decay of 0.0005 and a momentum of 0.9; the learning rate and batch size are adjusted appropriately according to the downward trend of the loss value, training stops when the loss function value output on the training data set is less than or equal to the threshold or the set maximum number of iterations is reached, and the trained network model is obtained and recorded as the prediction model;
YOLOv4 model verification step: the prediction model is verified on the validation data set, the model with the best prediction performance is selected through model evaluation, and it is recorded as the final model.
In some embodiments, the computing device may be in wired or wireless connection with the image acquisition device. The warning device may be integrated with the computing device or the warning device and the computing device may be 2 separate components. The computing device may include, for example, a smart phone, a tablet computer, a desktop computer, an electronic book reader, an MP3 player, an electronic bracelet, a smart watch, and the like.
Example 4
A computer readable storage medium having one or more computer programs stored thereon, wherein the one or more computer programs, when executed by one or more processors, implement the following target detection step: monitoring a multi-scale object target scene with the final model, and generating alarm information when a specific target object is detected.
Wherein, the establishment process of the final model is as follows:
a data collection step: collecting video image data of different time points and different angles of a multi-scale object target scene;
a data integration step: integrating the collected video image data;
data labeling: marking the integrated video image data and forming source data;
a data dividing step: dividing the source data into a training data set and a verification data set according to a preset proportion;
a multi-scale feature map allocation step: for the picture data set, prior box sizes are obtained with a K-means clustering algorithm, whose flow is as follows: (1) randomly select 9 prior-box center points from the data set as centroids; (2) calculate the Euclidean distance between each prior-box center point and each centroid, and assign each point to the set of its nearest centroid; (3) after grouping there are 3 sets (large, medium and small), and the centroid of each set is recalculated; (4) set thresholds of different sizes for the large, medium and small resolutions; if the distance between the new centroids and the original centroids is smaller than the set threshold, the algorithm terminates, otherwise steps 2-4 are iterated; finally, prior boxes of 9 sizes are clustered according to the different scales.
YOLOv4 model training step: YOLOv4 is used for training and learning on the training data set, operating as follows: (1) the extracted multi-size feature maps are input into the CSPDarknet53 backbone network, where CSPDarknet53 is the Darknet53 backbone of YOLOv3 with a CSPNet (Cross Stage Partial network) added; by integrating the gradient changes into the feature maps from end to end, it reduces the amount of computation while preserving accuracy, yielding a feature pyramid at multiple scales; (2) the multi-scale features are input into the SPPNet (Spatial Pyramid Pooling network), whose purpose is to enlarge the receptive field of the network (i.e., the recognition area of the target); feature dimensionality reduction is achieved by alternately connecting convolution and pooling layers, yielding dimension-reduced features; (3) information fusion between shallow and deep features is accelerated through the PANet path aggregation network, yielding fusion features at different scales; (4) the training result is finally output through the fully connected layer and comprises bounding-box regression coordinates, the target classification result and confidence scores; (5) the loss function value is calculated from the corresponding results; the loss function comprises three parts: bounding-box regression loss, classification loss and confidence loss. The parameters are set to a maximum of 50000 iterations, an initial learning rate of 0.01, a batch size of 32, a weight decay of 0.0005 and a momentum of 0.9; the learning rate and batch size are adjusted appropriately according to the downward trend of the loss value, training stops when the loss function value output on the training data set is less than or equal to the threshold or the set maximum number of iterations is reached, and the trained network model is obtained and recorded as the prediction model;
YOLOv4 model verification step: the prediction model is verified on the validation data set, the model with the best prediction performance is selected through model evaluation, and it is recorded as the final model.
In some embodiments, the computer readable medium may include, for example, a hard disk, a floppy disk, a magnetic medium, an optical recording medium, a DVD, a magneto-optical medium, and the like.
The above embodiments are only preferred embodiments of the present invention, and the protection scope of the present invention is not limited thereby, and any insubstantial changes and substitutions made by those skilled in the art based on the present invention are within the protection scope of the present invention.

Claims (10)

1. A method for detecting an intelligent law-enforcement multi-scale target based on a YOLOv4 algorithm is characterized by comprising the following steps:
a data collection step: collecting video image data of different time points and different angles of a multi-scale object target scene;
a data integration step: integrating the collected video image data;
data labeling: marking the integrated video image data and forming source data;
a data dividing step: dividing the source data into a training data set and a verification data set according to a preset proportion;
a multi-scale feature map allocation step: obtaining prior box sizes for the picture data set with a K-means clustering algorithm, and clustering prior boxes of 9 sizes according to different scales;
YOLOv4 model training step: YOLOv4 is used for training and learning on the training data set, operating as follows: (1) the extracted feature maps of the 9 sizes are input into the CSPDarknet53 backbone network, where CSPDarknet53 is the Darknet53 backbone of YOLOv3 with a CSPNet network added; (2) the multi-scale features are input into the SPPNet network; (3) information fusion between shallow and deep features is accelerated through the PANet path aggregation network to obtain fusion features at different scales; (4) the training result is finally output through the fully connected layer; (5) the loss function value is calculated from the corresponding results, the learning rate and batch size are adjusted according to the downward trend of the loss function value, and training stops when the loss function value output on the training data set is less than or equal to a threshold or the set maximum number of iterations is reached, giving a trained network model recorded as the prediction model;
YOLOv4 model verification step: verifying the prediction model through the verification data set, screening out a model with optimal prediction performance through model evaluation, and marking as a final model;
and a target detection step: and monitoring the multi-scale object target scene by using the final model, and generating alarm information when a specific target object is monitored.
2. The intelligent law enforcement multi-scale target detection method based on the YOLOv4 algorithm according to claim 1, wherein in the data collection step, the multi-scale object target scene is a traffic law enforcement monitoring and management area; the different time points comprise at least 6 scenes: morning congested, morning clear, afternoon congested, afternoon clear, evening congested and evening clear; and the objects in the video image data include any combination of: car, police car, taxi, van, bus, minibus, coach, single-person electric vehicle, express delivery electric vehicle, truck, sanitation truck, tank truck, engineering vehicle, fire truck, ambulance, police motorcycle, other non-motor vehicles and pedestrians.
3. The method of claim 1, wherein in the data integration step, the collected video image data are placed in the same folder; and in the YOLOv4 model verification step, model evaluation is performed with three indexes: recall, precision and mean average precision.
4. The method as claimed in claim 2, wherein in the data annotation step, the integrated video image data is annotated to form source data, and the annotation scope includes: the image path, the image name, the image width and height, the image depth, the annotated object name and the xy coordinate values of the bbox; the annotated object names include any combination of: car, police car, taxi, van, bus, minibus, coach, single-person electric vehicle, express delivery electric vehicle, truck, sanitation truck, tank truck, engineering vehicle, fire truck, ambulance, police motorcycle, other non-motor vehicles and pedestrians.
5. The method of claim 1, wherein in the step of data partitioning, the ratio of the training data set to the validation data set is 3:1, 7:3, 8:2 or 98: 2.
6. The intelligent law enforcement multi-scale target detection method based on the YOLOv4 algorithm according to claim 1, wherein in the multi-scale feature map allocation step, prior box sizes are obtained with K-means clustering, 3 prior boxes are set for each downsampling scale, and 9 prior box sizes are clustered in total; on the COCO dataset these 9 prior boxes are: (10x13), (16x30), (33x23), (30x61), (62x45), (59x119), (116x90), (156x198) and (373x326); in dynamic assignment, the 13x13 feature map uses the prior boxes (116x90), (156x198), (373x326); the 26x26 feature map uses (30x61), (62x45), (59x119); and the 52x52 feature map uses (10x13), (16x30), (33x23).
7. The method of claim 1, wherein in the target detection step, the traffic law enforcement monitoring and management area is monitored with the final model, a camera serves as the model input, and alarm information is generated when target objects of different sizes are identified.
8. An intelligent law enforcement multi-scale target detection device based on a YOLOv4 algorithm, comprising: one or more processors, and memory for storing one or more computer programs which, when executed by the one or more processors, perform the object detection steps of claim 1: monitoring the multi-scale object target scene by using the final model, and generating alarm information when a specific target object is monitored;
the process of establishing the final model comprises the steps of data collection, data integration, data annotation, data division, multi-scale feature map allocation, Yolov4 model training and Yolov4 model verification as claimed in claim 1.
9. An intelligent law enforcement multi-scale target detection system based on the YOLOv4 algorithm, comprising:
an image acquisition device for acquiring image data to be analyzed;
a computing device coupled with the image acquisition device and comprising: one or more processors, and a memory storing one or more computer programs which, when executed by the one or more processors, perform the target detection step of claim 1: monitoring the multi-scale object target scene by using the final model, and generating alarm information when a specific target object is detected;
an alert device coupled with the computing device and configured to output the alarm information as an alert;
wherein the process of establishing the final model comprises the data collection, data integration, data annotation, data partitioning, multi-scale feature map allocation, YOLOv4 model training and YOLOv4 model verification steps of claim 1.
10. A computer-readable storage medium having one or more computer programs stored thereon, wherein the one or more computer programs, when executed by one or more processors, implement the target detection step of claim 1: monitoring the multi-scale object target scene by using the final model, and generating alarm information when a specific target object is detected;
wherein the process of establishing the final model comprises the data collection, data integration, data annotation, data partitioning, multi-scale feature map allocation, YOLOv4 model training and YOLOv4 model verification steps of claim 1.
CN202010989852.3A 2020-09-18 2020-09-18 Intelligent enforcement multi-scale target detection method, device and system based on YOLOv4 algorithm and storage medium Pending CN112052826A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010989852.3A CN112052826A (en) 2020-09-18 2020-09-18 Intelligent enforcement multi-scale target detection method, device and system based on YOLOv4 algorithm and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010989852.3A CN112052826A (en) 2020-09-18 2020-09-18 Intelligent enforcement multi-scale target detection method, device and system based on YOLOv4 algorithm and storage medium

Publications (1)

Publication Number Publication Date
CN112052826A (en) 2020-12-08

Family

ID=73604140

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010989852.3A Pending CN112052826A (en) 2020-09-18 2020-09-18 Intelligent enforcement multi-scale target detection method, device and system based on YOLOv4 algorithm and storage medium

Country Status (1)

Country Link
CN (1) CN112052826A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615868A (en) * 2018-12-20 2019-04-12 北京以萨技术股份有限公司 A kind of video frequency vehicle based on deep learning is separated to stop detection method
CN110059554A (en) * 2019-03-13 2019-07-26 重庆邮电大学 A kind of multiple branch circuit object detection method based on traffic scene
CN110136449A (en) * 2019-06-17 2019-08-16 珠海华园信息技术有限公司 Traffic video frequency vehicle based on deep learning disobeys the method for stopping automatic identification candid photograph
CN110889324A (en) * 2019-10-12 2020-03-17 南京航空航天大学 Thermal infrared image target identification method based on YOLO V3 terminal-oriented guidance

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Alexey Bochkovskiy et al., "YOLOv4: Optimal Speed and Accuracy of Object Detection", arXiv:2004.10934v1 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465072A (en) * 2020-12-22 2021-03-09 浙江工业大学 Excavator image identification method based on YOLOv4 model
CN112465072B (en) * 2020-12-22 2024-02-13 浙江工业大学 Excavator image recognition method based on YOLOv4 model
CN112308054A (en) * 2020-12-29 2021-02-02 广东科凯达智能机器人有限公司 Automatic reading method of multifunctional digital meter based on target detection algorithm
CN112785557A (en) * 2020-12-31 2021-05-11 神华黄骅港务有限责任公司 Belt material flow detection method and device and belt material flow detection system
CN112802302A (en) * 2020-12-31 2021-05-14 国网浙江省电力有限公司双创中心 Electronic fence method and system based on multi-source algorithm
CN113033604A (en) * 2021-02-03 2021-06-25 淮阴工学院 Vehicle detection method, system and storage medium based on SF-YOLOv4 network model
CN113033604B (en) * 2021-02-03 2022-11-15 淮阴工学院 Vehicle detection method, system and storage medium based on SF-YOLOv4 network model
CN112560816A (en) * 2021-02-20 2021-03-26 北京蒙帕信创科技有限公司 Equipment indicator lamp identification method and system based on YOLOv4
CN112989606A (en) * 2021-03-16 2021-06-18 上海哥瑞利软件股份有限公司 Data algorithm model checking method, system and computer storage medium
CN113221646A (en) * 2021-04-07 2021-08-06 山东捷讯通信技术有限公司 Method for detecting abnormal objects of urban underground comprehensive pipe gallery based on Scaled-YOLOv4
CN112990131A (en) * 2021-04-27 2021-06-18 广东科凯达智能机器人有限公司 Method, device, equipment and medium for acquiring working gear of voltage change-over switch
CN113158962A (en) * 2021-05-06 2021-07-23 北京工业大学 Swimming pool drowning detection method based on YOLOv4
CN113420607A (en) * 2021-05-31 2021-09-21 西南电子技术研究所(中国电子科技集团公司第十研究所) Multi-scale target detection and identification method for unmanned aerial vehicle
CN113298032A (en) * 2021-06-16 2021-08-24 武汉卓目科技有限公司 Unmanned aerial vehicle visual angle image vehicle target detection method based on deep learning
CN113516643A (en) * 2021-07-13 2021-10-19 重庆大学 Method for detecting retinal vessel bifurcation and intersection points in OCTA image

Similar Documents

Publication Publication Date Title
CN112052826A (en) Intelligent enforcement multi-scale target detection method, device and system based on YOLOv4 algorithm and storage medium
CN109816024B (en) Real-time vehicle logo detection method based on multi-scale feature fusion and DCNN
Al-qaness et al. An improved YOLO-based road traffic monitoring system
Kenk et al. DAWN: vehicle detection in adverse weather nature dataset
CN109241349B (en) Monitoring video multi-target classification retrieval method and system based on deep learning
CN110738857B (en) Vehicle violation evidence obtaining method, device and equipment
CN113033604B (en) Vehicle detection method, system and storage medium based on SF-YOLOv4 network model
CN111507989A (en) Training generation method of semantic segmentation model, and vehicle appearance detection method and device
CN113822247B (en) Method and system for identifying illegal building based on aerial image
CN106446150A (en) Method and device for precise vehicle retrieval
CN104615986A (en) Method for utilizing multiple detectors to conduct pedestrian detection on video images of scene change
CN107862072B (en) Method for analyzing vehicle urban-entering fake plate crime based on big data technology
CN115424217A (en) AI vision-based intelligent vehicle identification method and device and electronic equipment
CN112785610B (en) Lane line semantic segmentation method integrating low-level features
Kiew et al. Vehicle route tracking system based on vehicle registration number recognition using template matching algorithm
EP3244344A1 (en) Ground object tracking system
CN116413740B (en) Laser radar point cloud ground detection method and device
CN110765900B (en) Automatic detection illegal building method and system based on DSSD
Kamenetsky et al. Aerial car detection and urban understanding
CN111832463A (en) Deep learning-based traffic sign detection method
Amala Ruby Florence et al. Accident Detection System Using Deep Learning
Caballo et al. YOLO-based Tricycle Detection from Traffic Video
Hou et al. Application of YOLO V2 in construction vehicle detection
CN109800685A (en) The determination method and device of object in a kind of video
CN112052824A (en) Gas pipeline specific object target detection alarm method, device and system based on YOLOv3 algorithm and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201208)