CN114842428A - Smart traffic-oriented complex multi-target hierarchical combined accurate detection method - Google Patents

Smart traffic-oriented complex multi-target hierarchical combined accurate detection method

Info

Publication number
CN114842428A
CN114842428A (application CN202210337923.0A)
Authority
CN
China
Prior art keywords
dimensional
prediction
target
detection model
category
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210337923.0A
Other languages
Chinese (zh)
Inventor
张晖
滕婷婷
赵海涛
朱洪波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202210337923.0A (CN114842428A)
Priority to JP2022077903A (JP7320307B1)
Publication of CN114842428A
Current legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 - Target detection
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 - Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 - Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Geophysics And Detection Of Objects (AREA)

Abstract

The invention provides a complex multi-target hierarchical joint accurate detection method for intelligent traffic. The method first constructs three types of target detection models for single-dimensional scenes, then selects the corresponding single-dimensional time detection model and single-dimensional weather detection model according to the current time category and weather category, and performs hierarchical joint detection with these two models and the m3 single-dimensional target detection models. Finally, a hierarchical joint mechanism serves as the standard for this joint detection: in the first-layer union, the target category is determined successively at two levels, dimension and probability; in the second-layer union, the target category is determined successively at two levels, conformity and probability. The invention can be widely applied to hierarchical joint accurate detection of complex multi-target traffic in the field of machine vision, achieves accurate all-day, all-weather detection of multiple targets while controlling cost, and has a very broad application prospect.

Description

Smart traffic-oriented complex multi-target hierarchical combined accurate detection method
Technical Field
The invention relates to a complex multi-target hierarchical joint accurate detection method for intelligent traffic, and belongs to the field of machine vision.
Background
In recent years, the rapid development of deep learning and the advent of high-performance graphics cards have greatly advanced computer vision technology. Deep-learning-based target detection extracts target features automatically, eliminating time-consuming manual analysis, design and extraction, and improves detection accuracy and scene applicability to a great extent. As a result, deep-learning-based vehicle and pedestrian detection has become a hot research topic.
A vehicle and pedestrian detection algorithm solves the following problem: find all vehicles and pedestrians in an image or video frame, including their positions and sizes, which are generally represented by rectangular boxes. Most existing vehicle and pedestrian detection methods innovate on the detection network, the feature extraction method and the like. These methods focus on target behavior in the picture or video, but such behavior information is limited, and the scene information and the correlation between scene and target behavior are often ignored, even though they directly influence the accuracy of vehicle and pedestrian detection algorithms.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a complex multi-target hierarchical joint accurate detection method for intelligent traffic that achieves accurate all-day, all-weather detection of multiple targets while controlling cost.
In order to solve the technical problems, the invention adopts the technical scheme that:
the invention provides a smart traffic-oriented complex multi-target hierarchical combined accurate detection method, wherein a smart traffic environment is a three-dimensional scene space comprising time-class, weather-class and target-class scenes, and the method comprises the following specific steps:
step 1, training YOLOv3 separately for the three single-dimensional scenes, namely the time-class, weather-class and target-class scenes, to obtain three types of single-dimensional-scene target detection models: single-dimensional time detection models, single-dimensional weather detection models and single-dimensional target detection models;
step 2, respectively carrying out target detection on the image to be detected based on the single-dimensional time detection model, the single-dimensional weather detection model and the single-dimensional target detection model;
and step 3, performing hierarchical union on the target detection results obtained in step 2, and outputting the final detection result of the image to be detected.
Further, the three types of single-dimensional-scene target detection models in step 1 are constructed as follows:
(1) according to requirements, time-class single-dimensional scenes are divided into m1 categories, weather-class single-dimensional scenes into m2 categories, and target-class single-dimensional scenes into m3+1 categories;
(2) for each category of the time-class single-dimensional scene, label training samples of every target category and train YOLOv3, obtaining m1 single-dimensional time detection models; for each category of the weather-class single-dimensional scene, label training samples of every target category and train YOLOv3, obtaining m2 single-dimensional weather detection models; for each category of the target-class single-dimensional scene, label training samples of that category and train YOLOv3, obtaining m3 single-dimensional target detection models. Each single-dimensional time detection model and each single-dimensional weather detection model has m3+1 outputs, corresponding to the m3 target categories plus the category 'other'; each single-dimensional target detection model has 2 outputs, corresponding to 1 target category plus the category 'other'.
Further, in step 2, the single-dimensional time detection model and the single-dimensional weather detection model corresponding to the categories of the image to be detected in the time-class and weather-class single-dimensional scenes are selected to perform target detection on the image to be detected.
Further, the hierarchical union in step 3 is specifically as follows:
(1) fuse the output results of the m3 single-dimensional target detection models with the output results of the single-dimensional time detection model and, in parallel, with the output results of the single-dimensional weather detection model; this is the first-layer union;
(2) further fuse the two fusion results of the first-layer union; this is the second-layer union.
Further, in the first-layer union, the output results of the m3 single-dimensional target detection models are superimposed on those of the single-dimensional time detection model/single-dimensional weather detection model, and the superposition result is processed as follows before the fusion result is output:
(I) mutually overlapping prediction boxes are merged into one prediction box, and non-overlapping prediction boxes are kept unchanged; the merging principles are as follows:
(1) if, among the mutually overlapping prediction boxes, the box from the single-dimensional time detection model/single-dimensional weather detection model has the same category as some box from a single-dimensional target detection model and that category is not 'other', the merged prediction box is assigned that category;
(2) if, among the mutually overlapping prediction boxes, the box from the single-dimensional time detection model/single-dimensional weather detection model and all boxes from the single-dimensional target detection models have the category 'other', the merged prediction box is assigned the category 'other';
(3) if, among the mutually overlapping prediction boxes, the category of the box from the single-dimensional time detection model/single-dimensional weather detection model differs from that of every box from the single-dimensional target detection models, take the maximum of P_T(j_l) (resp. P_W(j_l)), P_u(j_k) and P_u(other); the category corresponding to this maximum is taken as the category of the merged prediction box, and the probability of that category is the maximum itself. Here P_T(j_l) (resp. P_W(j_l)) denotes the probability that the box from the single-dimensional time (resp. weather) detection model has category j_l, j_l being one of the m3+1 outputs of the single-dimensional time detection model/single-dimensional weather detection model; P_u(j_k) denotes the joint probability that the boxes from the single-dimensional target detection models have category j_k, j_k being one of the m3 targets; and P_u(other) denotes the joint probability that the boxes from the single-dimensional target detection models have the category 'other':
P_u(j_k) = P_{O_k}(j_k) × Π_{k'≠k} P_{O_k'}(other), k = 1, 2, …, m3
P_u(other) = Π_{k=1..m3} P_{O_k}(other)
where P_{O_k}(j_k) denotes the probability that the single-dimensional target detection model whose outputs include j_k outputs j_k, and P_{O_k}(other) denotes the probability that it outputs 'other'; if that model outputs no prediction box, P_{O_k}(j_k) = 0 and P_{O_k}(other) = 1.
(4) if none of the mutually overlapping prediction boxes comes from the single-dimensional time detection model/single-dimensional weather detection model, then: (a) if all the prediction boxes have the same category, the merged prediction box is assigned the category 'other'; (b) if there is a prediction box B whose category differs from the others, the merged prediction box is assigned the category of box B;
(II) a single prediction box that overlaps no other prediction box is kept for the time being, with the probability of its category unchanged.
Further, the second-layer union is specifically as follows: superimpose the two fusion results of the first-layer union, and process the superposition result as follows before outputting the final detection result:
(I) mutually overlapping prediction boxes are merged into one prediction box, whose category is decided by the following rules:
(1) if the mutually overlapping prediction boxes have the same category, the merged prediction box keeps that category;
(2) if the mutually overlapping prediction boxes have different categories, compare their corresponding conformities:
(a) if the conformities differ, the merged prediction box takes the category and probability of the box with the smaller conformity;
(b) if the conformities are equal, the merged prediction box takes the category and probability of the box with the higher probability;
(II) for a single prediction box that overlaps no other prediction box, delete it if the probability of its category is below the false-detection threshold; otherwise keep it with its probability unchanged.
Further, in the first-layer/second-layer union, when the mutually overlapping prediction boxes have the same category, the probability of the category of the merged prediction box is updated; the updated probability is
[probability-update formula, shown only as an image in the original, expressed in terms of P_o, q and δ]
where q is the number of mutually overlapping prediction boxes belonging to the same category, P_o is the probability of the category of the o-th such prediction box, and δ is an offset value.
Further, the false-detection threshold d1 is defined by the formula:
d1 = d + δ × BV
where d is the false-detection base threshold, δ is a coefficient, and BV is the background difference:
BV = ( Σ_i |H_Current,i - H_Base,i| + Σ_i |S_Current,i - S_Base,i| + Σ_i |V_Current,i - V_Base,i| ) / (H + S + V)
where H_Current,i, S_Current,i, V_Current,i denote the numbers of pixels of the image to be detected whose hue H, saturation S and lightness V components take the value i; H_Base,i, S_Base,i, V_Base,i denote the corresponding counts for the reference image; and H + S + V denotes the total count of the H, S, V components in the image to be detected and the reference image, used for normalization.
Further, the conformity falls into two classes, time-class single-dimensional scene conformity and weather-class single-dimensional scene conformity:
time-class single-dimensional scene conformity of the image to be detected: APM_TCurrent = |ADER_T - DER_Current|
weather-class single-dimensional scene conformity of the image to be detected: APM_WCurrent = |ADER_W - DER_Current|
where ADER_T is the average dynamic change rate of the time-class single-dimensional scene, ADER_W is the average dynamic change rate of the weather-class single-dimensional scene, and DER_Current is the dynamic change rate of the image to be detected;
if APM_TCurrent > APM_WCurrent, the category of the merged prediction box is the category output by the single-dimensional weather detection model; otherwise it is the category output by the single-dimensional time detection model.
Further, the dynamic change rate is the average of the gray-distribution change rates between the image to be detected and its preceding and following frames, the gray-distribution change rate between the image to be detected and the preceding/following frame being
Σ_r |PR_gray,r,C - PR_gray,r,F|
where PR_gray,r,C is the proportion of pixels with gray value r among all pixels of the image to be detected, and PR_gray,r,F is the proportion of pixels with gray value r among all pixels of the preceding/following frame;
the average dynamic change rate of a time-class/weather-class single-dimensional scene is the average of the dynamic change rates over several groups of 3 consecutive frames.
Beneficial effects: the intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method provided by the invention achieves accurate all-day, all-weather detection of multiple targets while controlling cost. It first constructs the three types of single-dimensional-scene target detection models; then, using the time-class and weather-class scene dimensions as prior knowledge, selects the single-dimensional time detection model and single-dimensional weather detection model corresponding to the current time and weather categories; next performs hierarchical joint detection with these two models and the m3 single-dimensional target detection models; and finally provides a hierarchical joint detection mechanism as the standard for the joint detection, in which the first-layer union determines the target category successively at two levels, dimension and probability (in decreasing priority), and the second-layer union determines the target category successively at two levels, conformity and probability (in decreasing priority). The invention can be widely applied to hierarchical joint accurate detection of complex multi-target traffic in the field of machine vision, achieves accurate detection of multiple targets while controlling cost, and has a very broad application prospect.
Drawings
FIG. 1 is a schematic flow chart of the construction of the three types of single-dimensional-scene detection models;
FIG. 2 is a schematic flow diagram of the hierarchical union;
FIG. 3 is a schematic flow diagram of the first-layer union;
FIG. 4 is a schematic flow diagram of the second-layer union;
fig. 5 is a schematic flow chart of a complex multi-target hierarchical combined accurate detection method for intelligent traffic.
Detailed Description
In order to describe the intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method in more detail, it is further described below with reference to the accompanying drawings and specific embodiments.
In one embodiment, as shown in fig. 5, a method for smart traffic-oriented complex multi-target hierarchical joint accurate detection is provided, in which the smart traffic environment is regarded as a three-dimensional scene space including time-class, weather-class and target-class scenes, and the specific steps are as follows:
step 1, training YOLOv3 separately for the three single-dimensional scenes, namely the time-class, weather-class and target-class scenes, to obtain three types of single-dimensional-scene target detection models: single-dimensional time detection models, single-dimensional weather detection models and single-dimensional target detection models;
step 2, respectively carrying out target detection on the image to be detected based on the single-dimensional time detection model, the single-dimensional weather detection model and the single-dimensional target detection model;
and step 3, performing hierarchical union on the target detection results obtained in step 2, and outputting the final detection result of the image to be detected.
In an embodiment, as shown in fig. 1, the three types of single-dimensional-scene target detection models in step 1 are constructed as follows:
(1) according to requirements, time-class single-dimensional scenes are divided into m1 categories, weather-class single-dimensional scenes into m2 categories, and target-class single-dimensional scenes into m3+1 categories;
(2) for each category of the time-class single-dimensional scene, label training samples of every target category and train YOLOv3, obtaining m1 single-dimensional time detection models; for each category of the weather-class single-dimensional scene, label training samples of every target category and train YOLOv3, obtaining m2 single-dimensional weather detection models; for each category of the target-class single-dimensional scene, label training samples of that category and train YOLOv3, obtaining m3 single-dimensional target detection models. Each single-dimensional time detection model and each single-dimensional weather detection model has m3+1 outputs, corresponding to the m3 target categories plus the category 'other'; each single-dimensional target detection model has 2 outputs, corresponding to 1 target category plus the category 'other'.
In one embodiment, step 1 comprises the steps of:
s101, constructing a three-dimensional scene space;
aiming at the traffic complex environment: different time scenarios such as day/night/morning/evening, etc., different weather scenarios such as sunny/cloudy/rainy/snowy, etc., and traffic multiple targets: the thought method for constructing a three-dimensional scene space is provided for target types such as motor vehicles, non-motor vehicles, pedestrians and other types, and the three-dimensional scenes are respectively as follows: time, weather and target single-dimensional scenes, wherein the time single-dimensional scenes can be divided into early morning/evening/…, and m1 types are provided; the weather type single-dimensional scenes can be divided into sunny days/cloudy days/…, and m2 types, and the target type single-dimensional scenes can be divided into motor vehicles/non-motor vehicles/pedestrians/…/others, and m3+1 types.
Step S102, sample data is selected;
and respectively selecting corresponding sample data for the time class, the weather class and the target class single-dimensional scenes.
Step S103, marking a sample;
and performing sample labeling on the sample data collected in the step S102 by adopting a data labeling tool labellimg.
Step S104, training a model;
and (3) using the labeled and sorted data set for YOLOv3 model training to obtain m1 single-dimensional time detection models, m2 single-dimensional weather detection models and m3 single-dimensional target detection models. Wherein the outputs of the single-dimensional time detection model/the single-dimensional weather detection model are motor vehicles/non-motor vehicles/pedestrians/…/others, and the total number of the outputs is m3+ 1; the output of the single-dimensional object detection model is one of "automotive/other", "non-automotive/other", "pedestrian/other" ….
In an embodiment, in step 2, the single-dimensional time detection model and the single-dimensional weather detection model corresponding to the categories of the image to be detected in the time-class and weather-class single-dimensional scenes are selected to perform target detection on the image to be detected.
In one embodiment, as shown in fig. 2, the hierarchical union in step 3 mainly comprises the following steps:
Step S201, first-layer union
Fuse the output results of the m3 single-dimensional target detection models with those of the single-dimensional time detection model, and likewise with those of the single-dimensional weather detection model; this is the first-layer joint detection.
Step S202, second-layer union
Further fuse the two fusion results of the first-layer joint detection and output the result; this is the second-layer joint detection.
In one embodiment, because each single-dimensional target detection model is labeled and trained for a single category of the target-class single-dimensional scene, it identifies targets in the image to be detected more accurately than the single-dimensional time detection model/single-dimensional weather detection model: when the detected target is j_k, the model trained for j_k identifies it as j_k with neither false detection nor missed detection, and when the target is not j_k it produces no false detection but may miss detections (here j_k is any one of motor vehicle, non-motor vehicle, pedestrian, …, but not 'other', 1 ≤ k ≤ m3). The single-dimensional time detection model/single-dimensional weather detection model may falsely detect any of the m3+1 target categories, but misses detections only when the category is 'other'. In the first-layer union, the output results of the m3 single-dimensional target detection models are superimposed on those of the single-dimensional time detection model/single-dimensional weather detection model, the superposition result is processed according to the flow shown in fig. 3, and the fusion result is output:
(I) mutually overlapping prediction boxes are merged into one prediction box, and non-overlapping prediction boxes are kept unchanged; the merging principles are as follows:
(1) if, among the mutually overlapping prediction boxes, the box from the single-dimensional time detection model/single-dimensional weather detection model and some box from a single-dimensional target detection model both have category A, the merged prediction box is assigned category A. Here A is any one of motor vehicle, non-motor vehicle, pedestrian, …, but not 'other'.
(2) if, among the mutually overlapping prediction boxes, the box from the single-dimensional time detection model/single-dimensional weather detection model and all boxes from the single-dimensional target detection models have the category 'other', the merged prediction box is assigned the category 'other'.
In the above two cases (1) and (2), the probability of the category of the merged prediction box is updated as
[probability-update formula, shown only as an image in the original, expressed in terms of P_o, q and δ]
where q is the number of mutually overlapping prediction boxes belonging to the same category, P_o is the probability of the category of the o-th such prediction box, and δ is an offset value.
(3) if, among the mutually overlapping prediction boxes, the category of the box from the single-dimensional time detection model/single-dimensional weather detection model differs from that of every box from the single-dimensional target detection models, take the maximum of P_T(j_l) (resp. P_W(j_l')), P_u(j_k) and P_u(other); the category corresponding to this maximum is taken as the category of the merged prediction box, and the probability of that category is the maximum itself.
In general, when the detection target is j_k, the probability that the single-dimensional target detection model outputs category j_k is denoted P_{O_k}(j_k), and the probability that it outputs the category 'other' is denoted P_{O_k}(other), 1 ≤ k ≤ m3. In particular, when no prediction box is output for a target, i.e. a missed detection, P_{O_k}(j_k) = 0 and P_{O_k}(other) = 1.
In general, the probability that the single-dimensional time detection model outputs target category j_l is denoted P_T(j_l), where T stands for time and j_l is one of the m3+1 target categories. In particular, when no prediction box is output, i.e. a missed detection, j_l is 'other' with probability P_T(other) = 1.
In general, the probability that the single-dimensional weather detection model outputs target category j_l' is denoted P_W(j_l'), where W stands for weather and j_l' is one of the m3+1 target categories; note that j_l and j_l' may be the same or different. In particular, when no prediction box is output, i.e. a missed detection, j_l' is 'other' with probability P_W(other) = 1.
Thus the joint probability that the single-dimensional target detection models output category j_k is calculated as:
P_u(j_k) = P_{O_k}(j_k) × Π_{k'≠k} P_{O_k'}(other)
and the joint probability that the single-dimensional target detection models output 'other' is calculated as:
P_u(other) = Π_{k=1..m3} P_{O_k}(other)
(4) if none of the mutually overlapping prediction boxes comes from the single-dimensional time detection model/single-dimensional weather detection model, then: (a) if all the prediction boxes have the same category, the merged prediction box is assigned the category 'other'; (b) if there is a prediction box A whose category differs from the others, the merged prediction box is assigned the category of box A;
(II) a single prediction box that overlaps no other prediction box is kept for the time being, with the probability of its category unchanged.
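By way of illustration only, the first-layer merging rules (1)-(4) for one group of mutually overlapping prediction boxes can be sketched as follows; the box representation, the overlap grouping (e.g. by IoU) assumed to have been done beforehand, and the product form of the joint probabilities reconstructed above are illustrative assumptions, and the patent's probability-update formula (an image in the original) is replaced by a max() placeholder:

```python
# Sketch of the first-layer union for one group of mutually overlapping
# prediction boxes. Each box is assumed to be a dict
# {"source": "scene" or "target", "model": j_k for target boxes,
#  "category": str, "prob": float}.

def joint_probs(target_boxes, target_categories):
    """P_u(j_k) for each k, and P_u(other); a model with no box in the group
    counts as a missed detection: P_{O_k}(j_k) = 0, P_{O_k}(other) = 1."""
    p_jk = {c: 0.0 for c in target_categories}
    p_ot = {c: 1.0 for c in target_categories}
    for b in target_boxes:
        if b["category"] == "other":
            p_ot[b["model"]] = b["prob"]
        else:
            p_jk[b["model"]] = b["prob"]
    p_u = {}
    for k in target_categories:
        prod = p_jk[k]
        for k2 in target_categories:
            if k2 != k:
                prod *= p_ot[k2]
        p_u[k] = prod
    p_u_other = 1.0
    for k in target_categories:
        p_u_other *= p_ot[k]
    return p_u, p_u_other

def first_layer_merge(group, target_categories):
    """Category/probability of the merged box for one overlap group."""
    scene = next((b for b in group if b["source"] == "scene"), None)
    targets = [b for b in group if b["source"] == "target"]
    if scene is not None:
        agree = [t for t in targets if t["category"] == scene["category"]]
        if agree and scene["category"] != "other":          # rule (1)
            # The probability-update formula is an image in the original;
            # max() is used here as a simple placeholder.
            return scene["category"], max(b["prob"] for b in [scene] + agree)
        if scene["category"] == "other" and all(t["category"] == "other" for t in targets):
            return "other", max(b["prob"] for b in group)   # rule (2), same placeholder
        p_u, p_u_other = joint_probs(targets, target_categories)  # rule (3)
        candidates = dict(p_u)
        candidates["other"] = p_u_other
        sc = scene["category"]
        candidates[sc] = max(candidates.get(sc, 0.0), scene["prob"])
        best = max(candidates, key=candidates.get)
        return best, candidates[best]
    # Rule (4): no scene-model box in the group.
    categories = {t["category"] for t in targets}
    if len(categories) == 1:                                # rule (4)(a)
        return "other", max(t["prob"] for t in targets)
    odd = next(t for t in targets                           # rule (4)(b)
               if sum(x["category"] == t["category"] for x in targets) == 1)
    return odd["category"], odd["prob"]
```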
In one embodiment, the second-layer union superimposes the two fusion results of the first-layer union and, after processing the superposition result according to the flow shown in fig. 4, outputs the final detection result:
(I) mutually overlapping prediction boxes are merged into one prediction box, whose category is decided by the following rules:
(1) if the mutually overlapping prediction boxes have the same category, the merged prediction box keeps that category; in this case the probability of the category of the merged prediction box is updated as
[probability-update formula, shown only as an image in the original, expressed in terms of P_o, q and δ]
where q is the number of mutually overlapping prediction boxes belonging to the same category, P_o is the probability of the category of the o-th such prediction box, and δ is an offset value.
(2) if the mutually overlapping prediction boxes have different categories, compare their corresponding conformities:
(a) if the conformities differ, the merged prediction box takes the category and probability of the box with the smaller conformity;
(b) if the conformities are equal, the merged prediction box takes the category and probability of the box with the higher probability;
(II) for a single prediction box that overlaps no other prediction box, delete it if the probability of its category is below the false-detection threshold; otherwise keep it with its probability unchanged.
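By way of illustration only, the second-layer decision can be sketched as follows, under the same illustrative box representation; each box is assumed to carry the conformity APM of the scene dimension (time or weather) of the first-layer result it came from, and d1 is the false-detection threshold defined below:

```python
# Sketch of the second-layer union for one pair of overlapping boxes,
# one from each first-layer fusion result (illustrative assumptions as above).

def second_layer_merge(box_time, box_weather):
    """Merge two overlapping second-layer boxes per rules (1)-(2)."""
    if box_time["category"] == box_weather["category"]:
        # Rule (1): same category. The probability-update formula is an
        # image in the original; max() is a simple placeholder.
        return box_time["category"], max(box_time["prob"], box_weather["prob"])
    # Rule (2)(a): smaller conformity APM (closer scene match) wins.
    if box_time["apm"] != box_weather["apm"]:
        winner = box_time if box_time["apm"] < box_weather["apm"] else box_weather
    else:
        # Rule (2)(b): equal conformity, higher probability wins.
        winner = box_time if box_time["prob"] >= box_weather["prob"] else box_weather
    return winner["category"], winner["prob"]

def filter_isolated(box, d1):
    """Rule (II): a non-overlapping box survives only if its prob reaches d1."""
    return box if box["prob"] >= d1 else None
```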
In one embodiment, the false-detection threshold d1 is defined by the formula:
d1 = D(BV) = d + δ × BV
where d is the false-detection base threshold, δ is a coefficient, and BV is the background difference:
BV = ( Σ_i |H_Current,i - H_Base,i| + Σ_i |S_Current,i - S_Base,i| + Σ_i |V_Current,i - V_Base,i| ) / (H + S + V)
where H_Current,i, S_Current,i, V_Current,i denote the numbers of pixels of the image to be detected whose hue H, saturation S and lightness V components take the value i; H_Base,i, S_Base,i, V_Base,i denote the corresponding counts for the reference image; and H + S + V denotes the total count of the H, S, V components in the image to be detected and the reference image, used for normalization.
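By way of illustration only, BV and d1 can be computed as follows with OpenCV histograms; the normalized sum-of-absolute-differences form of BV is the reconstruction given above (the exact expression appears only as an image in the original), and the values of d and δ are illustrative:

```python
# Sketch: background difference BV and false-detection threshold d1.
import cv2
import numpy as np

def background_difference(img_bgr, base_bgr):
    cur = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    base = cv2.cvtColor(base_bgr, cv2.COLOR_BGR2HSV)
    diff, total = 0.0, 0.0
    for ch, bins in ((0, 180), (1, 256), (2, 256)):   # H, S, V component counts
        h_cur = cv2.calcHist([cur], [ch], None, [bins], [0, bins]).ravel()
        h_base = cv2.calcHist([base], [ch], None, [bins], [0, bins]).ravel()
        diff += np.abs(h_cur - h_base).sum()
        total += h_cur.sum() + h_base.sum()           # the H + S + V normalizer
    return diff / total

def false_detection_threshold(img_bgr, base_bgr, d=0.3, delta=0.2):
    """d1 = d + delta * BV; d and delta are illustrative values."""
    return d + delta * background_difference(img_bgr, base_bgr)
```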
In one embodiment, the conformity is obtained from the dynamic change rate, defined as the average of the gray-distribution change rates between the image to be detected and its preceding and following frames, where the gray-distribution change rate between the image to be detected and the preceding/following frame is
Σ_r |PR_gray,r,C - PR_gray,r,F|
where PR_gray,r,C is the proportion of pixels with gray value r among all pixels of the image to be detected, and PR_gray,r,F is the proportion of pixels with gray value r among all pixels of the preceding/following frame. The dynamic change rate is thus the rate of change of the gray distributions of 3 consecutive frames relative to each other.
The conformity is computed from the average dynamic change rate ADER_T of the time scene corresponding to the image to be detected, the average dynamic change rate ADER_W of the corresponding weather scene, and the dynamic change rate DER_Current of the image to be detected:
APM_TCurrent = |ADER_T - DER_Current|
APM_WCurrent = |ADER_W - DER_Current|
where APM_TCurrent is the time-class single-dimensional scene conformity of the image to be detected and APM_WCurrent is its weather-class single-dimensional scene conformity. The average dynamic change rate of a time-class/weather-class single-dimensional scene is the average of the dynamic change rates over several groups of 3 consecutive frames.
Because the current scene is the intersection of the two dimensions time and weather, the conformity is obtained by taking the absolute difference between the dynamic change rate of the current scene and the average dynamic change rate of the corresponding time scene and of the corresponding weather scene. It is then judged whether the image to be detected is closer to the corresponding time scene or to the corresponding weather scene; the closer scene describes the image more accurately, so the model built for that scene predicts the target more accurately. Therefore, if APM_TCurrent > APM_WCurrent, the output of the weather scene model governs; otherwise the output of the time scene model governs.
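By way of illustration only, the dynamic change rate and the governing-scene decision can be sketched as follows, again under the assumption (the exact formula being an image in the original) that the gray-distribution change rate is the sum of absolute differences of normalized 256-bin gray histograms:

```python
# Sketch: dynamic change rate DER and scene conformity APM.
import cv2
import numpy as np

def gray_hist(img_bgr):
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    hist = cv2.calcHist([gray], [0], None, [256], [0, 256]).ravel()
    return hist / hist.sum()                     # PR_gray,r

def dynamic_change_rate(prev_bgr, cur_bgr, next_bgr):
    """Average gray-distribution change rate of the current frame against
    its preceding and following frames (3 consecutive frames)."""
    h_prev, h_cur, h_next = map(gray_hist, (prev_bgr, cur_bgr, next_bgr))
    rate_prev = np.abs(h_cur - h_prev).sum()
    rate_next = np.abs(h_cur - h_next).sum()
    return (rate_prev + rate_next) / 2.0

def choose_governing_scene(der_current, ader_time, ader_weather):
    """APM_T > APM_W means the weather model's output governs."""
    apm_t = abs(ader_time - der_current)
    apm_w = abs(ader_weather - der_current)
    return "weather" if apm_t > apm_w else "time"
```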
The above is only an embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any modification or substitution that a person skilled in the art could readily conceive shall be covered by the present invention, whose protection scope is therefore defined by the claims.

Claims (10)

1. An intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method, characterized in that the intelligent traffic environment is a three-dimensional scene space comprising time-class, weather-class and target-class scenes, and the method specifically comprises the following steps:
step 1, training YOLOv3 separately for the three single-dimensional scenes, namely the time-class, weather-class and target-class scenes, to obtain three types of single-dimensional-scene target detection models: single-dimensional time detection models, single-dimensional weather detection models and single-dimensional target detection models;
step 2, respectively carrying out target detection on the image to be detected based on the single-dimensional time detection model, the single-dimensional weather detection model and the single-dimensional target detection model;
and step 3, performing hierarchical union on the target detection results obtained in step 2, and outputting the final detection result of the image to be detected.
2. The intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method according to claim 1, characterized in that the three types of single-dimensional-scene target detection models in step 1 are constructed as follows:
(1) according to requirements, time-class single-dimensional scenes are divided into m1 categories, weather-class single-dimensional scenes into m2 categories, and target-class single-dimensional scenes into m3+1 categories;
(2) for each category of the time-class single-dimensional scene, label training samples of every target category and train YOLOv3, obtaining m1 single-dimensional time detection models; for each category of the weather-class single-dimensional scene, label training samples of every target category and train YOLOv3, obtaining m2 single-dimensional weather detection models; for each category of the target-class single-dimensional scene, label training samples of that category and train YOLOv3, obtaining m3 single-dimensional target detection models; each single-dimensional time detection model and each single-dimensional weather detection model has m3+1 outputs, corresponding to the m3 target categories plus the category 'other', and each single-dimensional target detection model has 2 outputs, corresponding to 1 target category plus the category 'other'.
3. The intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method according to claim 2, characterized in that in step 2, the single-dimensional time detection model and the single-dimensional weather detection model corresponding to the categories of the image to be detected in the time-class and weather-class single-dimensional scenes are selected to perform target detection on the image to be detected.
4. The intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method according to claim 3, characterized in that the hierarchical union in step 3 is specifically as follows:
(1) fuse the output results of the m3 single-dimensional target detection models with the output results of the single-dimensional time detection model and, in parallel, with the output results of the single-dimensional weather detection model; this is the first-layer union;
(2) further fuse the two fusion results of the first-layer union; this is the second-layer union.
5. The intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method according to claim 4, characterized in that in the first-layer union, the output results of the m3 single-dimensional target detection models are superimposed on those of the single-dimensional time detection model/single-dimensional weather detection model, and the superposition result is processed as follows before the fusion result is output:
(I) mutually overlapping prediction boxes are merged into one prediction box, and non-overlapping prediction boxes are kept unchanged; the merging principles are as follows:
(1) if, among the mutually overlapping prediction boxes, the box from the single-dimensional time detection model/single-dimensional weather detection model has the same category as some box from a single-dimensional target detection model and that category is not 'other', the merged prediction box is assigned that category;
(2) if, among the mutually overlapping prediction boxes, the box from the single-dimensional time detection model/single-dimensional weather detection model and all boxes from the single-dimensional target detection models have the category 'other', the merged prediction box is assigned the category 'other';
(3) if, among the mutually overlapping prediction boxes, the category of the box from the single-dimensional time detection model/single-dimensional weather detection model differs from that of every box from the single-dimensional target detection models, take the maximum of P_T(j_l) (resp. P_W(j_l)), P_u(j_k) and P_u(other); the category corresponding to this maximum is taken as the category of the merged prediction box, and the probability of that category is the maximum itself; wherein P_T(j_l) (resp. P_W(j_l)) denotes the probability that the box from the single-dimensional time (resp. weather) detection model has category j_l, j_l being one of the m3+1 outputs of the single-dimensional time detection model/single-dimensional weather detection model; P_u(j_k) denotes the joint probability that the boxes from the single-dimensional target detection models have category j_k, j_k being one of the m3 targets; and P_u(other) denotes the joint probability that the boxes from the single-dimensional target detection models have the category 'other':
P_u(j_k) = P_{O_k}(j_k) × Π_{k'≠k} P_{O_k'}(other), k = 1, 2, …, m3
P_u(other) = Π_{k=1..m3} P_{O_k}(other)
wherein P_{O_k}(j_k) denotes the probability that the single-dimensional target detection model whose outputs include j_k outputs j_k, and P_{O_k}(other) denotes the probability that it outputs 'other'; if that model outputs no prediction box, P_{O_k}(j_k) = 0 and P_{O_k}(other) = 1;
(4) if none of the mutually overlapping prediction boxes comes from the single-dimensional time detection model/single-dimensional weather detection model, then: (a) if all the prediction boxes have the same category, the merged prediction box is assigned the category 'other'; (b) if there is a prediction box B whose category differs from the others, the merged prediction box is assigned the category of box B;
(II) a single prediction box that overlaps no other prediction box is kept for the time being, with the probability of its category unchanged.
6. The intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method according to claim 4, characterized in that the second-layer union is specifically as follows: superimpose the two fusion results of the first-layer union, and process the superposition result as follows before outputting the final detection result:
(I) mutually overlapping prediction boxes are merged into one prediction box, whose category is decided by the following rules:
(1) if the mutually overlapping prediction boxes have the same category, the merged prediction box keeps that category;
(2) if the mutually overlapping prediction boxes have different categories, compare their corresponding conformities:
(a) if the conformities differ, the merged prediction box takes the category and probability of the box with the smaller conformity;
(b) if the conformities are equal, the merged prediction box takes the category and probability of the box with the higher probability;
(II) for a single prediction box that overlaps no other prediction box, delete it if the probability of its category is below the false-detection threshold; otherwise keep it with its probability unchanged.
7. The intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method according to claim 5 or 6, characterized in that in the first-layer/second-layer union, when the mutually overlapping prediction boxes have the same category, the probability of the category of the merged prediction box is updated; the updated probability is
[probability-update formula, shown only as an image in the original, expressed in terms of P_o, q and δ]
wherein q is the number of mutually overlapping prediction boxes belonging to the same category, P_o is the probability of the category of the o-th such prediction box, and δ is an offset value.
8. The intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method according to claim 7, characterized in that the false-detection threshold d1 is defined by the formula:
d1 = d + δ × BV
wherein d is the false-detection base threshold, δ is a coefficient, and BV is the background difference:
BV = ( Σ_i |H_Current,i - H_Base,i| + Σ_i |S_Current,i - S_Base,i| + Σ_i |V_Current,i - V_Base,i| ) / (H + S + V)
wherein H_Current,i, S_Current,i, V_Current,i denote the numbers of pixels of the image to be detected whose hue H, saturation S and lightness V components take the value i; H_Base,i, S_Base,i, V_Base,i denote the corresponding counts for the reference image; and H + S + V denotes the total count of the H, S, V components in the image to be detected and the reference image.
9. The intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method according to claim 7, characterized in that the conformity falls into two classes, time-class single-dimensional scene conformity and weather-class single-dimensional scene conformity:
time-class single-dimensional scene conformity of the image to be detected: APM_TCurrent = |ADER_T - DER_Current|
weather-class single-dimensional scene conformity of the image to be detected: APM_WCurrent = |ADER_W - DER_Current|
wherein ADER_T is the average dynamic change rate of the time-class single-dimensional scene, ADER_W is the average dynamic change rate of the weather-class single-dimensional scene, and DER_Current is the dynamic change rate of the image to be detected;
if APM_TCurrent > APM_WCurrent, the category of the merged prediction box is the category output by the single-dimensional weather detection model; otherwise it is the category output by the single-dimensional time detection model.
10. The intelligent-traffic-oriented complex multi-target hierarchical joint accurate detection method according to claim 9, characterized in that the dynamic change rate is the average of the gray-distribution change rates between the image to be detected and its preceding and following frames, the gray-distribution change rate between the image to be detected and the preceding/following frame being
Σ_r |PR_gray,r,C - PR_gray,r,F|
wherein PR_gray,r,C is the proportion of pixels with gray value r among all pixels of the image to be detected, and PR_gray,r,F is the proportion of pixels with gray value r among all pixels of the preceding/following frame;
the average dynamic change rate of a time-class/weather-class single-dimensional scene is the average of the dynamic change rates over several groups of 3 consecutive frames.
CN202210337923.0A 2022-03-31 2022-03-31 Smart traffic-oriented complex multi-target hierarchical combined accurate detection method Pending CN114842428A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210337923.0A CN114842428A (en) 2022-03-31 2022-03-31 Smart traffic-oriented complex multi-target hierarchical combined accurate detection method
JP2022077903A JP7320307B1 (en) 2022-03-31 2022-05-11 A Complex Multi-Target Precise Hierarchical Gradient Joint Detection Method for Intelligent Traffic

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210337923.0A CN114842428A (en) 2022-03-31 2022-03-31 Smart traffic-oriented complex multi-target hierarchical combined accurate detection method

Publications (1)

Publication Number Publication Date
CN114842428A 2022-08-02

Family

ID=82564816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210337923.0A Pending CN114842428A (en) 2022-03-31 2022-03-31 Smart traffic-oriented complex multi-target hierarchical combined accurate detection method

Country Status (2)

Country Link
JP (1) JP7320307B1 (en)
CN (1) CN114842428A (en)


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111047879A (en) * 2019-12-24 2020-04-21 苏州奥易克斯汽车电子有限公司 Vehicle overspeed detection method
KR102122850B1 (en) * 2020-03-03 2020-06-15 (주)사라다 Solution for analysis road and recognition vehicle license plate employing deep-learning
CN112487911B (en) * 2020-11-24 2024-05-24 中国信息通信科技集团有限公司 Real-time pedestrian detection method and device based on improvement yolov under intelligent monitoring environment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960266A (en) * 2017-05-22 2018-12-07 阿里巴巴集团控股有限公司 Image object detection method and device
US20200012854A1 (en) * 2017-09-08 2020-01-09 Tencent Technology (Shenzhen) Company Ltd Processing method for augmented reality scene, terminal device, system, and computer storage medium
CN111222574A (en) * 2020-01-07 2020-06-02 西北工业大学 Ship and civil ship target detection and classification method based on multi-model decision-level fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZOU Xiangling et al., "Research on target detection in complex environments in smart video perception", Journal of Henan Radio & TV University, no. 03, 22 August 2017 (2017-08-22) *

Also Published As

Publication number Publication date
JP7320307B1 (en) 2023-08-03
JP2023152231A (en) 2023-10-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination