CN111540203B - Method for adjusting green light passing time based on Faster-RCNN - Google Patents

Method for adjusting green light passing time based on Faster-RCNN

Info

Publication number
CN111540203B
CN111540203B (application CN202010361892.3A)
Authority
CN
China
Prior art keywords
network
rcnn
fast
training
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010361892.3A
Other languages
Chinese (zh)
Other versions
CN111540203A (en)
Inventor
周武能
廖凯立
黄建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Nuclear Furstate Software Technology Co ltd
Donghua University
Original Assignee
Shanghai Nuclear Furstate Software Technology Co ltd
Donghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Nuclear Furstate Software Technology Co ltd, Donghua University filed Critical Shanghai Nuclear Furstate Software Technology Co ltd
Priority to CN202010361892.3A priority Critical patent/CN111540203B/en
Publication of CN111540203A publication Critical patent/CN111540203A/en
Application granted granted Critical
Publication of CN111540203B publication Critical patent/CN111540203B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G08G1/0104: Traffic control systems for road vehicles; detecting movement of traffic to be counted or controlled; measuring and analysing of parameters relative to traffic conditions
    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/24: Pattern recognition; classification techniques
    • G06N3/045: Neural networks; architecture, e.g. interconnection topology; combinations of networks
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G06V10/25: Image preprocessing; determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/454: Local feature extraction; integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V20/54: Surveillance or monitoring of activities, e.g. for recognising suspicious objects, of traffic, e.g. cars on the road, trains or boats
    • G08G1/07: Controlling traffic signals
    • G06V2201/08: Indexing scheme relating to image or video recognition or understanding; detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention relates to a method for adjusting green light passing time based on Faster-RCNN, which comprises the following steps: (1) constructing a waiting vehicle identification model: features are first extracted to obtain a feature map, candidate regions are then extracted to obtain the feature map with candidate boxes of different sizes, the candidate boxes are mapped to small feature maps of fixed size, and a fully connected operation on the small feature maps finally yields a vehicle condition monitoring map with identification boxes; (2) training the waiting vehicle identification model: a training set is established, the feature extraction network is trained, the Fast-RCNN network and RPN network parameters are shared, and model pruning is performed; (3) a vehicle condition monitoring map is collected from the traffic intersection video when the red light countdown is at 20-40 s and input into the waiting vehicle identification model, which outputs the vehicle condition monitoring map with identification boxes; the number of waiting vehicles is obtained by counting the identification boxes, and the green light passing time is adjusted according to the number of waiting vehicles. The model of the invention has a simple structure, is fast, and realizes adaptive signal timing.

Description

Method for adjusting green light passing time based on Faster-RCNN
Technical Field
The invention belongs to the technical field of multi-target identification, and relates to a method for adjusting green light passing time based on Faster-RCNN.
Background
Existing traffic light timing is fixed, so in some situations the green phase is too short and vehicles cannot all pass, or it is too long and vehicles in other directions wait excessively, which causes congestion. There are currently two kinds of solutions: the first predicts the traffic flow at the intersection and re-allocates timing accordingly; the second performs adaptive allocation according to the number of vehicles currently waiting at the intersection. Compared with the first, adaptive timing based on the vehicles waiting at the current intersection places higher demands on the technology and processing equipment, but it is more effective at relieving traffic congestion.
Vehicles waiting at a red light can be modeled as static vehicle detection. Because traffic light timing is a real-time problem and the number of waiting vehicles must be counted for every red phase, accelerating the algorithm is essential; when the processing speed is fast enough, the requirement of real-time detection can be met.
The static multi-target detection problem can be handled by deep learning. Convolutional Neural Networks (CNN) are generally used for image processing and recognize features in an image by learning them through convolutional layers. However, plain CNN-based detection is slow to train and has difficulty expressing the correlation between data; on this basis, the improved Region-based Convolutional Neural Network (RCNN) can make up for this problem.
In rail damage detection, the traditional RCNN algorithm has the drawback that the amount of computation is too large, because CNN inference is performed once for each region proposal. In cloth flaw detection, the improved Fast-RCNN selects candidate boxes with Selective Search and selects the candidate regions on the feature map, which avoids repeated computation for each candidate region and increases speed; however, it still relies on the third-party tool Selective Search to extract candidate regions, and tests show that Selective Search needs about 2 s on a CPU to extract the candidate regions, which increases the recognition time.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a method for adjusting green light passing time based on Faster-RCNN that is short in time consumption. To address the reduced passing efficiency of fixed traffic light timing, an algorithm that identifies and counts waiting vehicles is used for adaptive timing; to address the long recognition time of the traditional RCNN and the time spent calling third-party software in Fast-RCNN, the invention, combined with the specific application background of identifying waiting vehicles, provides a red-light waiting-vehicle counting method based on the Faster-RCNN algorithm and optimized with model pruning.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
the method for adjusting the green light passing time based on the fast-RCNN comprises the following steps:
(1) constructing a waiting vehicle identification model;
(1.1) feature extraction;
the vehicle condition monitoring map is input into a feature extraction network, and the feature extraction network outputs a feature map;
(1.2) extracting a candidate region;
the feature map is taken as the input of the RPN (Region Proposal Network) in the Faster-RCNN network; the RPN generates anchor boxes to judge the feature map, corrects the anchor boxes themselves, and outputs the feature map with candidate boxes of different sizes, where the candidate boxes are the anchor boxes whose candidate areas are judged to be vehicles;
(1.3) mapping;
because the fully connected layers need candidate boxes of the same size, the candidate boxes are mapped onto the feature map obtained in step (1.2) using a Region of Interest Pooling layer (ROI Pooling); the ROI Pooling layer sets a fixed scale, computes the grid size of each sampling, takes the maximum value in each grid, and converts any valid features inside the region of interest into a small feature map with a fixed H x W spatial extent (the small feature map is a sub-map of the feature map; it only needs to be smaller than the original feature map, and the small feature map obtained from each feature-map ROI must have the same size);
(1.4) identifying the vehicle;
a fully connected operation is performed on the small feature maps formed by the region-of-interest pooling layer to obtain a vehicle condition monitoring map with identification boxes (a candidate box and an identification box are not the same concept: a candidate box on the feature map may cover only part of a vehicle, while an identification box corresponds to a whole vehicle); the box regression is completed at the same time as the classification, so the position of each vehicle is obtained, and the waiting vehicle identification model based on Faster-RCNN is formed;
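For illustration only, the following sketch shows how a detector with the structure of step (1) (a VGG-16 feature extraction network, an RPN, a region-of-interest pooling step and a two-class vehicle/background head) could be assembled with the torchvision detection API; the anchor scales, ROI output size and the use of RoIAlign in place of ROI Pooling are assumptions of this sketch, not prescriptions of the invention.

```python
# Minimal sketch of the waiting vehicle identification model of step (1),
# assuming the torchvision detection API; anchor scales, ROI output size and
# RoIAlign (standing in for the ROI Pooling layer) are illustrative choices.
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

# (1.1) feature extraction: the VGG-16 convolutional layers as backbone
backbone = torchvision.models.vgg16(weights="DEFAULT").features
backbone.out_channels = 512          # VGG-16 conv5 outputs 512 channels

# (1.2) candidate-region extraction: RPN anchors with 3 aspect ratios
anchor_generator = AnchorGenerator(
    sizes=((64, 128, 256),),          # assumed anchor scales in pixels
    aspect_ratios=((1.0, 0.5, 2.0),)) # 1:1, 1:2, 2:1 as in the text

# (1.3) mapping: pool every candidate box to a fixed H x W feature map
roi_pooler = MultiScaleRoIAlign(featmap_names=["0"],
                                output_size=7, sampling_ratio=2)

# (1.4) identification: fully connected head, 2 classes (vehicle/background)
model = FasterRCNN(backbone, num_classes=2,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pooler)
```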
(2) training a waiting vehicle identification model;
(2.1) establishing a training set;
a training set for vehicle detection is obtained and labeled manually; the training samples in the training set are divided into 2 classes: a training sample labeled 'vehicle' contains a car, and a training sample labeled 'no vehicle' contains objects other than vehicles;
(2.2) training a feature extraction network;
(2.3) sharing Fast-RCNN network and RPN network parameters;
(2.3.1) initializing convolutional layer parameters of Fast-RCNN network and RPN network by using VGG-16 convolutional neural network, and recording the initialization parameters including weight and bias as W0;
(2.3.2) training an RPN network by taking W0 as a parameter and taking a feature map as an input to obtain a candidate region;
(2.3.3) training a Fast-RCNN network with the candidate regions as input and the feature map with candidate boxes as the theoretical output, and recording the parameters of the Fast-RCNN network at this point as W1; up to this point the two networks do not share any layer parameters;
(2.3.4) initializing a new RPN network with W1 and setting the learning rate of the feature extraction network parameters shared by the new RPN network and the Fast-RCNN network to 0, i.e. only the parameters specific to the new RPN network are learned and the feature extraction network is fixed, so that the two networks share all the common convolutional layers;
(2.3.5) keeping the shared convolutional layers fixed, adding the network layers specific to the Fast-RCNN network, continuing training, and fine-tuning these Fast-RCNN-specific layers, so that the RPN network and the Fast-RCNN network fully share parameters;
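A minimal sketch of the parameter fixing behind steps (2.3.4) and (2.3.5) is given below: setting the learning rate of the shared feature extraction layers to 0 is equivalent to freezing them and optimizing only the network-specific layers. The attribute names (backbone, rpn, roi_heads) follow the torchvision Faster-RCNN layout assumed in the step (1) sketch and are not part of the invention.

```python
# Sketch of the layer freezing used in steps (2.3.4)-(2.3.5); PyTorch-style,
# with torchvision Faster-RCNN attribute names assumed for illustration.
import torch

def freeze_shared_backbone(model):
    """Fix the shared feature extraction layers (learning rate effectively 0)."""
    for p in model.backbone.parameters():
        p.requires_grad = False

def optimizer_for(module, lr=1e-3):
    """Optimizer over the still-trainable parameters of one sub-network."""
    trainable = [p for p in module.parameters() if p.requires_grad]
    return torch.optim.SGD(trainable, lr=lr, momentum=0.9)

# usage, with `model` a torchvision FasterRCNN such as the step (1) sketch:
#   freeze_shared_backbone(model)
#   rpn_optim  = optimizer_for(model.rpn)        # step (2.3.4): RPN-only layers
#   head_optim = optimizer_for(model.roi_heads)  # step (2.3.5): Fast-RCNN head
```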
(2.4) carrying out model pruning;
model pruning is a model compression method that introduces sparsity into the dense connections of a deep neural network and reduces the number of non-zero weights by directly setting 'unimportant' weights to zero; model pruning is divided into magnitude-based pruning and channel-based pruning;
the importance of each weight in the feature extraction network and the Faster-RCNN network is calculated using a Hessian matrix, whose expression is:

$$H=\begin{bmatrix}\dfrac{\partial^2 f}{\partial x_1^2}&\cdots&\dfrac{\partial^2 f}{\partial x_1\,\partial x_n}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1}&\cdots&\dfrac{\partial^2 f}{\partial x_n^2}\end{bmatrix}$$

where x_1 to x_n are the weights of the 1st to n-th neurons, f is the loss function of the feature extraction network or the Faster-RCNN network, and n is the number of neurons in the feature extraction network or the Faster-RCNN network;
the weights with the lowest importance are set to zero and steps (2.2)-(2.3) are repeated, which completes the training of the waiting vehicle identification model based on Faster-RCNN;
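A sketch of the pruning step is shown below. Because forming the full Hessian over all weights is impractical, the sketch approximates its diagonal with squared gradients (an empirical-Fisher approximation) and uses the usual second-order saliency 0.5 * H_ii * w_i^2; the approximation and the pruning ratio are assumptions of this sketch rather than the exact procedure of the invention.

```python
# Sketch of Hessian-based weight pruning for step (2.4); the Hessian diagonal
# is approximated by squared gradients, and the pruning ratio is assumed.
import torch

def prune_lowest_importance(model, loss, prune_ratio=0.05):
    """Zero the weights whose estimated saliency 0.5*H_ii*w_i^2 is lowest."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    saliencies = [0.5 * (g ** 2) * p.detach() ** 2    # ~ 0.5 * H_ii * w_i^2
                  for p, g in zip(params, grads)]
    threshold = torch.quantile(torch.cat([s.flatten() for s in saliencies]),
                               prune_ratio)
    with torch.no_grad():
        for p, s in zip(params, saliencies):
            p.mul_((s >= threshold).float())          # zero the least important
```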
(3) a vehicle condition monitoring map is collected from the traffic intersection video when the red light countdown is at 20-40 s (the capture time depends on how busy the intersection is, i.e. on the overall traffic flow; for example, the area near a business district is busier than a residential district, and the busier the intersection, the later in the countdown the frame is taken, e.g. at 20 s remaining, versus 40 s remaining when the intersection is not busy); the map is input into the waiting vehicle identification model, which outputs the vehicle condition monitoring map with identification boxes; the number of waiting vehicles is obtained by counting the identification boxes, and the green light passing time is adjusted according to the number of waiting vehicles.
As a preferred technical scheme:
according to the method for adjusting the green light passing time based on the fast-RCNN, the feature extraction network is a VGG-16 convolutional neural network, conv1 and conv2 in the VGG-16 convolutional neural network learn some basic features, such as low-level features including colors and edges, conv3 learns complex texture features, such as grid textures, conv4 learns more distinctive features, such as parts of a vehicle body, conv5 learns complete and distinguishing key features, the pooling layer adopts maximum pooling, and the activation function in the neural network is a Relu function.
According to the method for adjusting green light passing time based on Faster-RCNN, the specific process of training the feature extraction network is as follows: labeled training samples are input and, after convolution-pooling-fully-connected operations, all training samples are classified with a Softmax classifier; the classification results are compared with the labels assigned when the training set was established, and the weights and biases of the VGG-16 convolutional neural network are adjusted by back propagation and gradient descent until the loss function is minimal or training reaches a certain number of iterations; the probabilistic prediction h_θ(x_i) made by the Softmax classifier and the loss function Loss are as follows:
$$h_\theta(x_i)=\frac{1}{\sum_{c=1}^{C}e^{\theta_c^{T}x_i}}\begin{bmatrix}e^{\theta_1^{T}x_i}\\ e^{\theta_2^{T}x_i}\\ \vdots\\ e^{\theta_C^{T}x_i}\end{bmatrix}$$

$$\mathrm{Loss}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C}\mathbf{1}\{Y_i=c\}\,\log\frac{e^{\theta_c^{T}x_i}}{\sum_{j=1}^{C}e^{\theta_j^{T}x_i}}$$

where x_i is the input data, i is the index of the input data, C is the number of classes into which the input data are classified, y_i is the predicted class of the corresponding input x_i, θ_1, θ_2, ..., θ_C are the classification parameters of the Softmax classifier, N is the total number of training samples, and Y_i is the category label of x_i.
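The two formulas above are the standard Softmax regression prediction and cross-entropy loss; a small numeric sketch (with toy shapes and data chosen here purely for illustration) is:

```python
# Numeric sketch of h_theta(x_i) and Loss from the formulas above; the toy
# feature dimension and data are illustrative assumptions.
import numpy as np

def softmax_prediction(theta, x):
    """h_theta(x): class probabilities for one sample x (theta: (C, d), x: (d,))."""
    scores = theta @ x
    scores -= scores.max()                  # numerical stability
    e = np.exp(scores)
    return e / e.sum()

def softmax_loss(theta, X, Y):
    """Mean cross-entropy over N samples X (N, d) with integer labels Y (N,)."""
    probs = np.array([softmax_prediction(theta, x) for x in X])
    return -np.mean(np.log(probs[np.arange(len(Y)), Y] + 1e-12))

theta = np.zeros((2, 4))                    # 2 classes: vehicle / no vehicle
X, Y = np.random.randn(8, 4), np.random.randint(0, 2, size=8)
print(softmax_loss(theta, X, Y))            # ~log(2) for an untrained classifier
```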
According to the method for adjusting green light passing time based on Faster-RCNN, the certain number of training iterations is 55,000 to 70,000.
In the method for adjusting green light passing time based on Faster-RCNN as described above, the loss function of the RPN network is as follows:
$$L(\{p_i\},\{t_i\})=\frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^{*})+\lambda\,\frac{1}{N_{reg}}\sum_i p_i^{*}\,L_{reg}(t_i,t_i^{*})$$

$$L_{cls}(p_i,p_i^{*})=-\log\bigl[p_i^{*}p_i+(1-p_i^{*})(1-p_i)\bigr],\qquad L_{reg}(t_i,t_i^{*})=R(t_i-t_i^{*})$$

$$R(x)=\begin{cases}0.5\,x^{2},&|x|<1\\ |x|-0.5,&\text{otherwise}\end{cases}$$

where p_i is the probability that the i-th anchor is the detection target, with p_i* = 1 if the anchor is a positive sample and p_i* = 0 otherwise; R(x) is the smooth L1 function with argument x; t_i are the predicted coordinates from the anchor to the bounding box and t_i* are the true coordinates of the anchor; L_cls is the logarithmic loss of target versus non-target; L_reg is the regression loss; N_cls and N_reg normalize the two loss terms; λ is a weight parameter; L is the loss function of the RPN.
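Written directly from the formulas above, a small PyTorch-style sketch of this RPN loss (tensor shapes and the value of λ assumed here for illustration) is:

```python
# Sketch of the RPN loss: log loss over sampled anchors plus smooth-L1
# regression on positive anchors; shapes and lambda are assumptions.
import torch
import torch.nn.functional as F

def rpn_loss(p, p_star, t, t_star, lam=10.0):
    """p: (N,) predicted object probabilities; p_star: (N,) 1.0 for positive
    anchors else 0.0; t, t_star: (N, 4) predicted and ground-truth coordinates."""
    n_reg = max(int(p_star.sum().item()), 1)
    cls_loss = F.binary_cross_entropy(p, p_star)        # (1/N_cls) * sum L_cls
    reg = F.smooth_l1_loss(t, t_star, reduction="none") # R(t_i - t_i*)
    reg_loss = (p_star.unsqueeze(1) * reg).sum() / n_reg
    return cls_loss + lam * reg_loss
```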
In the method for adjusting green light passing time based on Faster-RCNN as described above, the extraction of candidate regions specifically comprises the following steps:
(1.2.1) each pixel in the original picture is recorded as an anchor point, and 9 anchor boxes of different sizes, combining aspect ratios of 1:1, 1:2 and 2:1 with short-side lengths of 1, 2 and 4, are formed on the feature map centered on the corresponding anchor point;
(1.2.2) a fully connected layer is used to judge whether the target in each anchor box (i.e. the area selected by the anchor box) is a vehicle;
(1.2.3) another fully connected layer is used to revise the anchor boxes and generate more accurate proposals, including verifying that there are vehicles in the anchor boxes and adjusting the size of the anchor boxes so that each anchor box contains only one vehicle; the adjusted anchor boxes are taken as the candidate boxes.
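As a numeric illustration of step (1.2.1), the sketch below generates the 9 anchor boxes for one anchor point by combining the three aspect ratios with the three short-side lengths from the text; the unit of the side length (e.g. feature-map cells) is left as an assumption.

```python
# Sketch of anchor generation for one anchor point: 3 aspect ratios x 3
# short-side lengths = 9 anchors, as in step (1.2.1); units are assumed.
import numpy as np

def anchors_at(cx, cy, short_sides=(1, 2, 4),
               aspect_ratios=((1, 1), (1, 2), (2, 1))):
    """Return the 9 anchor boxes (x1, y1, x2, y2) centered on (cx, cy)."""
    boxes = []
    for s in short_sides:
        for rw, rh in aspect_ratios:
            scale = s / min(rw, rh)       # make the shorter side equal to s
            w, h = rw * scale, rh * scale
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(boxes)

print(anchors_at(8, 8).shape)             # (9, 4)
```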
According to the method for adjusting the green light passing time based on the Faster-RCNN, the number of training samples in the training set is 5000-6000.
According to the method for adjusting the green light passing time based on the fast-RCNN, the green light passing time is adjusted according to the number of waiting vehicles, namely the green light passing time is calculated according to the number of waiting vehicles, then the traffic light control program is adjusted, the green light passing time is modified, and the calculation formula is as follows:
t=ts+μ*n;
wherein t is the green light passing time in s; t_s is the vehicle start delay time in s; μ is the time for one vehicle to pass through the intersection in s; and n is the number of waiting vehicles.
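For example, with an assumed start delay t_s and per-vehicle crossing time μ (neither value is fixed by the text), the timing calculation is simply:

```python
# Sketch of the green time t = t_s + mu * n; t_s and mu below are
# illustrative assumptions, not values prescribed by the invention.
def green_light_time(n_waiting, t_s=3.0, mu=2.5):
    """Green light passing time in seconds for n_waiting queued vehicles."""
    return t_s + mu * n_waiting

print(green_light_time(12))   # 3.0 + 2.5 * 12 = 33.0 s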
Has the advantages that:
(1) in the method, the region proposals step of RCNN is replaced by an RPN for generating the candidate boxes, so the third-party tool Selective Search is no longer needed and the generation of redundant boxes is greatly reduced;
(2) the invention uses model pruning to carry out pruning optimization on the trained model, reduces the complexity of the model and improves the applicability of the model.
Drawings
FIG. 1 is a flow chart of the method for adjusting green light passing time based on Faster-RCNN;
FIG. 2 is a schematic diagram of longitudinal edge detection;
fig. 3 is a schematic structural diagram of an RPN network.
Detailed Description
The invention will be further illustrated with reference to specific embodiments. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The method for adjusting the green light passing time based on the fast-RCNN is shown in figure 1 and comprises the following steps:
(1) constructing a waiting vehicle identification model;
(1.1) feature extraction;
the vehicle condition monitoring map is input into a feature extraction network, and the feature extraction network outputs a feature map;
the feature extraction network is a VGG-16 convolutional neural network; conv1 and conv2 of the VGG-16 convolutional neural network learn basic low-level features such as colors and edges, conv3 learns complex texture features such as grid textures, conv4 learns more distinctive features such as parts of a vehicle body, and conv5 learns complete and discriminative key features; the pooling layers use max pooling, and the ReLU function is selected as the activation function of the network;
as shown in fig. 2, taking a simple vertical edge as an example: when the pixel values of the image to be detected are distributed as in the 6 x 6 grid, a Prewitt operator is selected as the filter for detecting vertical edges, and a 4 x 4 pixel map is obtained after the convolution operation. In this 4 x 4 map a very obvious vertical edge can be seen; because the 6 x 6 image used in the example is small, the proportion of the output vertical edge is distorted, but when the image to be detected is larger the edge-detection result is very realistic. The detection of other edge features and of color features is similar, only the weights in the filter change. After the low-level features have been detected, representative complex texture features can be formed by combining different features, which is the work of the following convolutional layers;
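The 6 x 6 example of fig. 2 can be reproduced numerically as below; the pixel values are assumptions chosen so that the left half is bright and the right half dark, and convolving with a 3 x 3 Prewitt vertical-edge operator yields the 4 x 4 response described above.

```python
# Numeric sketch of the fig. 2 vertical-edge example; image values assumed.
import numpy as np
from scipy.signal import convolve2d

image = np.hstack([np.full((6, 3), 10.0), np.zeros((6, 3))])   # 6x6 input
prewitt_vertical = np.array([[1, 0, -1],
                             [1, 0, -1],
                             [1, 0, -1]], dtype=float)
response = convolve2d(image, prewitt_vertical, mode="valid")   # 4x4 output
print(response)   # the non-zero middle columns mark the vertical edge
```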
(1.2) extracting a candidate region;
the feature map is taken as the input of the RPN (whose structure is shown in fig. 3) in the Faster-RCNN network; the RPN generates anchor boxes to judge the feature map, corrects the anchor boxes themselves, and outputs the feature map with candidate boxes of different sizes, where the candidate boxes are the anchor boxes whose candidate areas are judged to be vehicles; the specific steps are as follows:
(1.2.1) each pixel in the original picture is recorded as an anchor point, and 9 anchor boxes of different sizes, combining aspect ratios of 1:1, 1:2 and 2:1 with short-side lengths of 1, 2 and 4, are formed on the feature map centered on the corresponding anchor point;
(1.2.2) a fully connected layer is used to judge whether the target in each anchor box (i.e. the area selected by the anchor box) is a vehicle;
(1.2.3) another fully connected layer is used to revise the anchor boxes and generate more accurate proposals, including verifying that there are vehicles in the anchor boxes and adjusting the size of the anchor boxes so that each anchor box contains only one vehicle; the adjusted anchor boxes are taken as the candidate boxes;
the loss function of the RPN network is as follows:
$$L(\{p_i\},\{t_i\})=\frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^{*})+\lambda\,\frac{1}{N_{reg}}\sum_i p_i^{*}\,L_{reg}(t_i,t_i^{*})$$

$$L_{cls}(p_i,p_i^{*})=-\log\bigl[p_i^{*}p_i+(1-p_i^{*})(1-p_i)\bigr],\qquad L_{reg}(t_i,t_i^{*})=R(t_i-t_i^{*})$$

$$R(x)=\begin{cases}0.5\,x^{2},&|x|<1\\ |x|-0.5,&\text{otherwise}\end{cases}$$

where p_i is the probability that the i-th anchor is the detection target, with p_i* = 1 if the anchor is a positive sample and p_i* = 0 otherwise; R(x) is the smooth L1 function with argument x; t_i are the predicted coordinates from the anchor to the bounding box and t_i* are the true coordinates of the anchor; L_cls is the logarithmic loss of target versus non-target; L_reg is the regression loss; N_cls and N_reg normalize the two loss terms; λ is a weight parameter; L is the loss function of the RPN;
(1.3) mapping;
because the fully connected layers need candidate boxes of the same size, the candidate boxes are mapped onto the feature map obtained in step (1.2) using a Region of Interest Pooling layer (ROI Pooling); the ROI Pooling layer sets a fixed scale, computes the grid size of each sampling, takes the maximum value in each grid, and converts any valid features inside the region of interest into a small feature map with a fixed H x W spatial extent (the small feature map is a sub-map of the feature map; it only needs to be smaller than the original feature map, and the small feature map obtained from each feature-map ROI must have the same size);
(1.4) identifying the vehicle;
a fully connected operation is performed on the small feature maps formed by the region-of-interest pooling layer to obtain a vehicle condition monitoring map with identification boxes (a candidate box and an identification box are not the same concept: a candidate box on the feature map may cover only part of a vehicle, while an identification box corresponds to a whole vehicle); the box regression is completed at the same time as the classification, so the position of each vehicle is obtained, and the waiting vehicle identification model based on Faster-RCNN is formed;
(2) training a waiting vehicle identification model;
(2.1) establishing a training set;
a training set for vehicle detection is obtained and labeled manually; the training samples in the training set (5000 to 6000 in number) are divided into 2 classes: a training sample labeled 'vehicle' contains a car, and a training sample labeled 'no vehicle' contains objects other than vehicles;
(2.2) training a feature extraction network;
labeled training samples are input and, after convolution-pooling-fully-connected operations, all training samples are classified with a Softmax classifier; the classification results are compared with the labels assigned when the training set was established, and the weights and biases of the VGG-16 convolutional neural network are adjusted by back propagation and gradient descent until the loss function is minimal or training reaches a certain number of iterations (55,000 to 70,000); the probabilistic prediction h_θ(x_i) made by the Softmax classifier and the loss function Loss are as follows:
$$h_\theta(x_i)=\frac{1}{\sum_{c=1}^{C}e^{\theta_c^{T}x_i}}\begin{bmatrix}e^{\theta_1^{T}x_i}\\ e^{\theta_2^{T}x_i}\\ \vdots\\ e^{\theta_C^{T}x_i}\end{bmatrix}$$

$$\mathrm{Loss}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C}\mathbf{1}\{Y_i=c\}\,\log\frac{e^{\theta_c^{T}x_i}}{\sum_{j=1}^{C}e^{\theta_j^{T}x_i}}$$

where x_i is the input data, i is the index of the input data, C is the number of classes into which the input data are classified, y_i is the predicted class of the corresponding input x_i, θ_1, θ_2, ..., θ_C are the classification parameters of the Softmax classifier, N is the total number of training samples, and Y_i is the category label of x_i;
(2.3) sharing Fast-RCNN network and RPN network parameters;
(2.3.1) initializing convolutional layer parameters of Fast-RCNN network and RPN network by using VGG-16 convolutional neural network, and recording the initialization parameters including weight and bias as W0;
(2.3.2) training an RPN network by taking W0 as a parameter and taking a feature map as an input to obtain a candidate region;
(2.3.3) training a Fast-RCNN network with the candidate regions as input and the feature map with candidate boxes as the theoretical output, and recording the parameters of the Fast-RCNN network at this point as W1; up to this point the two networks do not share any layer parameters;
(2.3.4) initializing a new RPN network with W1 and setting the learning rate of the feature extraction network parameters shared by the new RPN network and the Fast-RCNN network to 0, i.e. only the parameters specific to the new RPN network are learned and the feature extraction network is fixed, so that the two networks share all the common convolutional layers;
(2.3.5) keeping the shared convolutional layers fixed, adding the network layers specific to the Fast-RCNN network, continuing training, and fine-tuning these Fast-RCNN-specific layers, so that the RPN network and the Fast-RCNN network fully share parameters;
(2.4) carrying out model pruning;
model pruning is a model compression method that introduces sparsity into the dense connections of a deep neural network and reduces the number of non-zero weights by directly setting 'unimportant' weights to zero; model pruning is divided into magnitude-based pruning and channel-based pruning;
the importance of each weight in the feature extraction network and the Faster-RCNN network is calculated using a Hessian matrix, whose expression is:

$$H=\begin{bmatrix}\dfrac{\partial^2 f}{\partial x_1^2}&\cdots&\dfrac{\partial^2 f}{\partial x_1\,\partial x_n}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1}&\cdots&\dfrac{\partial^2 f}{\partial x_n^2}\end{bmatrix}$$

where x_1 to x_n are the weights of the 1st to n-th neurons, f is the loss function of the feature extraction network or the Faster-RCNN network, and n is the number of neurons in the feature extraction network or the Faster-RCNN network;
the weights with the lowest importance are set to zero and steps (2.2)-(2.3) are repeated, which completes the training of the waiting vehicle identification model based on Faster-RCNN;
(3) a vehicle condition monitoring map is collected from the traffic intersection video when the red light countdown is at 20-40 s (the capture time depends on how busy the intersection is, i.e. on the overall traffic flow; for example, the area near a business district is busier than a residential district, and the busier the intersection, the later in the countdown the frame is taken, e.g. at 20 s remaining, versus 40 s remaining when the intersection is not busy); the map is input into the waiting vehicle identification model, which outputs the vehicle condition monitoring map with identification boxes; the number of waiting vehicles is obtained by counting the identification boxes, and the green light passing time is adjusted according to the number of waiting vehicles, i.e. the green light passing time is calculated from the number of waiting vehicles, the traffic light control program is adjusted and the green light passing time is modified, where the calculation formula is as follows:
t=ts+μ*n;
wherein t is the green light passing time in s; t_s is the vehicle start delay time in s; μ is the time for one vehicle to pass through the intersection in s; and n is the number of waiting vehicles.
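Putting the embodiment together, a rough end-to-end sketch of step (3) (capture a monitoring frame during the red-light countdown, count the identification boxes and compute the green time) could look as follows; the camera source, score threshold and the constants t_s and μ are assumptions of this sketch, and `model` stands for the trained waiting vehicle identification model.

```python
# End-to-end sketch of step (3); capture source, threshold and timing
# constants are illustrative assumptions.
import cv2
import torch
import torchvision.transforms.functional as TF

def count_waiting_vehicles(model, frame_bgr, score_threshold=0.7):
    """Number of identification boxes found in one vehicle condition monitoring frame."""
    image = TF.to_tensor(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    model.eval()
    with torch.no_grad():
        detections = model([image])[0]        # torchvision-style output dict
    return int((detections["scores"] > score_threshold).sum())

def adjust_green_time(model, stream_url, t_s=3.0, mu=2.5):
    """Grab one frame (taken while 20-40 s remain on the red light) and
    return the green light passing time t = t_s + mu * n."""
    cap = cv2.VideoCapture(stream_url)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return t_s                            # fall back to the start delay only
    n = count_waiting_vehicles(model, frame)
    return t_s + mu * n
```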
The core of the invention is the Faster-RCNN-based recognition model; adjusting the green light passing time is just one application example. The invention also compares the Faster-RCNN-based recognition model with the prior art (an RCNN-based recognition model and an SPP-NET-based recognition model). In a target detection system whose training set consists of images of people captured by a camera in a natural scene, for a training set of 5000 images the RCNN-based recognition model takes 37.5 h, the SPP-NET-based model takes 4.6 h, while the Faster-RCNN-based model takes only 1.6 h, and its accuracy is very high: experiments on potato bud-eye and face recognition show that the recognition accuracy of the Faster-RCNN-based model reaches 96.32% and 89.0%, improvements of about 4.65% and 3.4% over the prior art.

Claims (7)

1. A method for adjusting green light passing time based on Faster-RCNN, characterized by comprising the following steps:
(1) constructing a waiting vehicle identification model;
(1.1) feature extraction;
inputting the vehicle condition monitoring map into a feature extraction network, and outputting a feature map by the feature extraction network;
(1.2) extracting a candidate region;
taking the feature map as the input of the RPN (Region Proposal Network) in the Faster-RCNN network, the RPN generating anchor boxes to judge the feature map, correcting the anchor boxes themselves, and outputting the feature map with candidate boxes of different sizes, wherein the candidate boxes are the anchor boxes whose candidate areas are judged to be vehicles;
(1.3) mapping;
mapping the candidate frame to the feature map obtained in step (1.2) by using a Region of interest Pooling layer (ROI Pooling), calculating the grid size of each sampling by setting a fixed scale, then sampling at the maximum value, and converting any effective features in the Region of interest into a small feature map with a fixed space range of H multiplied by W;
(1.4) identifying the vehicle;
carrying out a fully connected operation on the small feature maps to obtain a vehicle condition monitoring map with identification boxes, and completing the box regression at the same time as the classification to obtain the position of each vehicle, thereby forming a waiting vehicle identification model based on Faster-RCNN;
(2) training a waiting vehicle identification model;
(2.1) establishing a training set;
obtaining a training set for vehicle detection, adopting a manual label labeling mode, and dividing training samples in the training set into 2 types: the training sample labeled as 'yes vehicle' contains a car, and the training sample labeled as 'no vehicle' contains other types of objects except vehicles;
(2.2) training a feature extraction network;
(2.3) sharing Fast-RCNN network and RPN network parameters;
(2.3.1) initializing convolutional layer parameters of Fast-RCNN network and RPN network by using VGG-16 convolutional neural network, and recording the initialization parameters including weight and bias as W0;
(2.3.2) training an RPN network by taking W0 as a parameter and taking a feature map as an input to obtain a candidate region;
(2.3.3) training a Fast-RCNN network by taking the candidate region as input and taking the feature map with the candidate frame as theoretical output, and marking the parameter of the Fast-RCNN network at the moment as W1, so far, the parameter of each layer of the two networks is not shared at all;
(2.3.4) initializing a new RPN network by using W1, and setting the learning rate of the feature extraction network parameters shared by the new RPN network and the Fast-RCNN network to 0, namely, only learning the parameters specific to the new RPN network, and fixing the feature extraction network, so that the two networks share all the common convolutional layers;
(2.3.5) still fixing the shared convolutional layer, adding the specific network layer of the Fast-RCNN network, continuing training, and finely adjusting the specific network layer of the Fast-RCNN network to ensure that the RPN network and the Fast-RCNN network completely share parameters;
(2.4) carrying out model pruning;
the importance of each weight in the feature extraction network and the Faster-RCNN network is calculated using a Hessian matrix, whose expression is:

$$H=\begin{bmatrix}\dfrac{\partial^2 f}{\partial x_1^2}&\cdots&\dfrac{\partial^2 f}{\partial x_1\,\partial x_n}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1}&\cdots&\dfrac{\partial^2 f}{\partial x_n^2}\end{bmatrix}$$

where x_1 to x_n are the weights of the 1st to n-th neurons, f is the loss function of the feature extraction network or the Faster-RCNN network, and n is the number of neurons in the feature extraction network or the Faster-RCNN network;
setting the weight with the lowest importance to zero, and repeating the steps (2.2) - (2.3), namely finishing the training of the waiting vehicle identification model based on the Faster-RCNN;
(3) collecting a vehicle condition monitoring map when the red light countdown is at 20-40 s in the traffic intersection video, inputting it into the waiting vehicle identification model, which outputs the vehicle condition monitoring map with identification boxes, counting the number of waiting vehicles by counting the identification boxes, and adjusting the green light passing time according to the number of waiting vehicles;
the method comprises the following steps of adjusting the green light passing time according to the number of waiting vehicles, namely calculating the green light passing time according to the number of waiting vehicles, adjusting a traffic light control program, and modifying the green light passing time, wherein the calculation formula is as follows:
t=ts+μ*n;
wherein t is green light passing time and the unit is s; t is tsIs vehicle start delay time in units of s; mu is the time of one vehicle passing through the intersection, and the unit is s; n is the number of waiting vehicles.
2. The method for adjusting green light passing time based on fast-RCNN as claimed in claim 1, wherein the feature extraction network is VGG-16 convolutional neural network, conv1, conv2 in the VGG-16 convolutional neural network learns some basic features, conv3 learns complex texture features, conv4 learns more distinctive features, conv5 learns complete and distinctive key features, the pooling layer adopts maximum pooling, and the activation function in the neural network adopts Relu function.
3. The method for adjusting green light passing time based on Faster-RCNN according to claim 2, wherein the specific process of training the feature extraction network is: labeled training samples are input and, after convolution-pooling-fully-connected operations, all training samples are classified with a Softmax classifier; the classification results are compared with the labels assigned when the training set was established, and the weights and biases of the VGG-16 convolutional neural network are adjusted by back propagation and gradient descent until the loss function is minimal or training reaches a certain number of iterations; the probabilistic prediction h_θ(x_i) made by the Softmax classifier and the loss function Loss are as follows:
$$h_\theta(x_i)=\frac{1}{\sum_{c=1}^{C}e^{\theta_c^{T}x_i}}\begin{bmatrix}e^{\theta_1^{T}x_i}\\ e^{\theta_2^{T}x_i}\\ \vdots\\ e^{\theta_C^{T}x_i}\end{bmatrix}$$

$$\mathrm{Loss}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C}\mathbf{1}\{Y_i=c\}\,\log\frac{e^{\theta_c^{T}x_i}}{\sum_{j=1}^{C}e^{\theta_j^{T}x_i}}$$

where x_i is the input data, i is the index of the input data, C is the number of classes into which the input data are classified, y_i is the predicted class of the corresponding input x_i, θ_1, θ_2, ..., θ_C are the classification parameters of the Softmax classifier, N is the total number of training samples, and Y_i is the category label of x_i.
4. The method for adjusting green light passing time based on Faster-RCNN according to claim 3, wherein the certain number of training iterations is 55,000 to 70,000.
5. The method for adjusting green light passing time based on Faster-RCNN according to claim 1, wherein the loss function of the RPN network is as follows:
$$L(\{p_i\},\{t_i\})=\frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^{*})+\lambda\,\frac{1}{N_{reg}}\sum_i p_i^{*}\,L_{reg}(t_i,t_i^{*})$$

$$L_{cls}(p_i,p_i^{*})=-\log\bigl[p_i^{*}p_i+(1-p_i^{*})(1-p_i)\bigr],\qquad L_{reg}(t_i,t_i^{*})=R(t_i-t_i^{*})$$

$$R(x)=\begin{cases}0.5\,x^{2},&|x|<1\\ |x|-0.5,&\text{otherwise}\end{cases}$$

where p_i is the probability that the i-th anchor is the detection target, with p_i* = 1 if the anchor is a positive sample and p_i* = 0 otherwise; R(x) is the smooth L1 function with argument x; t_i are the predicted coordinates from the anchor to the bounding box and t_i* are the true coordinates of the anchor; L_cls is the logarithmic loss of target versus non-target; L_reg is the regression loss; N_cls and N_reg normalize the two loss terms; λ is a weight parameter; L is the loss function of the RPN.
6. The method for adjusting green light passing time based on Faster-RCNN according to claim 1, wherein the process of extracting the candidate regions is as follows:
(1.2.1) each pixel in the original picture is recorded as an anchor point, and 9 anchor boxes of different sizes, combining aspect ratios of 1:1, 1:2 and 2:1 with short-side lengths of 1, 2 and 4, are formed on the feature map centered on the corresponding anchor point;
(1.2.2) a fully connected layer is used to judge whether the target in each anchor box is a vehicle;
(1.2.3) another fully connected layer is used to revise the anchor boxes and generate more accurate proposals, including verifying that there are vehicles in the anchor boxes and adjusting the size of the anchor boxes so that each anchor box contains only one vehicle; the adjusted anchor boxes are taken as the candidate boxes.
7. The method for regulating green light passing time based on Faster-RCNN as claimed in claim 1, wherein the number of training samples in the training set is 5000-6000.
CN202010361892.3A 2020-04-30 2020-04-30 Method for adjusting green light passing time based on fast-RCNN Active CN111540203B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010361892.3A CN111540203B (en) 2020-04-30 2020-04-30 Method for adjusting green light passing time based on fast-RCNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010361892.3A CN111540203B (en) 2020-04-30 2020-04-30 Method for adjusting green light passing time based on fast-RCNN

Publications (2)

Publication Number Publication Date
CN111540203A CN111540203A (en) 2020-08-14
CN111540203B true CN111540203B (en) 2021-09-17

Family

ID=71970250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010361892.3A Active CN111540203B (en) 2020-04-30 2020-04-30 Method for adjusting green light passing time based on fast-RCNN

Country Status (1)

Country Link
CN (1) CN111540203B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738367B (en) * 2020-08-17 2020-11-13 成都中轨轨道设备有限公司 Part classification method based on image recognition
CN112183461A (en) * 2020-10-21 2021-01-05 广州市晶华精密光学股份有限公司 Vehicle interior monitoring method, device, equipment and storage medium
CN112819510A (en) * 2021-01-21 2021-05-18 江阴逐日信息科技有限公司 Fashion trend prediction method, system and equipment based on clothing multi-attribute recognition

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446150A (en) * 2016-09-21 2017-02-22 北京数字智通科技有限公司 Method and device for precise vehicle retrieval
CN107808126A (en) * 2017-10-11 2018-03-16 苏州科达科技股份有限公司 Vehicle retrieval method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250812B (en) * 2016-07-15 2019-08-20 汤一平 A kind of model recognizing method based on quick R-CNN deep neural network
US10719743B2 (en) * 2018-01-19 2020-07-21 Arcus Holding A/S License plate reader using optical character recognition on plural detected regions
CN110147707B (en) * 2018-10-25 2021-07-20 初速度(苏州)科技有限公司 High-precision vehicle identification method and system
CN110490115B (en) * 2019-08-13 2021-08-13 北京达佳互联信息技术有限公司 Training method and device of face detection model, electronic equipment and storage medium
CN110705544B (en) * 2019-09-05 2023-04-07 中国民航大学 Self-adaptive rapid target detection method based on fast-RCNN

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446150A (en) * 2016-09-21 2017-02-22 北京数字智通科技有限公司 Method and device for precise vehicle retrieval
CN107808126A (en) * 2017-10-11 2018-03-16 苏州科达科技股份有限公司 Vehicle retrieval method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Preprocessed Faster RCNN for Vehicle Detection";Mduduzi Manana;《 2018 International Conference on Intelligent and Innovative Computing Applications (ICONIC)》;20181231;第1-4页 *
"基于改进卷积神经网络的快速车辆检测";朱锋彬;《传感器与微***》;20190112;第153-160页 *

Also Published As

Publication number Publication date
CN111540203A (en) 2020-08-14

Similar Documents

Publication Publication Date Title
CN111444821B (en) Automatic identification method for urban road signs
CN107563372B (en) License plate positioning method based on deep learning SSD frame
CN110363122B (en) Cross-domain target detection method based on multi-layer feature alignment
CN107134144B (en) A kind of vehicle checking method for traffic monitoring
CN108427912B (en) Optical remote sensing image target detection method based on dense target feature learning
Rachmadi et al. Vehicle color recognition using convolutional neural network
CN110909666B (en) Night vehicle detection method based on improved YOLOv3 convolutional neural network
CN111540203B (en) Method for adjusting green light passing time based on fast-RCNN
CN112150821B (en) Lightweight vehicle detection model construction method, system and device
CN111784685A (en) Power transmission line defect image identification method based on cloud edge cooperative detection
CN111079640B (en) Vehicle type identification method and system based on automatic amplification sample
CN110276264B (en) Crowd density estimation method based on foreground segmentation graph
CN107742099A (en) A kind of crowd density estimation based on full convolutional network, the method for demographics
CN110619327A (en) Real-time license plate recognition method based on deep learning in complex scene
CN112464911A (en) Improved YOLOv 3-tiny-based traffic sign detection and identification method
CN109903339B (en) Video group figure positioning detection method based on multi-dimensional fusion features
CN111368660A (en) Single-stage semi-supervised image human body target detection method
CN111738114A (en) Vehicle target detection method based on anchor-free accurate sampling remote sensing image
CN113205107A (en) Vehicle type recognition method based on improved high-efficiency network
CN112084897A (en) Rapid traffic large-scene vehicle target detection method of GS-SSD
CN107247967A (en) A kind of vehicle window annual test mark detection method based on R CNN
CN112132839B (en) Multi-scale rapid face segmentation method based on deep convolution cascade network
CN110136098B (en) Cable sequence detection method based on deep learning
CN111178275A (en) Fire detection method based on convolutional neural network
CN110765900A (en) DSSD-based automatic illegal building detection method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant