CN111540203B - Method for adjusting green light passing time based on Faster-RCNN - Google Patents

Method for adjusting green light passing time based on Faster-RCNN

Info

Publication number
CN111540203B
CN111540203B (application CN202010361892.3A)
Authority
CN
China
Prior art keywords
network
rcnn
fast
training
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010361892.3A
Other languages
Chinese (zh)
Other versions
CN111540203A (en)
Inventor
周武能
廖凯立
黄建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Nuclear Furstate Software Technology Co ltd
Donghua University
Original Assignee
Shanghai Nuclear Furstate Software Technology Co ltd
Donghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Nuclear Furstate Software Technology Co ltd, Donghua University filed Critical Shanghai Nuclear Furstate Software Technology Co ltd
Priority to CN202010361892.3A priority Critical patent/CN111540203B/en
Publication of CN111540203A publication Critical patent/CN111540203A/en
Application granted granted Critical
Publication of CN111540203B publication Critical patent/CN111540203B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G08G1/0104: Traffic control systems for road vehicles; detecting movement of traffic to be counted or controlled; measuring and analysing of parameters relative to traffic conditions
    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/24: Pattern recognition; classification techniques
    • G06N3/045: Neural networks; architecture, e.g. interconnection topology; combinations of networks
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G06V10/25: Image preprocessing; determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/454: Local feature extraction; integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V20/54: Surveillance or monitoring of activities, e.g. for recognising suspicious objects, of traffic, e.g. cars on the road, trains or boats
    • G08G1/07: Controlling traffic signals
    • G06V2201/08: Indexing scheme relating to image or video recognition or understanding; detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention relates to a method for adjusting green light passing time based on Faster-RCNN, which comprises the following steps: (1) constructing a waiting vehicle identification model: features are first extracted to obtain a feature map, candidate regions are then extracted to obtain the feature map with candidate boxes of different sizes, the candidate boxes are mapped to small feature maps of fixed size, and a fully connected operation on the small feature maps finally yields a vehicle condition monitoring map with identification boxes; (2) training the waiting vehicle identification model: a training set is established, the feature extraction network is trained, the Fast-RCNN network and RPN network parameters are shared, and model pruning is performed; (3) a vehicle condition monitoring map is collected from the traffic intersection video when the red light countdown is at 20-40 s and input into the waiting vehicle identification model, which outputs the vehicle condition monitoring map with identification boxes; the number of waiting vehicles is obtained by counting the identification boxes, and the green light passing time is adjusted according to the number of waiting vehicles. The model of the invention has a simple structure, is fast, and realizes adaptive signal timing.

Description

Method for adjusting green light passing time based on Faster-RCNN
Technical Field
The invention belongs to the technical field of multi-target identification, and relates to a method for adjusting green light passing time based on Faster-RCNN.
Background
Existing traffic light timing is fixed, so in some situations the green phase is too short and vehicles cannot all pass, or it is too long and vehicles in other directions wait excessively, which causes congestion. There are currently two kinds of solutions: the first predicts the traffic flow at the intersection and re-allocates timing accordingly; the second performs adaptive allocation according to the number of vehicles currently waiting at the intersection. Compared with the first, adaptive timing based on the vehicles waiting at the current intersection places higher demands on the technology and processing equipment, but it is more effective at relieving traffic congestion.
Vehicles waiting at a red light can be modeled as static vehicle detection. Because traffic light timing is a real-time problem and the number of waiting vehicles must be counted for every red phase, accelerating the algorithm is essential; when the processing speed is fast enough, the requirement of real-time detection can be met.
The static multi-target detection problem can be handled by deep learning. Convolutional Neural Networks (CNN) are generally used for image processing and recognize features in an image by learning them through convolutional layers. However, plain CNN-based detection is slow to train and has difficulty expressing the correlation between data; on this basis, the improved Region-based Convolutional Neural Network (RCNN) can make up for this problem.
In rail damage detection, the traditional RCNN algorithm has the drawback that the amount of computation is too large, because CNN inference is performed once for each region proposal. In cloth flaw detection, the improved Fast-RCNN selects candidate boxes with Selective Search and selects the candidate regions on the feature map, which avoids repeated computation for each candidate region and increases speed; however, it still relies on the third-party tool Selective Search to extract candidate regions, and tests show that Selective Search needs about 2 s on a CPU to extract the candidate regions, which increases the recognition time.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a method for adjusting green light passing time based on Faster-RCNN that is short in time consumption. To address the reduced passing efficiency of fixed traffic light timing, an algorithm that identifies and counts waiting vehicles is used for adaptive timing; to address the long recognition time of the traditional RCNN and the time spent calling third-party software in Fast-RCNN, the invention, combined with the specific application background of identifying waiting vehicles, provides a red-light waiting-vehicle counting method based on the Faster-RCNN algorithm and optimized with model pruning.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
the method for adjusting the green light passing time based on the fast-RCNN comprises the following steps:
(1) constructing a waiting vehicle identification model;
(1.1) feature extraction;
the vehicle condition monitoring map is input into a feature extraction network, and the feature extraction network outputs a feature map;
(1.2) extracting a candidate region;
the feature map is taken as the input of the RPN (Region Proposal Network) in the Faster-RCNN network; the RPN generates anchor boxes to judge the feature map, corrects the anchor boxes themselves, and outputs the feature map with candidate boxes of different sizes, where the candidate boxes are the anchor boxes whose candidate areas are judged to be vehicles;
(1.3) mapping;
because the fully connected layers need candidate boxes of the same size, the candidate boxes are mapped onto the feature map obtained in step (1.2) using a Region of Interest Pooling layer (ROI Pooling); the ROI Pooling layer sets a fixed scale, computes the grid size of each sampling, takes the maximum value in each grid, and converts any valid features inside the region of interest into a small feature map with a fixed H x W spatial extent (the small feature map is a sub-map of the feature map; it only needs to be smaller than the original feature map, and the small feature map obtained from each feature-map ROI must have the same size);
(1.4) identifying the vehicle;
a fully connected operation is performed on the small feature maps formed by the region-of-interest pooling layer to obtain a vehicle condition monitoring map with identification boxes (a candidate box and an identification box are not the same concept: a candidate box on the feature map may cover only part of a vehicle, while an identification box corresponds to a whole vehicle); the box regression is completed at the same time as the classification, so the position of each vehicle is obtained, and the waiting vehicle identification model based on Faster-RCNN is formed;
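For illustration only, the following sketch shows how a detector with the structure of step (1) (a VGG-16 feature extraction network, an RPN, a region-of-interest pooling step and a two-class vehicle/background head) could be assembled with the torchvision detection API; the anchor scales, ROI output size and the use of RoIAlign in place of ROI Pooling are assumptions of this sketch, not prescriptions of the invention.

```python
# Minimal sketch of the waiting vehicle identification model of step (1),
# assuming the torchvision detection API; anchor scales, ROI output size and
# RoIAlign (standing in for the ROI Pooling layer) are illustrative choices.
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

# (1.1) feature extraction: the VGG-16 convolutional layers as backbone
backbone = torchvision.models.vgg16(weights="DEFAULT").features
backbone.out_channels = 512          # VGG-16 conv5 outputs 512 channels

# (1.2) candidate-region extraction: RPN anchors with 3 aspect ratios
anchor_generator = AnchorGenerator(
    sizes=((64, 128, 256),),          # assumed anchor scales in pixels
    aspect_ratios=((1.0, 0.5, 2.0),)) # 1:1, 1:2, 2:1 as in the text

# (1.3) mapping: pool every candidate box to a fixed H x W feature map
roi_pooler = MultiScaleRoIAlign(featmap_names=["0"],
                                output_size=7, sampling_ratio=2)

# (1.4) identification: fully connected head, 2 classes (vehicle/background)
model = FasterRCNN(backbone, num_classes=2,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pooler)
```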
(2) training a waiting vehicle identification model;
(2.1) establishing a training set;
a training set for vehicle detection is obtained and labeled manually; the training samples in the training set are divided into 2 classes: a training sample labeled 'vehicle' contains a car, and a training sample labeled 'no vehicle' contains objects other than vehicles;
(2.2) training a feature extraction network;
(2.3) sharing Fast-RCNN network and RPN network parameters;
(2.3.1) initializing convolutional layer parameters of Fast-RCNN network and RPN network by using VGG-16 convolutional neural network, and recording the initialization parameters including weight and bias as W0;
(2.3.2) training an RPN network by taking W0 as a parameter and taking a feature map as an input to obtain a candidate region;
(2.3.3) training a Fast-RCNN network with the candidate regions as input and the feature map with candidate boxes as the theoretical output, and recording the parameters of the Fast-RCNN network at this point as W1; up to this point the two networks do not share any layer parameters;
(2.3.4) initializing a new RPN network with W1 and setting the learning rate of the feature extraction network parameters shared by the new RPN network and the Fast-RCNN network to 0, i.e. only the parameters specific to the new RPN network are learned and the feature extraction network is fixed, so that the two networks share all the common convolutional layers;
(2.3.5) keeping the shared convolutional layers fixed, adding the network layers specific to the Fast-RCNN network, continuing training, and fine-tuning these Fast-RCNN-specific layers, so that the RPN network and the Fast-RCNN network fully share parameters;
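A minimal sketch of the parameter fixing behind steps (2.3.4) and (2.3.5) is given below: setting the learning rate of the shared feature extraction layers to 0 is equivalent to freezing them and optimizing only the network-specific layers. The attribute names (backbone, rpn, roi_heads) follow the torchvision Faster-RCNN layout assumed in the step (1) sketch and are not part of the invention.

```python
# Sketch of the layer freezing used in steps (2.3.4)-(2.3.5); PyTorch-style,
# with torchvision Faster-RCNN attribute names assumed for illustration.
import torch

def freeze_shared_backbone(model):
    """Fix the shared feature extraction layers (learning rate effectively 0)."""
    for p in model.backbone.parameters():
        p.requires_grad = False

def optimizer_for(module, lr=1e-3):
    """Optimizer over the still-trainable parameters of one sub-network."""
    trainable = [p for p in module.parameters() if p.requires_grad]
    return torch.optim.SGD(trainable, lr=lr, momentum=0.9)

# usage, with `model` a torchvision FasterRCNN such as the step (1) sketch:
#   freeze_shared_backbone(model)
#   rpn_optim  = optimizer_for(model.rpn)        # step (2.3.4): RPN-only layers
#   head_optim = optimizer_for(model.roi_heads)  # step (2.3.5): Fast-RCNN head
```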
(2.4) carrying out model pruning;
model pruning is a model compression method that introduces sparsity into the dense connections of a deep neural network and reduces the number of non-zero weights by directly setting 'unimportant' weights to zero; model pruning is divided into magnitude-based pruning and channel-based pruning;
the importance of each weight in the feature extraction network and the Faster-RCNN network is calculated using a Hessian matrix, whose expression is:

$$H=\begin{bmatrix}\dfrac{\partial^2 f}{\partial x_1^2}&\cdots&\dfrac{\partial^2 f}{\partial x_1\,\partial x_n}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1}&\cdots&\dfrac{\partial^2 f}{\partial x_n^2}\end{bmatrix}$$

where x_1 to x_n are the weights of the 1st to n-th neurons, f is the loss function of the feature extraction network or the Faster-RCNN network, and n is the number of neurons in the feature extraction network or the Faster-RCNN network;
the weights with the lowest importance are set to zero and steps (2.2)-(2.3) are repeated, which completes the training of the waiting vehicle identification model based on Faster-RCNN;
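A sketch of the pruning step is shown below. Because forming the full Hessian over all weights is impractical, the sketch approximates its diagonal with squared gradients (an empirical-Fisher approximation) and uses the usual second-order saliency 0.5 * H_ii * w_i^2; the approximation and the pruning ratio are assumptions of this sketch rather than the exact procedure of the invention.

```python
# Sketch of Hessian-based weight pruning for step (2.4); the Hessian diagonal
# is approximated by squared gradients, and the pruning ratio is assumed.
import torch

def prune_lowest_importance(model, loss, prune_ratio=0.05):
    """Zero the weights whose estimated saliency 0.5*H_ii*w_i^2 is lowest."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    saliencies = [0.5 * (g ** 2) * p.detach() ** 2    # ~ 0.5 * H_ii * w_i^2
                  for p, g in zip(params, grads)]
    threshold = torch.quantile(torch.cat([s.flatten() for s in saliencies]),
                               prune_ratio)
    with torch.no_grad():
        for p, s in zip(params, saliencies):
            p.mul_((s >= threshold).float())          # zero the least important
```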
(3) a vehicle condition monitoring map is collected from the traffic intersection video when the red light countdown is at 20-40 s (the capture time depends on how busy the intersection is, i.e. on the overall traffic flow; for example, the area near a business district is busier than a residential district, and the busier the intersection, the later in the countdown the frame is taken, e.g. at 20 s remaining, versus 40 s remaining when the intersection is not busy); the map is input into the waiting vehicle identification model, which outputs the vehicle condition monitoring map with identification boxes; the number of waiting vehicles is obtained by counting the identification boxes, and the green light passing time is adjusted according to the number of waiting vehicles.
As a preferred technical scheme:
according to the method for adjusting the green light passing time based on the fast-RCNN, the feature extraction network is a VGG-16 convolutional neural network, conv1 and conv2 in the VGG-16 convolutional neural network learn some basic features, such as low-level features including colors and edges, conv3 learns complex texture features, such as grid textures, conv4 learns more distinctive features, such as parts of a vehicle body, conv5 learns complete and distinguishing key features, the pooling layer adopts maximum pooling, and the activation function in the neural network is a Relu function.
According to the method for adjusting green light passing time based on Faster-RCNN, the specific process of training the feature extraction network is as follows: labeled training samples are input and, after convolution-pooling-fully-connected operations, all training samples are classified with a Softmax classifier; the classification results are compared with the labels assigned when the training set was established, and the weights and biases of the VGG-16 convolutional neural network are adjusted by back propagation and gradient descent until the loss function is minimal or training reaches a certain number of iterations; the probabilistic prediction h_θ(x_i) made by the Softmax classifier and the loss function Loss are as follows:
$$h_\theta(x_i)=\frac{1}{\sum_{c=1}^{C}e^{\theta_c^{T}x_i}}\begin{bmatrix}e^{\theta_1^{T}x_i}\\ e^{\theta_2^{T}x_i}\\ \vdots\\ e^{\theta_C^{T}x_i}\end{bmatrix}$$

$$\mathrm{Loss}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C}\mathbf{1}\{Y_i=c\}\,\log\frac{e^{\theta_c^{T}x_i}}{\sum_{j=1}^{C}e^{\theta_j^{T}x_i}}$$

where x_i is the input data, i is the index of the input data, C is the number of classes into which the input data are classified, y_i is the predicted class of the corresponding input x_i, θ_1, θ_2, ..., θ_C are the classification parameters of the Softmax classifier, N is the total number of training samples, and Y_i is the category label of x_i.
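The two formulas above are the standard Softmax regression prediction and cross-entropy loss; a small numeric sketch (with toy shapes and data chosen here purely for illustration) is:

```python
# Numeric sketch of h_theta(x_i) and Loss from the formulas above; the toy
# feature dimension and data are illustrative assumptions.
import numpy as np

def softmax_prediction(theta, x):
    """h_theta(x): class probabilities for one sample x (theta: (C, d), x: (d,))."""
    scores = theta @ x
    scores -= scores.max()                  # numerical stability
    e = np.exp(scores)
    return e / e.sum()

def softmax_loss(theta, X, Y):
    """Mean cross-entropy over N samples X (N, d) with integer labels Y (N,)."""
    probs = np.array([softmax_prediction(theta, x) for x in X])
    return -np.mean(np.log(probs[np.arange(len(Y)), Y] + 1e-12))

theta = np.zeros((2, 4))                    # 2 classes: vehicle / no vehicle
X, Y = np.random.randn(8, 4), np.random.randint(0, 2, size=8)
print(softmax_loss(theta, X, Y))            # ~log(2) for an untrained classifier
```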
According to the method for adjusting green light passing time based on Faster-RCNN, the certain number of training iterations is 55,000 to 70,000.
In the method for adjusting green light passing time based on Faster-RCNN as described above, the loss function of the RPN network is as follows:
$$L(\{p_i\},\{t_i\})=\frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^{*})+\lambda\,\frac{1}{N_{reg}}\sum_i p_i^{*}\,L_{reg}(t_i,t_i^{*})$$

$$L_{cls}(p_i,p_i^{*})=-\log\bigl[p_i^{*}p_i+(1-p_i^{*})(1-p_i)\bigr],\qquad L_{reg}(t_i,t_i^{*})=R(t_i-t_i^{*})$$

$$R(x)=\begin{cases}0.5\,x^{2},&|x|<1\\ |x|-0.5,&\text{otherwise}\end{cases}$$

where p_i is the probability that the i-th anchor is the detection target, with p_i* = 1 if the anchor is a positive sample and p_i* = 0 otherwise; R(x) is the smooth L1 function with argument x; t_i are the predicted coordinates from the anchor to the bounding box and t_i* are the true coordinates of the anchor; L_cls is the logarithmic loss of target versus non-target; L_reg is the regression loss; N_cls and N_reg normalize the two loss terms; λ is a weight parameter; L is the loss function of the RPN.
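Written directly from the formulas above, a small PyTorch-style sketch of this RPN loss (tensor shapes and the value of λ assumed here for illustration) is:

```python
# Sketch of the RPN loss: log loss over sampled anchors plus smooth-L1
# regression on positive anchors; shapes and lambda are assumptions.
import torch
import torch.nn.functional as F

def rpn_loss(p, p_star, t, t_star, lam=10.0):
    """p: (N,) predicted object probabilities; p_star: (N,) 1.0 for positive
    anchors else 0.0; t, t_star: (N, 4) predicted and ground-truth coordinates."""
    n_reg = max(int(p_star.sum().item()), 1)
    cls_loss = F.binary_cross_entropy(p, p_star)        # (1/N_cls) * sum L_cls
    reg = F.smooth_l1_loss(t, t_star, reduction="none") # R(t_i - t_i*)
    reg_loss = (p_star.unsqueeze(1) * reg).sum() / n_reg
    return cls_loss + lam * reg_loss
```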
In the method for adjusting green light passing time based on Faster-RCNN as described above, the extraction of candidate regions specifically comprises the following steps:
(1.2.1) each pixel in the original picture is recorded as an anchor point, and 9 anchor boxes of different sizes, combining aspect ratios of 1:1, 1:2 and 2:1 with short-side lengths of 1, 2 and 4, are formed on the feature map centered on the corresponding anchor point;
(1.2.2) a fully connected layer is used to judge whether the target in each anchor box (i.e. the area selected by the anchor box) is a vehicle;
(1.2.3) another fully connected layer is used to revise the anchor boxes and generate more accurate proposals, including verifying that there are vehicles in the anchor boxes and adjusting the size of the anchor boxes so that each anchor box contains only one vehicle; the adjusted anchor boxes are taken as the candidate boxes.
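As a numeric illustration of step (1.2.1), the sketch below generates the 9 anchor boxes for one anchor point by combining the three aspect ratios with the three short-side lengths from the text; the unit of the side length (e.g. feature-map cells) is left as an assumption.

```python
# Sketch of anchor generation for one anchor point: 3 aspect ratios x 3
# short-side lengths = 9 anchors, as in step (1.2.1); units are assumed.
import numpy as np

def anchors_at(cx, cy, short_sides=(1, 2, 4),
               aspect_ratios=((1, 1), (1, 2), (2, 1))):
    """Return the 9 anchor boxes (x1, y1, x2, y2) centered on (cx, cy)."""
    boxes = []
    for s in short_sides:
        for rw, rh in aspect_ratios:
            scale = s / min(rw, rh)       # make the shorter side equal to s
            w, h = rw * scale, rh * scale
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(boxes)

print(anchors_at(8, 8).shape)             # (9, 4)
```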
According to the method for adjusting the green light passing time based on the Faster-RCNN, the number of training samples in the training set is 5000-6000.
According to the method for adjusting the green light passing time based on the fast-RCNN, the green light passing time is adjusted according to the number of waiting vehicles, namely the green light passing time is calculated according to the number of waiting vehicles, then the traffic light control program is adjusted, the green light passing time is modified, and the calculation formula is as follows:
t=ts+μ*n;
wherein t is the green light passing time in s; t_s is the vehicle start delay time in s; μ is the time for one vehicle to pass through the intersection in s; and n is the number of waiting vehicles.
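For example, with an assumed start delay t_s and per-vehicle crossing time μ (neither value is fixed by the text), the timing calculation is simply:

```python
# Sketch of the green time t = t_s + mu * n; t_s and mu below are
# illustrative assumptions, not values prescribed by the invention.
def green_light_time(n_waiting, t_s=3.0, mu=2.5):
    """Green light passing time in seconds for n_waiting queued vehicles."""
    return t_s + mu * n_waiting

print(green_light_time(12))   # 3.0 + 2.5 * 12 = 33.0 s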
Has the advantages that:
(1) in the method, the region proposals step of RCNN is replaced by an RPN for generating the candidate boxes, so the third-party tool Selective Search is no longer needed and the generation of redundant boxes is greatly reduced;
(2) the invention uses model pruning to carry out pruning optimization on the trained model, reduces the complexity of the model and improves the applicability of the model.
Drawings
FIG. 1 is a flow chart of the method for adjusting green light passing time based on Faster-RCNN;
FIG. 2 is a schematic diagram of longitudinal edge detection;
fig. 3 is a schematic structural diagram of an RPN network.
Detailed Description
The invention will be further illustrated with reference to specific embodiments. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The method for adjusting the green light passing time based on the fast-RCNN is shown in figure 1 and comprises the following steps:
(1) constructing a waiting vehicle identification model;
(1.1) feature extraction;
the vehicle condition monitoring map is input into a feature extraction network, and the feature extraction network outputs a feature map;
the feature extraction network is a VGG-16 convolutional neural network; conv1 and conv2 of the VGG-16 convolutional neural network learn basic low-level features such as colors and edges, conv3 learns complex texture features such as grid textures, conv4 learns more distinctive features such as parts of a vehicle body, and conv5 learns complete and discriminative key features; the pooling layers use max pooling, and the ReLU function is selected as the activation function of the network;
as shown in fig. 2, taking a simple vertical edge as an example: when the pixel values of the image to be detected are distributed as in the 6 x 6 grid, a Prewitt operator is selected as the filter for detecting vertical edges, and a 4 x 4 pixel map is obtained after the convolution operation. In this 4 x 4 map a very obvious vertical edge can be seen; because the 6 x 6 image used in the example is small, the proportion of the output vertical edge is distorted, but when the image to be detected is larger the edge-detection result is very realistic. The detection of other edge features and of color features is similar, only the weights in the filter change. After the low-level features have been detected, representative complex texture features can be formed by combining different features, which is the work of the following convolutional layers;
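The 6 x 6 example of fig. 2 can be reproduced numerically as below; the pixel values are assumptions chosen so that the left half is bright and the right half dark, and convolving with a 3 x 3 Prewitt vertical-edge operator yields the 4 x 4 response described above.

```python
# Numeric sketch of the fig. 2 vertical-edge example; image values assumed.
import numpy as np
from scipy.signal import convolve2d

image = np.hstack([np.full((6, 3), 10.0), np.zeros((6, 3))])   # 6x6 input
prewitt_vertical = np.array([[1, 0, -1],
                             [1, 0, -1],
                             [1, 0, -1]], dtype=float)
response = convolve2d(image, prewitt_vertical, mode="valid")   # 4x4 output
print(response)   # the non-zero middle columns mark the vertical edge
```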
(1.2) extracting a candidate region;
the feature map is taken as the input of the RPN (whose structure is shown in fig. 3) in the Faster-RCNN network; the RPN generates anchor boxes to judge the feature map, corrects the anchor boxes themselves, and outputs the feature map with candidate boxes of different sizes, where the candidate boxes are the anchor boxes whose candidate areas are judged to be vehicles; the specific steps are as follows:
(1.2.1) each pixel in the original picture is recorded as an anchor point, and 9 anchor boxes of different sizes, combining aspect ratios of 1:1, 1:2 and 2:1 with short-side lengths of 1, 2 and 4, are formed on the feature map centered on the corresponding anchor point;
(1.2.2) a fully connected layer is used to judge whether the target in each anchor box (i.e. the area selected by the anchor box) is a vehicle;
(1.2.3) another fully connected layer is used to revise the anchor boxes and generate more accurate proposals, including verifying that there are vehicles in the anchor boxes and adjusting the size of the anchor boxes so that each anchor box contains only one vehicle; the adjusted anchor boxes are taken as the candidate boxes;
the loss function of the RPN network is as follows:
$$L(\{p_i\},\{t_i\})=\frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^{*})+\lambda\,\frac{1}{N_{reg}}\sum_i p_i^{*}\,L_{reg}(t_i,t_i^{*})$$

$$L_{cls}(p_i,p_i^{*})=-\log\bigl[p_i^{*}p_i+(1-p_i^{*})(1-p_i)\bigr],\qquad L_{reg}(t_i,t_i^{*})=R(t_i-t_i^{*})$$

$$R(x)=\begin{cases}0.5\,x^{2},&|x|<1\\ |x|-0.5,&\text{otherwise}\end{cases}$$

where p_i is the probability that the i-th anchor is the detection target, with p_i* = 1 if the anchor is a positive sample and p_i* = 0 otherwise; R(x) is the smooth L1 function with argument x; t_i are the predicted coordinates from the anchor to the bounding box and t_i* are the true coordinates of the anchor; L_cls is the logarithmic loss of target versus non-target; L_reg is the regression loss; N_cls and N_reg normalize the two loss terms; λ is a weight parameter; L is the loss function of the RPN;
(1.3) mapping;
because the fully connected layers need candidate boxes of the same size, the candidate boxes are mapped onto the feature map obtained in step (1.2) using a Region of Interest Pooling layer (ROI Pooling); the ROI Pooling layer sets a fixed scale, computes the grid size of each sampling, takes the maximum value in each grid, and converts any valid features inside the region of interest into a small feature map with a fixed H x W spatial extent (the small feature map is a sub-map of the feature map; it only needs to be smaller than the original feature map, and the small feature map obtained from each feature-map ROI must have the same size);
(1.4) identifying the vehicle;
a fully connected operation is performed on the small feature maps formed by the region-of-interest pooling layer to obtain a vehicle condition monitoring map with identification boxes (a candidate box and an identification box are not the same concept: a candidate box on the feature map may cover only part of a vehicle, while an identification box corresponds to a whole vehicle); the box regression is completed at the same time as the classification, so the position of each vehicle is obtained, and the waiting vehicle identification model based on Faster-RCNN is formed;
(2) training a waiting vehicle identification model;
(2.1) establishing a training set;
a training set for vehicle detection is obtained and labeled manually; the training samples in the training set (5000 to 6000 in number) are divided into 2 classes: a training sample labeled 'vehicle' contains a car, and a training sample labeled 'no vehicle' contains objects other than vehicles;
(2.2) training a feature extraction network;
labeled training samples are input and, after convolution-pooling-fully-connected operations, all training samples are classified with a Softmax classifier; the classification results are compared with the labels assigned when the training set was established, and the weights and biases of the VGG-16 convolutional neural network are adjusted by back propagation and gradient descent until the loss function is minimal or training reaches a certain number of iterations (55,000 to 70,000); the probabilistic prediction h_θ(x_i) made by the Softmax classifier and the loss function Loss are as follows:
$$h_\theta(x_i)=\frac{1}{\sum_{c=1}^{C}e^{\theta_c^{T}x_i}}\begin{bmatrix}e^{\theta_1^{T}x_i}\\ e^{\theta_2^{T}x_i}\\ \vdots\\ e^{\theta_C^{T}x_i}\end{bmatrix}$$

$$\mathrm{Loss}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C}\mathbf{1}\{Y_i=c\}\,\log\frac{e^{\theta_c^{T}x_i}}{\sum_{j=1}^{C}e^{\theta_j^{T}x_i}}$$

where x_i is the input data, i is the index of the input data, C is the number of classes into which the input data are classified, y_i is the predicted class of the corresponding input x_i, θ_1, θ_2, ..., θ_C are the classification parameters of the Softmax classifier, N is the total number of training samples, and Y_i is the category label of x_i;
(2.3) sharing Fast-RCNN network and RPN network parameters;
(2.3.1) initializing convolutional layer parameters of Fast-RCNN network and RPN network by using VGG-16 convolutional neural network, and recording the initialization parameters including weight and bias as W0;
(2.3.2) training an RPN network by taking W0 as a parameter and taking a feature map as an input to obtain a candidate region;
(2.3.3) training a Fast-RCNN network with the candidate regions as input and the feature map with candidate boxes as the theoretical output, and recording the parameters of the Fast-RCNN network at this point as W1; up to this point the two networks do not share any layer parameters;
(2.3.4) initializing a new RPN network with W1 and setting the learning rate of the feature extraction network parameters shared by the new RPN network and the Fast-RCNN network to 0, i.e. only the parameters specific to the new RPN network are learned and the feature extraction network is fixed, so that the two networks share all the common convolutional layers;
(2.3.5) keeping the shared convolutional layers fixed, adding the network layers specific to the Fast-RCNN network, continuing training, and fine-tuning these Fast-RCNN-specific layers, so that the RPN network and the Fast-RCNN network fully share parameters;
(2.4) carrying out model pruning;
model pruning is a model compression method that introduces sparsity into the dense connections of a deep neural network and reduces the number of non-zero weights by directly setting 'unimportant' weights to zero; model pruning is divided into magnitude-based pruning and channel-based pruning;
the importance of each weight in the feature extraction network and the Faster-RCNN network is calculated using a Hessian matrix, whose expression is:

$$H=\begin{bmatrix}\dfrac{\partial^2 f}{\partial x_1^2}&\cdots&\dfrac{\partial^2 f}{\partial x_1\,\partial x_n}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1}&\cdots&\dfrac{\partial^2 f}{\partial x_n^2}\end{bmatrix}$$

where x_1 to x_n are the weights of the 1st to n-th neurons, f is the loss function of the feature extraction network or the Faster-RCNN network, and n is the number of neurons in the feature extraction network or the Faster-RCNN network;
the weights with the lowest importance are set to zero and steps (2.2)-(2.3) are repeated, which completes the training of the waiting vehicle identification model based on Faster-RCNN;
(3) a vehicle condition monitoring map is collected from the traffic intersection video when the red light countdown is at 20-40 s (the capture time depends on how busy the intersection is, i.e. on the overall traffic flow; for example, the area near a business district is busier than a residential district, and the busier the intersection, the later in the countdown the frame is taken, e.g. at 20 s remaining, versus 40 s remaining when the intersection is not busy); the map is input into the waiting vehicle identification model, which outputs the vehicle condition monitoring map with identification boxes; the number of waiting vehicles is obtained by counting the identification boxes, and the green light passing time is adjusted according to the number of waiting vehicles, i.e. the green light passing time is calculated from the number of waiting vehicles, the traffic light control program is adjusted and the green light passing time is modified, where the calculation formula is as follows:
t=ts+μ*n;
wherein t is the green light passing time in s; t_s is the vehicle start delay time in s; μ is the time for one vehicle to pass through the intersection in s; and n is the number of waiting vehicles.
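Putting the embodiment together, a rough end-to-end sketch of step (3) (capture a monitoring frame during the red-light countdown, count the identification boxes and compute the green time) could look as follows; the camera source, score threshold and the constants t_s and μ are assumptions of this sketch, and `model` stands for the trained waiting vehicle identification model.

```python
# End-to-end sketch of step (3); capture source, threshold and timing
# constants are illustrative assumptions.
import cv2
import torch
import torchvision.transforms.functional as TF

def count_waiting_vehicles(model, frame_bgr, score_threshold=0.7):
    """Number of identification boxes found in one vehicle condition monitoring frame."""
    image = TF.to_tensor(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    model.eval()
    with torch.no_grad():
        detections = model([image])[0]        # torchvision-style output dict
    return int((detections["scores"] > score_threshold).sum())

def adjust_green_time(model, stream_url, t_s=3.0, mu=2.5):
    """Grab one frame (taken while 20-40 s remain on the red light) and
    return the green light passing time t = t_s + mu * n."""
    cap = cv2.VideoCapture(stream_url)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return t_s                            # fall back to the start delay only
    n = count_waiting_vehicles(model, frame)
    return t_s + mu * n
```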
The core of the invention is the Faster-RCNN-based recognition model; adjusting the green light passing time is just one application example. The invention also compares the Faster-RCNN-based recognition model with the prior art (an RCNN-based recognition model and an SPP-NET-based recognition model). In a target detection system whose training set consists of images of people captured by a camera in a natural scene, for a training set of 5000 images the RCNN-based recognition model takes 37.5 h, the SPP-NET-based model takes 4.6 h, while the Faster-RCNN-based model takes only 1.6 h, and its accuracy is very high: experiments on potato bud-eye and face recognition show that the recognition accuracy of the Faster-RCNN-based model reaches 96.32% and 89.0%, improvements of about 4.65% and 3.4% over the prior art.

Claims (7)

1. A method for adjusting green light passing time based on Faster-RCNN, characterized by comprising the following steps:
(1) constructing a waiting vehicle identification model;
(1.1) feature extraction;
inputting the vehicle condition monitoring map into a feature extraction network, and outputting a feature map by the feature extraction network;
(1.2) extracting a candidate region;
taking the feature map as the input of the RPN (Region Proposal Network) in the Faster-RCNN network, the RPN generating anchor boxes to judge the feature map, correcting the anchor boxes themselves, and outputting the feature map with candidate boxes of different sizes, wherein the candidate boxes are the anchor boxes whose candidate areas are judged to be vehicles;
(1.3) mapping;
mapping the candidate frame to the feature map obtained in step (1.2) by using a Region of interest Pooling layer (ROI Pooling), calculating the grid size of each sampling by setting a fixed scale, then sampling at the maximum value, and converting any effective features in the Region of interest into a small feature map with a fixed space range of H multiplied by W;
(1.4) identifying the vehicle;
carrying out a fully connected operation on the small feature maps to obtain a vehicle condition monitoring map with identification boxes, and completing the box regression at the same time as the classification to obtain the position of each vehicle, thereby forming a waiting vehicle identification model based on Faster-RCNN;
(2) training a waiting vehicle identification model;
(2.1) establishing a training set;
obtaining a training set for vehicle detection, adopting a manual label labeling mode, and dividing training samples in the training set into 2 types: the training sample labeled as 'yes vehicle' contains a car, and the training sample labeled as 'no vehicle' contains other types of objects except vehicles;
(2.2) training a feature extraction network;
(2.3) sharing Fast-RCNN network and RPN network parameters;
(2.3.1) initializing convolutional layer parameters of Fast-RCNN network and RPN network by using VGG-16 convolutional neural network, and recording the initialization parameters including weight and bias as W0;
(2.3.2) training an RPN network by taking W0 as a parameter and taking a feature map as an input to obtain a candidate region;
(2.3.3) training a Fast-RCNN network by taking the candidate region as input and taking the feature map with the candidate frame as theoretical output, and marking the parameter of the Fast-RCNN network at the moment as W1, so far, the parameter of each layer of the two networks is not shared at all;
(2.3.4) initializing a new RPN network by using W1, and setting the learning rate of the feature extraction network parameters shared by the new RPN network and the Fast-RCNN network to 0, namely, only learning the parameters specific to the new RPN network, and fixing the feature extraction network, so that the two networks share all the common convolutional layers;
(2.3.5) still fixing the shared convolutional layer, adding the specific network layer of the Fast-RCNN network, continuing training, and finely adjusting the specific network layer of the Fast-RCNN network to ensure that the RPN network and the Fast-RCNN network completely share parameters;
(2.4) carrying out model pruning;
the importance of each weight in the feature extraction network and the Faster-RCNN network is calculated using a Hessian matrix, whose expression is:

$$H=\begin{bmatrix}\dfrac{\partial^2 f}{\partial x_1^2}&\cdots&\dfrac{\partial^2 f}{\partial x_1\,\partial x_n}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1}&\cdots&\dfrac{\partial^2 f}{\partial x_n^2}\end{bmatrix}$$

where x_1 to x_n are the weights of the 1st to n-th neurons, f is the loss function of the feature extraction network or the Faster-RCNN network, and n is the number of neurons in the feature extraction network or the Faster-RCNN network;
setting the weight with the lowest importance to zero, and repeating the steps (2.2) - (2.3), namely finishing the training of the waiting vehicle identification model based on the Faster-RCNN;
(3) collecting a vehicle condition monitoring map when the red light countdown is at 20-40 s in the traffic intersection video, inputting it into the waiting vehicle identification model, which outputs the vehicle condition monitoring map with identification boxes, counting the number of waiting vehicles by counting the identification boxes, and adjusting the green light passing time according to the number of waiting vehicles;
the method comprises the following steps of adjusting the green light passing time according to the number of waiting vehicles, namely calculating the green light passing time according to the number of waiting vehicles, adjusting a traffic light control program, and modifying the green light passing time, wherein the calculation formula is as follows:
t=ts+μ*n;
wherein t is green light passing time and the unit is s; t is tsIs vehicle start delay time in units of s; mu is the time of one vehicle passing through the intersection, and the unit is s; n is the number of waiting vehicles.
2. The method for adjusting green light passing time based on fast-RCNN as claimed in claim 1, wherein the feature extraction network is VGG-16 convolutional neural network, conv1, conv2 in the VGG-16 convolutional neural network learns some basic features, conv3 learns complex texture features, conv4 learns more distinctive features, conv5 learns complete and distinctive key features, the pooling layer adopts maximum pooling, and the activation function in the neural network adopts Relu function.
3. The method for adjusting green light passing time based on Faster-RCNN according to claim 2, wherein the specific process of training the feature extraction network is: labeled training samples are input and, after convolution-pooling-fully-connected operations, all training samples are classified with a Softmax classifier; the classification results are compared with the labels assigned when the training set was established, and the weights and biases of the VGG-16 convolutional neural network are adjusted by back propagation and gradient descent until the loss function is minimal or training reaches a certain number of iterations; the probabilistic prediction h_θ(x_i) made by the Softmax classifier and the loss function Loss are as follows:
$$h_\theta(x_i)=\frac{1}{\sum_{c=1}^{C}e^{\theta_c^{T}x_i}}\begin{bmatrix}e^{\theta_1^{T}x_i}\\ e^{\theta_2^{T}x_i}\\ \vdots\\ e^{\theta_C^{T}x_i}\end{bmatrix}$$

$$\mathrm{Loss}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C}\mathbf{1}\{Y_i=c\}\,\log\frac{e^{\theta_c^{T}x_i}}{\sum_{j=1}^{C}e^{\theta_j^{T}x_i}}$$

where x_i is the input data, i is the index of the input data, C is the number of classes into which the input data are classified, y_i is the predicted class of the corresponding input x_i, θ_1, θ_2, ..., θ_C are the classification parameters of the Softmax classifier, N is the total number of training samples, and Y_i is the category label of x_i.
4. The method for adjusting green light passing time based on Faster-RCNN according to claim 3, wherein the certain number of training iterations is 55,000 to 70,000.
5. The method for adjusting green light passing time based on Faster-RCNN according to claim 1, wherein the loss function of the RPN network is as follows:
$$L(\{p_i\},\{t_i\})=\frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^{*})+\lambda\,\frac{1}{N_{reg}}\sum_i p_i^{*}\,L_{reg}(t_i,t_i^{*})$$

$$L_{cls}(p_i,p_i^{*})=-\log\bigl[p_i^{*}p_i+(1-p_i^{*})(1-p_i)\bigr],\qquad L_{reg}(t_i,t_i^{*})=R(t_i-t_i^{*})$$

$$R(x)=\begin{cases}0.5\,x^{2},&|x|<1\\ |x|-0.5,&\text{otherwise}\end{cases}$$

where p_i is the probability that the i-th anchor is the detection target, with p_i* = 1 if the anchor is a positive sample and p_i* = 0 otherwise; R(x) is the smooth L1 function with argument x; t_i are the predicted coordinates from the anchor to the bounding box and t_i* are the true coordinates of the anchor; L_cls is the logarithmic loss of target versus non-target; L_reg is the regression loss; N_cls and N_reg normalize the two loss terms; λ is a weight parameter; L is the loss function of the RPN.
6. The method for adjusting green light passing time based on Faster-RCNN according to claim 1, wherein the process of extracting the candidate regions is as follows:
(1.2.1) each pixel in the original picture is recorded as an anchor point, and 9 anchor boxes of different sizes, combining aspect ratios of 1:1, 1:2 and 2:1 with short-side lengths of 1, 2 and 4, are formed on the feature map centered on the corresponding anchor point;
(1.2.2) a fully connected layer is used to judge whether the target in each anchor box is a vehicle;
(1.2.3) another fully connected layer is used to revise the anchor boxes and generate more accurate proposals, including verifying that there are vehicles in the anchor boxes and adjusting the size of the anchor boxes so that each anchor box contains only one vehicle; the adjusted anchor boxes are taken as the candidate boxes.
7. The method for regulating green light passing time based on Faster-RCNN as claimed in claim 1, wherein the number of training samples in the training set is 5000-6000.
CN202010361892.3A 2020-04-30 2020-04-30 Method for adjusting green light passing time based on fast-RCNN Active CN111540203B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010361892.3A CN111540203B (en) 2020-04-30 2020-04-30 Method for adjusting green light passing time based on fast-RCNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010361892.3A CN111540203B (en) 2020-04-30 2020-04-30 Method for adjusting green light passing time based on fast-RCNN

Publications (2)

Publication Number Publication Date
CN111540203A CN111540203A (en) 2020-08-14
CN111540203B true CN111540203B (en) 2021-09-17

Family

ID=71970250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010361892.3A Active CN111540203B (en) 2020-04-30 2020-04-30 Method for adjusting green light passing time based on fast-RCNN

Country Status (1)

Country Link
CN (1) CN111540203B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738367B (en) * 2020-08-17 2020-11-13 成都中轨轨道设备有限公司 Part classification method based on image recognition
CN112183461A (en) * 2020-10-21 2021-01-05 广州市晶华精密光学股份有限公司 Vehicle interior monitoring method, device, equipment and storage medium
CN112819510A (en) * 2021-01-21 2021-05-18 江阴逐日信息科技有限公司 Fashion trend prediction method, system and equipment based on clothing multi-attribute recognition

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446150A (en) * 2016-09-21 2017-02-22 北京数字智通科技有限公司 Method and device for precise vehicle retrieval
CN107808126A (en) * 2017-10-11 2018-03-16 苏州科达科技股份有限公司 Vehicle retrieval method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250812B (en) * 2016-07-15 2019-08-20 汤一平 A kind of model recognizing method based on quick R-CNN deep neural network
US10719743B2 (en) * 2018-01-19 2020-07-21 Arcus Holding A/S License plate reader using optical character recognition on plural detected regions
CN110147707B (en) * 2018-10-25 2021-07-20 初速度(苏州)科技有限公司 High-precision vehicle identification method and system
CN110490115B (en) * 2019-08-13 2021-08-13 北京达佳互联信息技术有限公司 Training method and device of face detection model, electronic equipment and storage medium
CN110705544B (en) * 2019-09-05 2023-04-07 中国民航大学 Self-adaptive rapid target detection method based on fast-RCNN

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446150A (en) * 2016-09-21 2017-02-22 北京数字智通科技有限公司 Method and device for precise vehicle retrieval
CN107808126A (en) * 2017-10-11 2018-03-16 苏州科达科技股份有限公司 Vehicle retrieval method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Preprocessed Faster RCNN for Vehicle Detection";Mduduzi Manana;《 2018 International Conference on Intelligent and Innovative Computing Applications (ICONIC)》;20181231;第1-4页 *
"基于改进卷积神经网络的快速车辆检测";朱锋彬;《传感器与微***》;20190112;第153-160页 *

Also Published As

Publication number Publication date
CN111540203A (en) 2020-08-14

Similar Documents

Publication Publication Date Title
CN111444821B (en) Automatic identification method for urban road signs
CN107563372B (en) License plate positioning method based on deep learning SSD frame
CN110363122B (en) Cross-domain target detection method based on multi-layer feature alignment
CN107134144B (en) A kind of vehicle checking method for traffic monitoring
CN108427912B (en) Optical remote sensing image target detection method based on dense target feature learning
Rachmadi et al. Vehicle color recognition using convolutional neural network
CN110909666B (en) Night vehicle detection method based on improved YOLOv3 convolutional neural network
CN111540203B (en) Method for adjusting green light passing time based on fast-RCNN
CN112150821B (en) Lightweight vehicle detection model construction method, system and device
CN111784685A (en) Power transmission line defect image identification method based on cloud edge cooperative detection
CN111079640B (en) Vehicle type identification method and system based on automatic amplification sample
CN110276264B (en) Crowd density estimation method based on foreground segmentation graph
CN107742099A (en) A kind of crowd density estimation based on full convolutional network, the method for demographics
CN110619327A (en) Real-time license plate recognition method based on deep learning in complex scene
CN112464911A (en) Improved YOLOv 3-tiny-based traffic sign detection and identification method
CN109903339B (en) Video group figure positioning detection method based on multi-dimensional fusion features
CN111368660A (en) Single-stage semi-supervised image human body target detection method
CN111738114A (en) Vehicle target detection method based on anchor-free accurate sampling remote sensing image
CN113205107A (en) Vehicle type recognition method based on improved high-efficiency network
CN112084897A (en) Rapid traffic large-scene vehicle target detection method of GS-SSD
CN107247967A (en) A kind of vehicle window annual test mark detection method based on R CNN
CN112132839B (en) Multi-scale rapid face segmentation method based on deep convolution cascade network
CN110136098B (en) Cable sequence detection method based on deep learning
CN111178275A (en) Fire detection method based on convolutional neural network
CN110765900A (en) DSSD-based automatic illegal building detection method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant