CN111860112A - Vehicle type recognition method and device - Google Patents

Vehicle type recognition method and device

Info

Publication number
CN111860112A
Authority
CN
China
Prior art keywords
vehicle
vehicle type
value
model
level features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010485652.4A
Other languages
Chinese (zh)
Inventor
文国坤
Current Assignee
HUADI COMPUTER GROUP CO Ltd
Original Assignee
HUADI COMPUTER GROUP CO Ltd
Priority date
Filing date
Publication date
Application filed by HUADI COMPUTER GROUP CO Ltd filed Critical HUADI COMPUTER GROUP CO Ltd
Priority to CN202010485652.4A priority Critical patent/CN111860112A/en
Publication of CN111860112A publication Critical patent/CN111860112A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a vehicle type identification method. A vehicle type identification model based on a convolutional neural network is constructed from classifiers for different hierarchical features of a vehicle type and a plurality of convolutional layers including common convolutional layers; vehicle images in a video stream are identified, and the vehicle type corresponding to each vehicle in the video stream is output. On the premise of ensuring accuracy, the method rapidly identifies vehicle types in traffic video, solving the problems of low vehicle type recognition rate and poor real-time performance in the prior art.

Description

Vehicle type recognition method and device
Technical Field
The application relates to the field of intelligent transportation, in particular to a vehicle type identification method and a vehicle type identification device.
Background
With the development of computer hardware technology and computer vision technology, Intelligent Transportation Systems (ITS) have also developed rapidly. Automatic traffic incident detection systems are an important part of intelligent traffic systems and are receiving more and more attention from researchers. In a good automatic traffic incident detection system, vehicle detection is critical. There are many traditional vehicle detection methods, but their robustness is insufficient, and vehicle detection based on traffic video is gradually replacing the traditional methods thanks to its large detection range, small engineering workload, simple installation, low cost and rich information. The identification of vehicles from video involves two main technologies: detection and localization of the vehicle, and identification of the vehicle type.
Detection and localization of the vehicle is the basis for vehicle identification. There are two main methods for detecting and localizing a vehicle: one models the background with a Gaussian model and subtracts the background from each video frame to obtain the foreground moving vehicles; the other assumes the traffic video collector is fixed, in which case moving targets can be obtained with a frame difference method.
At present, vehicle type recognition and classification mainly takes two forms: (1) features of the image are extracted with a feature extraction algorithm and then classified with a traditional machine learning classifier. For example: extracting Haar-like features of the vehicle image and then using AdaBoost for feature selection and classification; extracting features such as tail lamps and vehicle size and classifying vehicle types with a Hybrid Dynamic Bayesian Network (HDBN); using vehicle edge information as features and classifying with AdaBoost; or extracting the background through Gaussian background modeling to obtain the foreground vehicles, then extracting features such as aspect ratio and width, and classifying with a Support Vector Machine (SVM). (2) End-to-end learning with deep learning, where the model has automatic feature extraction capability. For example: end-to-end learning with a neural network; or identifying vehicles and pedestrians with a Deep Belief Network (DBN). However, the recognition accuracy of the first approach is generally lower than that of the second, while the second has high time complexity and poor real-time performance.
Disclosure of Invention
The application provides a vehicle type identification method, which solves the problems of low vehicle type identification rate and poor real-time performance in the prior art.
The application provides a vehicle type identification method, which comprises the following steps:
extracting a vehicle image to be identified in the video stream by using a frame difference method;
constructing a vehicle type recognition model based on a convolutional neural network; the model comprises classifiers for different hierarchical features of a vehicle type and a plurality of convolution layers including a common convolution layer;
the vehicle type recognition model is used for recognizing the vehicle type of the vehicle image to be recognized, and the classifier is used for classifying the low-level features of the vehicle image to be recognized in the public convolution layer and the high-level features of the vehicle image to be recognized in all the convolution layers;
comparing the absolute value of the difference value of the two maximum probability values output in the low-level features with a preset threshold, and if the absolute value is greater than the preset threshold, outputting a vehicle type corresponding to the maximum probability value by the model;
and if the absolute value is smaller than the preset threshold value, different weight coefficients are respectively given to the probability values output by the low-level features and the high-level features, weighted calculation is carried out, and the model outputs the vehicle type corresponding to the maximum probability value in the weighted calculation result.
Preferably, the extracting the vehicle image to be identified in the video stream by using the frame difference method includes:
extracting a minimum circumscribed rectangle image containing a vehicle from a video stream by using a frame difference method;
and taking the minimum circumscribed rectangle image as a vehicle image to be identified.
Preferably, the extracting the minimum bounding rectangle image containing the vehicle in the video stream by using the frame difference method includes:
calculating, by means of an edge detection operator, two edge images E_k and E_{k-1} corresponding to two consecutive images F_k and F_{k-1} in the video stream;
differencing the two edge images E_k and E_{k-1} to obtain an edge difference image D_k;
dividing the edge difference image D_k into a plurality of blocks and marking the non-zero blocks as S_k;
binarizing S_k according to a preset value to obtain a matrix M corresponding to the edge difference image D_k;
and connecting the non-zero blocks in the matrix M and deleting regions that do not conform to the pixel-block size of a vehicle image, to obtain the minimum circumscribed rectangle image containing the vehicle.
Preferably, after the step of constructing the vehicle type recognition model based on the convolutional neural network, the method further comprises the following steps:
training the plurality of convolutional layers including the common convolutional layers and the classifier of high-level features included in the model by forward propagation and back propagation over a training data set;
passing the training data set through the common convolutional layers to obtain a feature data set, training the classifier of low-level features included in the model using the feature data set, and fine-tuning the common convolutional layer parameters during back propagation.
Preferably, the method further comprises the following steps:
testing the trained vehicle type recognition model based on the convolutional neural network by using a test data set;
and if the accuracy of the test result is greater than a preset threshold value, the model test is passed.
Preferably, the classifier is configured to classify low-level features of the vehicle images to be identified in the common convolutional layer and high-level features of the vehicle images to be identified in all convolutional layers, and includes:
the model comprises classifiers for low-level features and high-level features of a vehicle model;
the low-level feature classifier classifies low-level features of the vehicle image to be recognized in the model public convolution layer;
the high-level feature classifier classifies high-level features of the vehicle image to be recognized in a plurality of convolutional layers of the model including the common convolutional layer.
Preferably, comparing the absolute value of the difference between the two maximum probability values output in the low-level features with a preset threshold, and if the absolute value is greater than the preset threshold, outputting the vehicle type corresponding to the maximum probability value by the model, including:
Comparing the absolute value of the difference value of the two maximum probability values output in the low-level features with a preset threshold value;
and if the absolute value is greater than a preset threshold value, determining the vehicle type of the vehicle image to be identified as the vehicle type corresponding to the maximum probability value.
Preferably, if the difference is smaller than a preset threshold, different weight coefficients are respectively assigned to the probability values output by the low-level features and the high-level features, and the weighted calculation is performed, where the model outputs a vehicle model corresponding to the maximum probability value in the weighted calculation result, and the method includes:
if the absolute value is smaller than a preset threshold value, a weight coefficient alpha is given to the probability value output by the low-level features and a weight coefficient beta is given to the probability value output by the high-level features, where alpha + beta = 1 and alpha < beta;
performing weighted operation on the probability value output by the low-level feature by using alpha, and performing weighted operation on the probability value output by the high-level feature by using beta;
and determining the vehicle type of the vehicle image to be identified according to the maximum probability value in the weighting operation result, wherein the vehicle type is the vehicle type corresponding to the maximum probability value.
This application provides a vehicle type recognition device simultaneously, includes:
the image acquisition unit is used for extracting a vehicle image to be identified in the video stream by using a frame difference method;
The model construction unit is used for constructing a vehicle type recognition model based on a convolutional neural network; the model comprises classifiers for different hierarchical features of a vehicle type and a plurality of convolution layers including a common convolution layer;
the characteristic classification unit is used for performing vehicle type identification on the vehicle images to be identified by using the vehicle type identification model, and the classifier is used for classifying the low-level characteristics of the vehicle images to be identified in the public convolution layer and the high-level characteristics of the vehicle images to be identified in all the convolution layers;
the vehicle type identification unit is used for comparing the difference absolute value of the two maximum probability values output in the low-level features with a preset threshold value, and if the absolute value is greater than the preset threshold value, the model outputs the vehicle type corresponding to the maximum probability value;
and the vehicle type identification unit is used for respectively endowing different weight coefficients to the probability values output by the low-level features and the high-level features and performing weighted calculation if the difference value is smaller than a preset threshold value, and outputting a vehicle type corresponding to the maximum probability value in the weighted calculation result by the model.
Preferably, the vehicle type recognition unit includes:
the comparison subunit compares the absolute value of the difference value of the two maximum probability values output in the low-level features with a preset threshold value;
And the vehicle type output subunit determines the vehicle type of the vehicle image to be identified as the vehicle type corresponding to the maximum probability value if the absolute value is greater than a preset threshold value.
The vehicle type identification method provided by the application constructs a vehicle type identification model based on a convolutional neural network from classifiers for different hierarchical features of a vehicle type and a plurality of convolutional layers including common convolutional layers, identifies vehicle images in a video stream, and outputs the vehicle type corresponding to each vehicle in the video stream. On the premise of ensuring accuracy, vehicle types in traffic video are identified rapidly, solving the problems of low vehicle type recognition rate and poor real-time performance in the prior art.
Drawings
Fig. 1 is a schematic flowchart of a vehicle type identification method provided in the present application;
FIG. 2 is a schematic view of a target detection process involved in the present application;
FIG. 3 is a schematic diagram of a vehicle type recognition model based on a convolutional neural network according to the present application;
FIG. 4 is a schematic diagram of a vehicle type recognition apparatus provided by the present application;
fig. 5 is a diagram of a moving object recognition system architecture to which the present application relates.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The application can, however, be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.
Fig. 1 is a schematic flow chart of a vehicle type identification method provided by the present application, and the method provided by the present application is described in detail below with reference to fig. 1.
And step S101, extracting the vehicle image to be identified in the video stream by using a frame difference method.
At present, the mainstream methods for detecting moving objects include the adjacent frame difference method, the optical flow method, the background elimination method and statistical-learning-based methods. In general, most traffic video collectors are placed at important locations such as intersections and have a fixed position. Therefore, to detect moving objects in real time without excessive computation, an improved frame difference method is used here to detect moving objects (running vehicles). The method extracts the minimum circumscribed rectangle image of a vehicle in the video stream with the frame difference method and takes this minimum circumscribed rectangle image as the vehicle image to be identified; the specific flow is shown in figure 2.
The detailed detection process is as follows:
(1) calculating, with the Canny edge detection operator, two edge images E_k and E_{k-1} corresponding to two consecutive frames F_k and F_{k-1} in the video stream;
(2) differencing the two edge images E_k and E_{k-1} to obtain the edge difference image D_k;
(3) dividing the edge difference image D_k into a plurality of blocks and marking the non-zero blocks as S_k;
(4) binarizing S_k according to a preset value to obtain the matrix M corresponding to the edge difference image D_k;
(5) connecting the non-zero blocks in the matrix M and deleting regions that do not conform to the pixel-block size of a vehicle image, to obtain the minimum circumscribed rectangle image containing the vehicle.
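As an illustration, steps (1)-(5) can be sketched in pure Python. Note the assumptions: the simple gradient-threshold edge detector below stands in for the Canny operator named above, and the block size and minimum-region values are placeholders, since the patent does not fix them.

```python
def edge_map(img, thresh=30):
    """Crude gradient-magnitude edge detector (a simplified stand-in
    for the Canny operator used in the actual pipeline)."""
    h, w = len(img), len(img[0])
    return [[1 if (x + 1 < w and abs(img[y][x + 1] - img[y][x]) > thresh) or
                  (y + 1 < h and abs(img[y + 1][x] - img[y][x]) > thresh) else 0
             for x in range(w)] for y in range(h)]

def frame_difference_roi(f_prev, f_curr, block=4, min_blocks=2):
    """Steps (1)-(5): edge images -> edge difference D_k -> block matrix M
    -> minimum circumscribed rectangle of the moving region, or None."""
    e_prev, e_curr = edge_map(f_prev), edge_map(f_curr)          # (1) E_{k-1}, E_k
    h, w = len(e_curr), len(e_curr[0])
    d = [[abs(e_curr[y][x] - e_prev[y][x]) for x in range(w)]    # (2) D_k
         for y in range(h)]
    hb, wb = h // block, w // block
    m = [[any(d[by * block + i][bx * block + j]                  # (3)+(4) M[by][bx]
              for i in range(block) for j in range(block))
          for bx in range(wb)] for by in range(hb)]
    hits = [(by, bx) for by in range(hb) for bx in range(wb) if m[by][bx]]
    if len(hits) < min_blocks:          # (5) region too small to be a vehicle
        return None
    ys = [by for by, _ in hits]
    xs = [bx for _, bx in hits]
    # minimum circumscribed rectangle (x0, y0, x1, y1) in pixel coordinates
    return (min(xs) * block, min(ys) * block,
            (max(xs) + 1) * block, (max(ys) + 1) * block)
```

With two identical frames the difference image is empty and no rectangle is returned; with a bright patch appearing between frames, the returned rectangle covers the patch.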
Step S102, constructing a vehicle type recognition model based on a convolutional neural network; the model includes classifiers for different hierarchical features of a vehicle model, and a plurality of convolutional layers including a common convolutional layer.
Compared with the traditional machine learning classifier, such as SVM and AdaBoost, the convolutional neural network has the following advantages:
(1) A traditional machine learning classifier classifies images using manually extracted features such as SIFT, SURF, PCA-SIFT, HOG and Harris; obtaining the feature vectors of these features generally requires a large amount of computation as well as preprocessing of the images, whereas the convolutional layers of a convolutional neural network extract features automatically, so the original images can be input directly into the network for classification.
(2) Different convolutional layers of a convolutional neural network extract features at different levels; within a certain depth, the deeper the layer, the higher-level the features it can extract.
(3) The appearance of the same vehicle is complex under different environments; the distance, height, angle and illumination of the collector all have a great influence on it, and a manually designed feature operator cannot cover every case, whereas a convolutional neural network has good resistance to affine transformations such as scaling and translation and can therefore effectively overcome the influence of the environment.
The convolutional layers of a convolutional neural network can extract a large number of rich image features, which generally exhibit resistance to affine transformation, translation invariance, rotation invariance, scale invariance and the like; and as the number of convolutional layers increases, features of higher and higher levels can be extracted, so convolutional neural networks have great advantages over shallow networks in the field of target recognition. However, the computation of a convolutional neural network is concentrated in the convolutional layers, so increasing the number of convolutional layers inevitably increases the amount of computation, until the real-time requirement can no longer be met. For example, the classic target recognition network AlexNet has 5 convolutional layers, and convolutional-layer computation accounts for 84.1% of the forward-pass time of the whole network; ResNet-50 has 48 convolutional layers, whose computation accounts for 99.9% of the total network forward time.
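Why convolutional layers dominate the forward-pass time can be illustrated with a quick multiply-accumulate count. The layer sizes below are illustrative only, not the actual AlexNet or ResNet-50 dimensions:

```python
def conv_macs(h_out, w_out, c_in, c_out, k):
    """Multiply-accumulate operations for one conv layer (stride 1):
    every output pixel combines c_in * k * k inputs per output channel."""
    return h_out * w_out * c_in * c_out * k * k

def fc_macs(n_in, n_out):
    """Multiply-accumulate operations for one fully connected layer."""
    return n_in * n_out

# One mid-network 3x3 conv layer vs. a large fully connected layer
conv = conv_macs(56, 56, 64, 128, 3)   # 231,211,008 MACs
fc = fc_macs(4096, 1000)               # 4,096,000 MACs
print(conv / fc)                       # the conv layer costs ~56x more
```

Even a single mid-sized convolutional layer outweighs a large fully connected layer by more than an order of magnitude, which is why removing convolution work (as the early-exit design below does) pays off.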
In target recognition tasks, the differences between targets vary: some targets differ obviously in color and shape, while others differ only in texture. Taking these specific conditions into account during recognition, a strategy of extracting features at different levels for comparison can be adopted for different classes of targets to avoid a large amount of convolution computation. For vehicle type identification, the present application therefore classifies vehicles into 5 types: car, bus, truck, motorcycle and non-motor vehicle. Motorcycles and trucks differ greatly from the other vehicle types in low-level features such as front appearance and vehicle outline, and different convolutional layers of a convolutional neural network extract features at different levels, so different hierarchical features can be extracted for different vehicle types for classification, reducing the computation of part of the convolutional layers. The convolutional neural network model developed is shown in figure 3. The model includes classifiers for different hierarchical features of the vehicle type, and a plurality of convolutional layers including common convolutional layers.
The model is improved on the basis of an original convolutional neural network, and the model mainly comprises a plurality of convolutional layers (comprising a pooling function, an activation function and the like) and two classifiers (the original convolutional neural network only has one classifier).
The classifier 1 and the classifier 2 are respectively responsible for different tasks, the classifier 1 is responsible for classifying low-level features, the classifier 2 is responsible for classifying high-level features, and the classifier 1 and the classifier 2 share a convolutional layer in front of a model; the final classification result is determined by the classifier 1 and the classifier 2 together, and the calculation formula is as follows:
P_res = (1 + sign(F_class1,Max − F_class1,Sec − T))/2 · P_class1 + (1 − sign(F_class1,Max − F_class1,Sec − T))/2 · (alpha · P_class1 + beta · P_class2)  (1)

C_test = P_class1,test  (2)
In formula (1), P_res is the final classification result; P_class1 and P_class2 are the output results of classifier 1 and classifier 2 respectively, and both are n-dimensional probability column vectors (n for n classes). F_class1,Max is the maximum value in the P_class1 column vector, and F_class1,Sec is the next-largest value in the column vector. The threshold T controls whether the final result takes the result of classifier 2 into account. sign(x) is the sign function.
Model in fig. 3: classifier 1 classifies the features extracted by the two convolutional layers, and classifier 2 classifies the features extracted by the 5 convolutional layers, the common convolutional layer being the first two convolutional layers. The model can also be regarded as two convolutional neural networks which respectively correspond to a sub-model in a bold solid line frame and a sub-model in a dashed line frame, and the two sub-models share part of convolutional layers and are mutually constrained.
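As a sketch, the two-branch structure described above might look as follows in PyTorch. The layer widths, kernel sizes and pooling choices here are assumptions, since the patent specifies only the topology: classifier 1 reads features from the two common convolutional layers, classifier 2 from all five.

```python
import torch
import torch.nn as nn

class TwoBranchVehicleNet(nn.Module):
    """Two classifiers sharing the first two ("common") conv layers.

    Layer widths are illustrative; the patent specifies only the structure:
    classifier 1 on the 2 common conv layers, classifier 2 on all 5.
    """
    def __init__(self, n_classes=5):
        super().__init__()
        self.common = nn.Sequential(                 # conv1-conv2 (shared)
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.deep = nn.Sequential(                   # conv3-conv5 (classifier-2 branch)
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier1 = nn.Sequential(            # low-level-feature classifier
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes))
        self.classifier2 = nn.Sequential(            # high-level-feature classifier
            nn.Flatten(), nn.Linear(64, n_classes))

    def forward(self, x):
        low = self.common(x)                         # shared low-level features
        p1 = torch.softmax(self.classifier1(low), dim=1)
        p2 = torch.softmax(self.classifier2(self.deep(low)), dim=1)
        return p1, p2
```

At inference time, classifier 2's branch (`self.deep` plus `self.classifier2`) only needs to be evaluated when classifier 1 is not decisive, which is where the convolution savings come from.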
After the model is built, the plurality of convolutional layers including the common convolutional layers and the classifier of high-level features included in the model are trained by forward propagation and back propagation on a training data set; the training data set is passed through the common convolutional layers to obtain a feature data set, the classifier of low-level features included in the model is trained using this feature data set, and the common convolutional layer parameters are fine-tuned during back propagation.
After training is finished, testing the trained vehicle type recognition model based on the convolutional neural network by using a test data set; and if the accuracy of the test result is greater than a preset threshold value, the model test is passed.
Then, the model can identify the vehicle type.
And S103, identifying the vehicle type of the vehicle image to be identified by using the vehicle type identification model, and classifying the low-level features of the vehicle image to be identified in the public convolution layer and the high-level features of the vehicle image to be identified in all the convolution layers by using the classifier.
The model comprises two classifiers, namely a classifier aiming at the low-level feature and the high-level feature of a vehicle type, wherein the classifier 1 corresponds to a low-level feature classifier, the classifier 2 corresponds to a high-level feature classifier, and the low-level feature classifier classifies the low-level feature of a vehicle image to be identified in a common convolution layer of the model; the high-level feature classifier classifies high-level features of the vehicle image to be recognized in a plurality of convolutional layers of the model including the common convolutional layer.
And step S104, comparing the difference absolute value of the two maximum probability values output in the low-level features with a preset threshold, and if the absolute value is greater than the preset threshold, outputting the vehicle type corresponding to the maximum probability value by the model.
In the process of identifying the vehicle type, when the vehicle image to be identified passes through the first two convolutional layers (the common convolutional layers), low-level features are obtained and input into classifier 1 for classification; meanwhile, when the vehicle image to be identified passes through the last three convolutional layers, high-level features are obtained and input into classifier 2 for classification. Once classifier 1 has finished, the absolute value of the difference between the two maximum probability values output on the low-level features is compared with a preset threshold; if the absolute value is greater than the preset threshold, the vehicle type of the image to be identified is determined as the type corresponding to the maximum probability value. The probability values output on the low-level features are the probabilities of vehicle types whose low-level features are distinctive, for example types with large differences in front appearance and vehicle outline. The two maximum probability values are differenced and the absolute value of the difference is compared with the preset threshold; if it is larger, the low-level features are decisive, that is, they can be trusted with high confidence to classify correctly, and the vehicle type of the image to be identified is the type corresponding to the maximum probability value. At this point the classification by classifier 2 on the high-level features can be stopped, and identification of the next vehicle image to be identified, P_1,test, can proceed. That is, formula (2): C_test = P_1,test.
And S105, if the absolute value is smaller than the preset threshold value, different weight coefficients are respectively given to the probability values output by the low-level features and the high-level features, weighted calculation is carried out, and the model outputs the vehicle type corresponding to the maximum probability value in the weighted calculation result.
If the absolute value is smaller than the preset threshold value, a weight coefficient alpha is given to the probability value output by the low-level features and a weight coefficient beta is given to the probability value output by the high-level features, where alpha + beta = 1 and alpha < beta; the probability value output by the low-level features is weighted with alpha and the probability value output by the high-level features is weighted with beta; and the vehicle type of the vehicle image to be identified is determined according to the maximum probability value in the weighted results, i.e. the vehicle type corresponding to that maximum probability value.
If the absolute value is smaller than the preset threshold, it cannot be decided which of the vehicle types corresponding to the two maximum probability values is the final one. For example, when the two maximum probabilities are equal, with one vehicle type identified from the vehicle exterior being a car and one identified from the wheel profile being a bus, it cannot be determined which type should finally be output, so higher-level features must be added for further confirmation. The final result is then determined jointly by classifier 1 and classifier 2 by weight, i.e. formula (3):
C_test = α · (p_1, 0, p_3, 0, 0) + β · (q_1, q_2, q_3, q_4, q_5)
In the above formula (assuming classification into 5 classes, so the vectors are 5-dimensional), the first vector is the probability vector output by classifier 1, in which only the maximum value and the second-largest value are kept and the other probability terms are set to 0 (here class 1 and class 3 are assumed to hold the maximum and second-largest probabilities, respectively); the second vector is the probability vector output by classifier 2; α and β are the weight coefficients of vector 1 and vector 2, respectively. Since classifier 1 cannot identify with high confidence whether the final result is the maximum-probability class or the second-largest-probability class, and classifier 2 classifies the higher-level features, the final result is biased toward the result of classifier 2, that is, α < β with α + β = 1. The final result C_test is the weighted sum of vector 1 and vector 2.
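The weighted fusion of formula (3) can be illustrated as follows (a sketch under the patent's stated constraints α < β and α + β = 1; the concrete weights 0.3/0.7 and the probability vectors are invented for illustration):

```python
import numpy as np

def fuse(low_probs, high_probs, alpha=0.3, beta=0.7):
    """Weighted fusion of the two classifiers per formula (3).
    Keeps only the two largest entries of the low-level probability
    vector (zeroing the rest), then forms alpha*v1 + beta*v2."""
    assert abs(alpha + beta - 1.0) < 1e-9 and alpha < beta
    v1 = np.zeros_like(low_probs)
    top2_idx = np.argsort(low_probs)[-2:]   # indices of max and second-max
    v1[top2_idx] = low_probs[top2_idx]      # zero out the other entries
    c = alpha * v1 + beta * np.asarray(high_probs)
    return int(np.argmax(c))                # vehicle type = largest fused score

low  = np.array([0.40, 0.05, 0.40, 0.10, 0.05])  # ambiguous: classes 0 and 2 tie
high = np.array([0.10, 0.05, 0.60, 0.15, 0.10])  # classifier 2 favors class 2
vehicle_type = fuse(low, high)                   # -> 2
```

Because β > α, the high-level classifier dominates whenever the low-level classifier is undecided, which is exactly the bias the patent describes.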
The present application also provides a vehicle type recognition apparatus, as shown in fig. 4, including:
the image acquisition unit 410 is used for extracting a vehicle image to be identified in the video stream by using a frame difference method;
the model construction unit 420 is used for constructing a vehicle type recognition model based on a convolutional neural network; the model comprises classifiers for different hierarchical features of a vehicle type and a plurality of convolution layers including a common convolution layer;
a feature classification unit 430, configured to perform vehicle type recognition on the vehicle image to be recognized by using the vehicle type recognition model, where the classifier is configured to classify low-level features of the vehicle image to be recognized in the common convolutional layer and high-level features of the vehicle images to be recognized in all convolutional layers;
The vehicle type identification unit 440 compares the absolute value of the difference between the two maximum probability values output in the low-level features with a preset threshold; if the absolute value is greater than the preset threshold, the model outputs the vehicle type corresponding to the maximum probability value;
and if the difference is smaller than a preset threshold, the vehicle type identification unit 450 assigns different weight coefficients to the probability values output by the low-level features and the high-level features respectively, performs weighted calculation, and outputs the vehicle type corresponding to the maximum probability value in the weighted calculation result by the model.
Preferably, the vehicle type recognition unit includes:
the comparison subunit compares the absolute value of the difference value of the two maximum probability values output in the low-level features with a preset threshold value;
and the vehicle type output subunit determines the vehicle type of the vehicle image to be identified as the vehicle type corresponding to the maximum probability value if the absolute value is greater than a preset threshold value.
In summary, the present application relates to the recognition of a moving target. The workflow of the recognition system is shown in fig. 5: the system first obtains a circumscribed rectangular image of the area where the moving target is located in the traffic video through a moving target detection module, then performs manual labeling, and finally trains and tests the classification model.
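As a rough illustration of the moving-target extraction step, the sketch below thresholds the raw grayscale difference between two frames and returns the minimum bounding rectangle of the changed region. This is a simplification: claim 3 of the patent differences edge images produced by an edge detection operator and works block-wise, whereas here the raw pixel difference is used directly; all names and thresholds are illustrative.

```python
import numpy as np

def moving_target_bbox(frame_prev, frame_cur, diff_thresh=30, min_area=100):
    """Return the minimum bounding rectangle (x, y, width, height) of the
    region that changed between two grayscale frames, or None if the
    changed area is too small to be a vehicle."""
    # Absolute frame difference, binarized by a preset threshold.
    diff = np.abs(frame_cur.astype(int) - frame_prev.astype(int))
    mask = diff > diff_thresh
    if mask.sum() < min_area:       # discard regions too small for a vehicle
        return None
    ys, xs = np.nonzero(mask)
    # Minimum bounding rectangle of all changed pixels.
    return (int(xs.min()), int(ys.min()),
            int(xs.max() - xs.min() + 1), int(ys.max() - ys.min() + 1))

# Example: a synthetic 100x100 scene where a 30x20 "vehicle" appears.
prev = np.zeros((100, 100), dtype=np.uint8)
cur = prev.copy()
cur[40:60, 10:40] = 200                     # moving object
bbox = moving_target_bbox(prev, cur)        # -> (10, 40, 30, 20)
```

In practice the cropped rectangle would then be fed to the two-classifier CNN described above.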
Finally, it should be noted that: although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the invention.

Claims (10)

1. A vehicle type recognition method is characterized by comprising the following steps:
extracting a vehicle image to be identified in the video stream by using a frame difference method;
constructing a vehicle type recognition model based on a convolutional neural network; the model comprises classifiers for different hierarchical features of a vehicle type and a plurality of convolution layers including a common convolution layer;
the vehicle type recognition model is used for recognizing the vehicle type of the vehicle image to be recognized, and the classifier is used for classifying the low-level features of the vehicle image to be recognized in the public convolution layer and the high-level features of the vehicle image to be recognized in all the convolution layers;
comparing the absolute value of the difference value of the two maximum probability values output in the low-level features with a preset threshold, and if the absolute value is greater than the preset threshold, outputting a vehicle type corresponding to the maximum probability value by the model;
And if the difference value is smaller than a preset threshold value, different weight coefficients are respectively given to the probability values output by the low-level features and the high-level features, weighting calculation is carried out, and the model outputs the vehicle type corresponding to the maximum probability value in the weighting calculation result.
2. The method of claim 1, wherein extracting the vehicle image to be identified from the video stream using a frame difference method comprises:
extracting a minimum circumscribed rectangle image containing a vehicle from a video stream by using a frame difference method;
and taking the minimum circumscribed rectangle image as a vehicle image to be identified.
3. The method of claim 2, wherein extracting the minimum bounding rectangle image containing the vehicle in the video stream using a frame difference method comprises:
calculating, by means of an edge detection operator, the two edge images E_k and E_{k-1} corresponding to two consecutive images F_k and F_{k-1} in the video stream;
differencing the two edge images E_k and E_{k-1} to obtain an edge difference image D_k;
dividing the edge difference image D_k into a plurality of blocks and marking the non-zero blocks as S_k;
binarizing S_k according to a preset value to obtain a matrix M corresponding to the edge difference image D_k;
and connecting the non-0 blocks in the matrix M, and deleting the area which does not accord with the size of the pixel block of the vehicle image to obtain the minimum circumscribed rectangle image containing the vehicle.
4. The method of claim 1, wherein after the step of constructing the convolutional neural network-based vehicle type recognition model, further comprising:
training, by forward propagation and back propagation on a training data set, the plurality of convolutional layers including the common convolutional layer and the classifier of the high-level features included in the model;
passing the training data set through the common convolutional layer to obtain a feature data set, training the classifier of the low-level features included in the model using the feature data set, and fine-tuning the common convolutional layer parameters during back propagation.
5. The method of claim 1 or 4, further comprising:
testing the trained vehicle type recognition model based on the convolutional neural network by using a test data set;
and if the accuracy of the test result is greater than a preset threshold value, the model test is passed.
6. The method of claim 1, wherein the classifier classifying low-level features of vehicle images to be identified in a common convolutional layer and high-level features of vehicle images to be identified in all convolutional layers comprises:
the model comprises classifiers for low-level features and high-level features of a vehicle model;
The low-level feature classifier classifies low-level features of the vehicle image to be recognized in the model public convolution layer;
the high-level feature classifier classifies high-level features of the vehicle image to be recognized in a plurality of convolutional layers of the model including the common convolutional layer.
7. The method of claim 1, wherein comparing an absolute value of a difference between two maximum probability values output from the low-level features with a preset threshold, and if the absolute value is greater than the preset threshold, the model outputting a vehicle type corresponding to the maximum probability value comprises:
comparing the absolute value of the difference value of the two maximum probability values output in the low-level features with a preset threshold value;
and if the absolute value is greater than a preset threshold value, determining the vehicle type of the vehicle image to be identified as the vehicle type corresponding to the maximum probability value.
8. The method of claim 1, wherein, if the difference is smaller than the preset threshold, respectively assigning different weight coefficients to the probability values output by the low-level features and the high-level features, performing the weighted calculation, and outputting by the model the vehicle type corresponding to the maximum probability value in the weighted result comprises:
if the absolute value is smaller than the preset threshold, assigning a weight coefficient α to the probability values output by the low-level features and a weight coefficient β to the probability values output by the high-level features, with α + β = 1 and α < β;
performing weighted operation on the probability value output by the low-level feature by using alpha, and performing weighted operation on the probability value output by the high-level feature by using beta;
and determining the vehicle type of the vehicle image to be identified according to the maximum probability value in the weighting operation result, wherein the vehicle type is the vehicle type corresponding to the maximum probability value.
9. A vehicle type recognition apparatus characterized by comprising:
the image acquisition unit is used for extracting a vehicle image to be identified in the video stream by using a frame difference method;
the model construction unit is used for constructing a vehicle type recognition model based on a convolutional neural network; the model comprises classifiers for different hierarchical features of a vehicle type and a plurality of convolution layers including a common convolution layer;
the characteristic classification unit is used for performing vehicle type identification on the vehicle images to be identified by using the vehicle type identification model, and the classifier is used for classifying the low-level characteristics of the vehicle images to be identified in the public convolution layer and the high-level characteristics of the vehicle images to be identified in all the convolution layers;
The vehicle type identification unit is used for comparing the difference absolute value of the two maximum probability values output in the low-level features with a preset threshold value, and if the absolute value is greater than the preset threshold value, the model outputs the vehicle type corresponding to the maximum probability value;
and the vehicle type identification unit is used for respectively endowing different weight coefficients to the probability values output by the low-level features and the high-level features and performing weighted calculation if the difference value is smaller than a preset threshold value, and outputting a vehicle type corresponding to the maximum probability value in the weighted calculation result by the model.
10. The apparatus according to claim 9, wherein the vehicle type recognition unit includes:
the comparison subunit compares the absolute value of the difference value of the two maximum probability values output in the low-level features with a preset threshold value;
and the vehicle type output subunit determines the vehicle type of the vehicle image to be identified as the vehicle type corresponding to the maximum probability value if the absolute value is greater than a preset threshold value.
CN202010485652.4A 2020-06-01 2020-06-01 Vehicle type recognition method and device Pending CN111860112A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010485652.4A CN111860112A (en) 2020-06-01 2020-06-01 Vehicle type recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010485652.4A CN111860112A (en) 2020-06-01 2020-06-01 Vehicle type recognition method and device

Publications (1)

Publication Number Publication Date
CN111860112A true CN111860112A (en) 2020-10-30

Family

ID=72985329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010485652.4A Pending CN111860112A (en) 2020-06-01 2020-06-01 Vehicle type recognition method and device

Country Status (1)

Country Link
CN (1) CN111860112A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107491720A (en) * 2017-04-01 2017-12-19 江苏移动信息***集成有限公司 A kind of model recognizing method based on modified convolutional neural networks
CN110147707A (en) * 2018-10-25 2019-08-20 初速度(苏州)科技有限公司 A kind of high-precision vehicle identification method and system


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HU, PENG et al.: "A fast vehicle type recognition method", Information Technology and Network Security, no. 5, 31 December 2018 (2018-12-31), pages 41-45 *
LI, ZHEMING; CAI, HONGMING; JIANG, LIHONG: "Vehicle brand and model recognition based on convolutional neural networks", Journal of Donghua University (Natural Science Edition), no. 04, 31 August 2017 (2017-08-31) *

Similar Documents

Publication Publication Date Title
CN111368687B (en) Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
CN107563372B (en) License plate positioning method based on deep learning SSD frame
Bahnsen et al. Rain removal in traffic surveillance: Does it matter?
Nieto et al. Automatic vacant parking places management system using multicamera vehicle detection
CN105046196B (en) Front truck information of vehicles structuring output method based on concatenated convolutional neutral net
CN111104903B (en) Depth perception traffic scene multi-target detection method and system
CN101470809B (en) Moving object detection method based on expansion mixed gauss model
CN105809184B (en) Method for real-time vehicle identification and tracking and parking space occupation judgment suitable for gas station
CN113673338B (en) Automatic labeling method, system and medium for weak supervision of natural scene text image character pixels
CN111639564B (en) Video pedestrian re-identification method based on multi-attention heterogeneous network
CN111008632B (en) License plate character segmentation method based on deep learning
CN112287941B (en) License plate recognition method based on automatic character region perception
CN112990065B (en) Vehicle classification detection method based on optimized YOLOv5 model
CN107944354B (en) Vehicle detection method based on deep learning
CN108009548A (en) A kind of Intelligent road sign recognition methods and system
CN110032952B (en) Road boundary point detection method based on deep learning
CN110705412A (en) Video target detection method based on motion history image
CN112613392A (en) Lane line detection method, device and system based on semantic segmentation and storage medium
CN114332921A (en) Pedestrian detection method based on improved clustering algorithm for Faster R-CNN network
CN111540203B (en) Method for adjusting green light passing time based on fast-RCNN
Asgarian Dehkordi et al. Vehicle type recognition based on dimension estimation and bag of word classification
CN112861840A (en) Complex scene character recognition method and system based on multi-feature fusion convolutional network
CN114550134A (en) Deep learning-based traffic sign detection and identification method
CN112785610B (en) Lane line semantic segmentation method integrating low-level features
CN112347967B (en) Pedestrian detection method fusing motion information in complex scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination