CN113435339B - Vehicle attribute detection method, device and storage medium

Info

Publication number: CN113435339B
Application number: CN202110721738.7A
Authority: CN
Inventors: 胡倩; 陈燕娟; 谢晓汶
Original assignee: Suzhou Keda Technology Co Ltd
Current assignee: Suzhou Keda Technology Co Ltd
Priority date: 2021-06-28
Filing date: 2021-06-28
Publication date: 2022-07-12
Anticipated expiration: 2041-06-28
Also published as: CN113435339A

Abstract

The application relates to a vehicle attribute detection method, device and storage medium, belonging to the technical field of computers. The method comprises the following steps: obtaining a pre-trained vehicle attribute detection model, wherein the vehicle attribute detection model comprises n network branches; a first branch of the n network branches is used for simultaneously detecting m types of attributes of the vehicle, and each second branch of the n network branches, which is different from the first branch, is used for detecting a specified attribute of the m types of attributes; inputting a target image to be subjected to vehicle attribute detection into the vehicle attribute detection model to obtain attribute detection results respectively output by the n network branches; and determining the attribute information of the target vehicle in the target image by combining the attribute detection results of the n network branches. When the first branch cannot detect the m types of attributes of the vehicle simultaneously, the attribute information of the target vehicle can be determined by combining the attribute detection result of the second branch, which ensures that the vehicle attribute detection model can still output the vehicle attributes for which no new classification has been added.

Description

Vehicle attribute detection method, device and storage medium

[ technical field ]

The application relates to a vehicle attribute detection method, device and storage medium, and belongs to the technical field of computers.

[ background of the invention ]

With the development of Intelligent Transport System (ITS) technology, vehicle attributes can be recognized automatically from images. A user may need to identify several vehicle attributes; for example, identifying a vehicle's brand requires identifying attributes such as its large brand, small brand, and model year.

In order to reduce the number of neural network models used, the conventional multi-attribute detection method combines the multiple attribute labels of each vehicle image into a single class of labels, and then trains a neural network model using the vehicle images and the combined labels. The trained network model is then used to identify the multiple attributes of an image simultaneously, to obtain the attributes of the vehicle in the image.

However, since the trained network model is obtained by training on a fusion of a plurality of attribute labels, when a new classification of one attribute is added for a vehicle, even if no classification of the other attributes is added, the trained network model cannot identify the newly added classification, and consequently cannot produce detection results for the other attributes either. For example, the trained network model is used for simultaneously identifying the large brand, the small brand and the model year of the vehicle. If a new model year classification is added, the trained network model cannot identify the new model year, and in that case the large brand and the small brand of the vehicle cannot be identified either.

[ summary of the invention ]

The application provides a vehicle attribute detection method, equipment and a storage medium, which can solve the problem that when a network model obtained by fusing and training a plurality of attribute labels is used for detecting vehicle attributes, other attributes cannot be identified due to newly added attribute classification. The application provides the following technical scheme:

In a first aspect, a vehicle attribute detection method is provided, the method comprising:

acquiring a target image to be subjected to vehicle attribute detection;

obtaining a pre-trained vehicle attribute detection model, wherein the vehicle attribute detection model comprises n network branches; a first branch of the n network branches is used for simultaneously detecting m attributes of the vehicle, and each second branch of the n branches, which is different from the first branch, is used for detecting a specified attribute of the m attributes; both n and m are integers greater than 1;

inputting the target image into the vehicle attribute detection model to obtain attribute detection results respectively output by the n network branches;

and determining the attribute information of the target vehicle in the target image by combining the attribute detection results of the n network branches.

Optionally, the m attributes are used to indicate brand attributes of the vehicle; the m brand attributes are obtained by dividing the vehicle brands level by level according to a preset hierarchical dividing mode;

the second branch is used for detecting a brand attribute of a specified level among the brand attributes from level 1 to level m-1, to obtain the specified attribute; the levels are indexed by k, where k takes the integer values 1 to m-1 in sequence.

Optionally, the brand attribute at the designated level is a major brand attribute, and the second branch identifies the major brand attribute through the vehicle logo information in the target image; the second branch is also used for positioning a car logo position in the target image;

and/or,

the brand attribute at the designated level is a small brand attribute, and the second branch identifies the small brand attribute through the vehicle face information in the target image; the second branch is also used for locating the car face position in the target image.

Optionally, the determining attribute information of the target vehicle in the target image in combination with the attribute detection results of the n network branches includes:

when the attribute detection result of the first branch indicates that the confidence of the m attributes is smaller than a first confidence threshold and the attribute detection result of the second branch indicates that the confidence of the specified attribute is greater than a second confidence threshold, taking the attribute classification with the highest confidence among the attribute detection results output by the second branch as the attribute information of the target vehicle;

and when the attribute detection result of the first branch indicates that the confidence of the m attributes is greater than the first confidence threshold, taking the classification of the m attributes with the highest confidence output by the first branch as the attribute information of the target vehicle.
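As an illustration only, the two conditions above can be sketched in Python. The function name `decide`, the dictionary layout of the branch outputs, and the threshold values are assumptions made for this sketch and do not come from the application. The sketch also assumes that "the confidence of the m attributes" means the highest joint confidence output by the first branch, and it returns `None` when neither condition holds, a case the text does not specify.

```python
def decide(first_branch, second_branch, t1=0.6, t2=0.6):
    """Choose the attribute information from the two branches' confidences.

    first_branch:  maps a joint (large brand, small brand, model year) class
                   to its confidence (hypothetical layout).
    second_branch: maps a class of the specified attribute to its confidence.
    t1, t2:        first and second confidence thresholds (placeholder values).
    """
    joint_cls, joint_conf = max(first_branch.items(), key=lambda kv: kv[1])
    spec_cls, spec_conf = max(second_branch.items(), key=lambda kv: kv[1])

    if joint_conf > t1:
        # First branch is confident: use all m attributes at once.
        return {"source": "first", "attributes": joint_cls}
    if joint_conf < t1 and spec_conf > t2:
        # First branch is unsure, second branch is confident:
        # fall back to the specified attribute only.
        return {"source": "second", "attributes": spec_cls}
    return None  # assumption: behaviour outside the two stated cases


result = decide(
    {("A", "X2", "2020"): 0.3, ("B", "V", "2018"): 0.2},
    {"A": 0.9, "B": 0.1},
)
print(result)
```

In this usage example the joint confidence is low while the large brand confidence is high, so only the large brand "A" is reported.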

Optionally, the first branch comprises a fully connected layer and a classification layer connected to the fully connected layer, and an output of the classification layer is a confidence of each vehicle attribute.

Optionally, the second branch includes a convolutional layer, a pooling layer, and a classification subbranch and a positioning subbranch both connected to the pooling layer, where an output of the classification subbranch is a confidence of the specified attribute, and an output of the positioning subbranch is position information of the specified attribute corresponding to the target image;

wherein the convolutional layer and the pooling layer are used for extracting feature maps of different scales.

Optionally, when the second branch identifies a large brand attribute through the car logo information in the target image, the convolution layer and the pooling layer are used for extracting feature maps of different scales of a car logo area in the target image;

when the second branch identifies a small brand attribute through the car face information in the target image, the convolution layer and the pooling layer are used for extracting feature maps of different scales of the car face area in the target image.

Optionally, the vehicle attribute detection model is obtained by training a pre-created neural network model by using training data and a preset loss function;

each group of training data comprises a sample image and label information of the sample image, wherein the label information comprises an m attribute classification label of the sample image and a position label corresponding to the specified attribute;

the preset loss functions comprise a first loss function corresponding to the first branch, a second loss function corresponding to the classification subbranch and a third loss function corresponding to the positioning subbranch, and the vehicle attribute detection model is obtained by training with the weighted loss values of the first loss function, the second loss function and the third loss function.

In a second aspect, an electronic device is provided, the device comprising a processor and a memory; the memory stores a program that is loaded and executed by the processor to implement the vehicle attribute detection method provided by the first aspect.

In a third aspect, a computer-readable storage medium is provided, in which a program is stored, and the program, when executed by a processor, implements the vehicle attribute detection method provided in the first aspect.

The beneficial effect of this application includes at least: the method comprises the steps that a pre-trained vehicle attribute detection model is obtained, wherein the vehicle attribute detection model comprises n network branches; a first branch of the n network branches is used for simultaneously detecting m attributes of the vehicle, and each second branch different from the first branch of the n branches is used for detecting a specified attribute of the m attributes; inputting a target image to be subjected to vehicle attribute detection into a vehicle attribute detection model to obtain attribute detection results respectively output by n network branches; determining attribute information of the target vehicle in the target image by combining the attribute detection results of the n network branches; the problem that when a network model obtained by fusing and training a plurality of attribute labels is used for detecting the attributes of the vehicle, other attributes cannot be identified due to newly added attribute classification can be solved; when the m types of attributes of the vehicle cannot be detected simultaneously by the first branch, namely when the confidence degree of the attribute information output by the first branch is low, the attribute information of the target vehicle can be determined by combining the attribute detection result of the second branch, and the vehicle attribute detection model can be ensured to output the vehicle attributes without increasing the classification.

In addition, when the m attributes are brand attributes and the brand attributes are pre-divided into m levels, the specified attribute is set to be a brand attribute of a specified level from the 1st level to the (m-1)th level, that is, not the last level. Since new classifications are most likely to be added to the last-level brand attribute, this ensures that the second branch can still identify the vehicle attribute, so that the vehicle attribute detection model can output the vehicle attributes for which no new classification has been added.

In addition, the specified attribute is set as the large brand attribute, and the large brand attribute is determined from the vehicle logo information of the vehicle. Since vehicles of the same large brand do not gain new vehicle logos, this further ensures that the second branch can identify the large brand attribute of the vehicle. Meanwhile, the position of the vehicle logo can be output, which enriches the detection results of the model and improves the accuracy of model training.

In addition, the specified attribute is set as the small brand attribute, and the small brand attribute is determined from the vehicle face information of the vehicle. Since the vehicle faces of vehicles of the same small brand generally do not change greatly, this further ensures that the second branch can identify the small brand attribute of the vehicle. Meanwhile, the position of the vehicle face can be output, which enriches the detection results of the model and improves the accuracy of model training.

In addition, when the classification of any one vehicle attribute is not increased, the confidence coefficient of the attribute classification corresponding to the m vehicle attributes which can be output by the first branch is not influenced, and when the confidence coefficient is greater than a first confidence coefficient threshold value, the m vehicle attributes output by the first branch are used as the attribute information of the vehicle in the target image; at the same time, the second branch can output the confidence of each attribute classification of the specified attribute. Because the attribute detection result of the first branch comprises a plurality of kinds of attribute information, and the second branch only outputs attribute information of a certain specified attribute, when the confidence degree of the attribute classification output by the first branch is greater than the first confidence degree threshold value, the vehicle attribute output by the first branch is directly used as the vehicle attribute result in the target image, and more accurate and complete vehicle attribute information can be obtained.

In addition, by arranging the convolutional layer and the pooling layer in the second branch to extract feature maps of different scales, the classification performance and the position regression performance of the second branch can be improved.

The foregoing description is only an overview of the technical solutions of the present application. In order to make the technical solutions of the present application clearer, and to enable them to be implemented according to the content of the description, a detailed description is given below with reference to the preferred embodiments of the present application and the accompanying drawings.

[ description of the drawings ]

FIG. 1 is a flow chart of a vehicle attribute detection method provided by an embodiment of the present application;

FIG. 2 is a schematic illustration of a brand attribute rating of a vehicle provided by one embodiment of the present application;

FIG. 3 is a schematic diagram of a vehicle attribute detection model provided by an embodiment of the present application;

FIG. 4 is a block diagram of a vehicle attribute detection apparatus provided by an embodiment of the present application;

FIG. 5 is a block diagram of an electronic device provided by an embodiment of the present application.

[ detailed description ]

The following detailed description of embodiments of the present application will be made with reference to the accompanying drawings and examples. The following examples are intended to illustrate the present application, but are not intended to limit the scope of the present application.

Optionally, each embodiment is described by taking as an example the case where the vehicle attribute detection method is used in an electronic device. The electronic device is a terminal or a server; the terminal may be a mobile phone, a computer, a tablet computer, a scanner, an electronic eye, a monitoring camera, and the like. This embodiment does not limit the type of the electronic device.

Fig. 1 is a flowchart of a vehicle attribute detection method according to an embodiment of the present application, where the method includes at least the following steps:

Step 101, obtaining a target image to be subjected to vehicle attribute detection.

The target image is acquired from the running environment of the vehicle. The target image may be a frame of image in the video stream, or an image captured at a single time, and the source of the target image is not limited in this embodiment.

In the application, there are m vehicle attributes to be detected, and m is an integer greater than 1. Each vehicle attribute has at least one classification. In this embodiment, new classifications may be added for at least one of the m vehicle attributes.

For example, the vehicle attributes include the large brand, small brand, and model year of the vehicle.

The classifications of the large brand include: A, B and C;

the classifications of the small brand include: the X1 and X2 series under large brand A, the E series and V series under large brand B, and the A series and B series under large brand C;

the classifications of the model year include: 2012 and 2013 of small brand X1, 2020 of small brand X2, 2010 E 260L and 2010 E 300L of the small brand E series, 2018 and 2017 of the small brand V series, 2015 and 2016 of the small brand A series, and 2019 200T and 2019 280T of the small brand B series.
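The example classifications above form a three-level tree. A minimal sketch, assuming a nested dictionary as the representation (the names `BRAND_TREE` and `fused_labels` are hypothetical), shows how such a tree flattens into the joint classes that a model trained on fused labels would have to distinguish:

```python
# Hypothetical encoding of the example hierarchy:
# large brand -> small brand -> list of model years.
BRAND_TREE = {
    "A": {"X1": ["2012", "2013"], "X2": ["2020"]},
    "B": {"E series": ["2010 E 260L", "2010 E 300L"],
          "V series": ["2018", "2017"]},
    "C": {"A series": ["2015", "2016"],
          "B series": ["2019 200T", "2019 280T"]},
}


def fused_labels(tree):
    """Flatten the tree into (large brand, small brand, model year) classes."""
    return [
        (large, small, year)
        for large, smalls in tree.items()
        for small, years in smalls.items()
        for year in years
    ]


labels = fused_labels(BRAND_TREE)
large_brands = sorted(BRAND_TREE)
print(len(labels), len(large_brands))
```

Adding one model year (for instance a 2021 entry under X2) changes the set of joint classes, while the set of large brands stays the same, which is the asymmetry the embodiments rely on.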

The vehicle attributes are only exemplary, and in actual implementation, the vehicle attributes may also include other attributes, such as: the color of the vehicle, the type of the vehicle, etc., and the present embodiment does not limit the vehicle attribute to be detected.

In this embodiment, the target image may or may not have a vehicle region. In the case where there is no vehicle region, the vehicle attribute cannot be detected.

On the other hand, in the case of having a vehicle region, if the vehicle attribute is identified by using the network model obtained by fusion training of multiple attribute tags, there are cases as follows:

1. No classification of any vehicle attribute has been added since model training; in this case, the network model can identify each vehicle attribute of the vehicle.

2. A classification of some vehicle attribute has been added since model training, for example: the 2021 model year of small brand X2 is newly added to the model year classifications. The network model is trained on multiple attributes simultaneously, and each classification result it produces corresponds to the multiple attributes jointly. Therefore, when a vehicle attribute in the input target image, such as the model year, did not appear in the training samples, and, as the vehicle appearance generally changes between model years, the newly added model year is not recorded among the preset classification results, the trained network model cannot accurately identify a vehicle of the newly added model year. For example, the X2 series 2021 vehicle of large brand A is identified as the V series 2020 vehicle of large brand B. In other words, the network model identifies multiple vehicle attributes simultaneously and outputs the classification of each vehicle attribute, so if one of the vehicle attributes cannot be identified, the vehicle attribute identification may be wrong.

Based on the technical problem in the case 2, the vehicle attribute detection model provided in this embodiment may perform single detection on a certain vehicle attribute while performing comprehensive identification on m attributes, so as to ensure that when a result confidence is low or an error occurs while identifying multiple vehicle attributes simultaneously, a vehicle attribute identification result with a higher confidence can be obtained by separately detecting a certain vehicle attribute, which is described in detail below.

Step 102, acquiring a pre-trained vehicle attribute detection model, wherein the vehicle attribute detection model comprises n network branches; a first branch of the n network branches is used to simultaneously detect m attributes of the vehicle, and each second branch of the n network branches, which is different from the first branch, is used to detect a specified attribute of the m attributes. n and m are each an integer greater than 1.

For a given attribute, the classification of the given attribute remains unchanged as new classifications for other attributes are added. The other attributes refer to attributes other than the specified attribute among the m kinds of attributes.

Optionally, the number of the specified attributes is one or at least two, and the number of the specified attributes is not limited in this embodiment.

The following description takes as an example the case where the m attributes are used for indicating brand attributes of the vehicle. In one example, the m brand attributes are obtained by dividing the brand of the vehicle level by level according to a preset hierarchical dividing mode; the second branch is used for detecting a brand attribute of a specified level among the brand attributes from level 1 to level m-1, to obtain the specified attribute; the levels are indexed by k, where k takes the integer values 1 to m-1 in sequence.

In this embodiment, as k increases, the probability that a new classification is added to the k-th level brand attribute increases; in other words, the probability of a new classification being added to the (k+1)-th level brand attribute is higher than that for the k-th level brand attribute. Therefore, by setting the second branch to detect a specified attribute with a low probability of new classifications, the probability that the vehicle attribute detection model can output a vehicle attribute is improved. Based on this principle, the specified attribute may be the first-level brand attribute.

The number of second branches is one or at least two. When the number of the second branches is at least two, the specified attributes detected by different second branches are different.

For example, based on the example provided in step 101, the brand attributes of the vehicle are divided level by level into large brand, small brand and model year; the resulting multi-level catalog is shown in fig. 2, and the vehicle attributes include the large brand, the small brand and the model year of the vehicle. There are 2 second branches, one of which detects the specified attribute of the large brand and the other of which detects the specified attribute of the small brand.

In one example, the brand attribute of the specified level includes the large brand attribute, and the second branch identifies the large brand attribute through the vehicle logo information in the target image. Accordingly, the second branch is also used for locating the position of the vehicle logo in the target image. The vehicle logo remains unchanged when a new model year or small brand of the vehicle is added. Therefore, in this embodiment, the brand attribute of the specified level is set as the large brand attribute, and the large brand attribute is detected by detecting the vehicle logo information. This ensures that, when a model year or small brand of the vehicle is newly added and the confidence corresponding to the m vehicle attributes output by the first branch is low, the large brand attribute can still be accurately identified through the second branch.

In another example, the brand attribute of the specified level includes the small brand attribute, and the second branch identifies the small brand attribute through the vehicle face information in the target image. Accordingly, the second branch is also used to locate the vehicle face position in the target image. When a new model year of the vehicle is added, the vehicle faces of vehicles of the same small brand remain unchanged or change only slightly. Therefore, in this embodiment, the brand attribute of the specified level is set as the small brand attribute, and the small brand attribute is detected by detecting the vehicle face information. This ensures that, when a model year of the vehicle is newly added and the confidence corresponding to the m vehicle attributes output by the first branch is low, the small brand attribute of the vehicle can still be accurately identified through the second branch.

In the above example, the second branch is taken as two examples for explanation, and in actual implementation, the number of the second branch may also be one, for example: the second branch is only used for detecting the large brand attribute, and the embodiment does not limit the arrangement mode of the second branch.

In the present embodiment, referring to fig. 3, the vehicle attribute detection model includes, in order from input to output, a feature extraction network 31, and a first branch 32 and a second branch 33 each connected to the feature extraction network 31. Fig. 3 takes the case of one second branch 33 as an example; in practical implementation, there may also be at least two second branches 33, the network structure of each being the same as that described below for the second branch 33, and they are not described one by one here.

The feature extraction network 31 extracts feature information for the input target image. In one example, the feature extraction network 31 includes at least one feature extraction layer of a convolutional layer, a BN layer, and a pooling layer.

Illustratively, the convolution kernel size of the convolutional layer is 7 × 7, and in actual implementation, the size of the convolution kernel may be set to other sizes according to requirements, and this embodiment does not limit the convolution kernel size of the convolutional layer.

Each convolutional layer is activated using an activation function. Optionally, the activation function is the ReLU activation function, which can alleviate the vanishing gradient problem during network training.

The features processed by the activation function are nonlinear features. After the nonlinear features are input into the BN layer, the BN layer normalizes each neuron, which can mitigate the gradient explosion problem to a certain extent and can also speed up training.

Illustratively, the pooling mode is maximum pooling, in other words, the maximum pooling layer selects the maximum feature value of the corresponding feature layer to be reserved in the next layer. The size of the pooling layer is 3 × 3, and in practical implementation, the size of the pooling layer may be set to other sizes according to requirements, and the size of the pooling layer is not limited in this embodiment.
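To make the maximum pooling operation concrete, here is a minimal pure-Python sketch over a single-channel feature map. The function name `max_pool2d`, the stride of 1, and the absence of padding are assumptions for illustration; the embodiment specifies only the 3 × 3 size.

```python
def max_pool2d(fmap, size=3, stride=1):
    """Max-pool a 2D list of numbers with a size x size window (no padding)."""
    rows, cols = len(fmap), len(fmap[0])
    out = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            # Keep only the largest value inside the current window.
            window = [fmap[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(max(window))
        out.append(out_row)
    return out


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[11, 12], [15, 16]]
```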

The first branch comprises a fully connected layer and a classification layer connected with the fully connected layer, and the output of the classification layer is confidence degrees of m vehicle attributes.

Illustratively, the classification layer is implemented by a softmax layer. If there are x categories to be identified, the softmax layer outputs a column vector composed of x probabilities, and the sum of the x probabilities is 1. The combination of m vehicle attributes corresponding to the maximum probability is the attribute classification output by the first branch.
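The softmax output described above can be sketched as follows. The input scores and the function name `softmax` are illustrative; subtracting the maximum score before exponentiating is a standard numerical-stability step and is not taken from the application.

```python
import math


def softmax(scores):
    """Turn x raw scores into x probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


# Three hypothetical joint classes (x = 3) with made-up scores.
probs = softmax([2.0, 1.0, 0.1])
best_index = max(range(len(probs)), key=probs.__getitem__)
print(probs, best_index)
```

The index of the largest probability selects the joint class, and that probability serves as the confidence compared against the first confidence threshold.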

The second branch comprises a convolution layer, a pooling layer, a classification subbranch and a positioning subbranch which are connected with the pooling layer, and the output of the classification subbranch is the confidence coefficient of the designated attribute. When the second branch is used for identifying the attributes of the large brand through the vehicle logo information in the target image, the output of the positioning sub-branch is the position information of the vehicle logo in the target image. When the second branch is used for identifying the small brand attribute through the vehicle face information in the target image, the output of the positioning sub-branch is the position information of the vehicle face in the target image.

The convolutional layer and the pooling layer are used for extracting feature maps with different scales, so that the attribute classification performance and the positioning performance of the second branch are improved.

Specifically, when the second branch identifies the attribute of a large brand through the car logo information in the target image, the convolution layer and the pooling layer are used for extracting feature maps of different scales of a car logo area in the target image;

and when the second branch identifies the small brand attribute through the car face information in the target image, the convolution layer and the pooling layer are used for extracting feature maps of different scales of the car face area in the target image.

Optionally, in this embodiment, the output of the pooling layer is convolved with two different convolution kernels respectively. One convolution kernel processes the classification information of the specified attribute, so that the classification sub-branch outputs the confidence of the category corresponding to the brand of the vehicle. The other convolution kernel operates on spatial positions, so that the positioning sub-branch outputs the position information of a local region of the vehicle in the target image. The attribute classification of the local region of the vehicle is the output result of the second branch.

For example, the classification sub-branch outputs the vehicle logo classification, and the positioning sub-branch outputs the position information of the vehicle logo, the position information including at least one coordinate value capable of determining the position of the vehicle logo. For example, the position of the vehicle logo is represented by a rectangular frame, and the position information may be the coordinate values of the top-left vertex of the rectangular frame and may also include the width and the height of the rectangular frame; alternatively, the position information may be the coordinate values of the 4 vertices of the rectangular frame.
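The two position formats mentioned above carry the same information for an axis-aligned rectangle. A small sketch of the conversion, assuming image coordinates with the origin at the top-left and y increasing downward (the function name `corners_from_xywh` is hypothetical):

```python
def corners_from_xywh(x, y, w, h):
    """Convert (top-left x, top-left y, width, height) to 4 vertices.

    Vertices are returned clockwise starting from the top-left corner,
    assuming y grows downward as in typical image coordinates.
    """
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]


logo_box = corners_from_xywh(10, 20, 30, 40)
print(logo_box)  # [(10, 20), (40, 20), (40, 60), (10, 60)]
```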

Illustratively, the classification subbranch is realized by a full connection layer + softmax layer, and the positioning subbranch is realized by a position regression (or frame regression).

In this embodiment, the vehicle attribute detection model is obtained by training a neural network model created in advance using training data and a preset loss function.

Each group of training data comprises a sample image and label information of the sample image, and the label information comprises an m attribute classification label of the sample image and a position label corresponding to the designated attribute.

For example, the m attribute classification labels comprise the large brand classification, small brand classification and model year classification of the vehicle in the sample image, and the position label is the position information of the vehicle logo or the vehicle face.

When the classification layer in the first branch is implemented by a softmax layer, the first loss function may be softmax loss; when the classification sub-branch in the second branch is implemented by a softmax layer, the second loss function may be softmax loss; when the positioning sub-branch in the second branch is implemented by position regression, the third loss function may be smooth L1. In actual implementation, the first loss function, the second loss function, and the third loss function may also be set as other types of loss functions according to requirements, which are not listed here.

Weighted values corresponding to the first loss function, the second loss function, and the third loss function are pre-stored in the electronic device, and the value of the weighted value is not limited in this embodiment.

During training, a sample image in the training data is input into the pre-established network model to obtain a model result; the loss function values of the first, second and third loss functions are calculated based on the model result; the loss function values are then weighted and summed to obtain a final loss value; and the model parameters of the network model are optimized through a back-propagation algorithm based on the final loss value. The model is iterated in this way until the final loss value falls below the loss function threshold, at which point the model has converged and training ends, yielding the vehicle attribute detection model.
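The weighted summation and the stopping criterion described above can be sketched as follows. The weights and the threshold used in the example are placeholders chosen for illustration, since this embodiment does not limit their values; the function names are hypothetical.

```python
from typing import Sequence


def final_loss(loss_values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the first, second and third loss function values."""
    if len(loss_values) != len(weights):
        raise ValueError("each loss value needs exactly one weight")
    return sum(w * v for w, v in zip(weights, loss_values))


def training_converged(final_loss_value: float, loss_threshold: float) -> bool:
    """Training ends once the final loss value is below the loss function threshold."""
    return final_loss_value < loss_threshold


if __name__ == "__main__":
    # Placeholder values: first-branch classification loss, second-branch
    # classification loss, second-branch position regression loss.
    value = final_loss([0.5, 0.2, 0.1], weights=[1.0, 0.5, 2.0])
    print(value, training_converged(value, loss_threshold=0.05))
```

In a real training loop the final loss value would be recomputed on every iteration and passed to the back-propagation step before the convergence check.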

Optionally, step 102 may be executed before step 101, after step 101, or simultaneously with step 101; the execution order between steps 101 and 102 is not limited in this embodiment.

Step 103, inputting the target image into the vehicle attribute detection model to obtain the attribute detection results output by the n network branches respectively.

Optionally, each network branch outputs a confidence for each attribute classification. Specifically, the first branch outputs a combined confidence in which the m attributes together are treated as one attribute classification; the second branch outputs the confidence of each attribute classification of the specified attribute.

Step 104, determining the attribute information of the target vehicle in the target image by combining the attribute detection results of the n network branches.

Determining the attribute information of the target vehicle in the target image by combining the attribute detection results of the n network branches includes at least the following cases:

Case 1: when the attribute detection result of the first branch indicates that the confidence of the m attributes is less than a first confidence threshold, and the attribute detection result of the second branch indicates that the confidence of the specified attribute is greater than a second confidence threshold, the attribute classification with the highest confidence in the attribute detection result output by the second branch is taken as the attribute information of the target vehicle.

The first confidence threshold and the second confidence threshold may be equal or different, and their values are not limited in this embodiment.

Because the first branch outputs a combined confidence that treats the m attributes as a single attribute classification, the combined confidence drops when any one of the attributes belongs to a newly added attribute classification. The classifications of the specified attribute, however, are generally not extended, so the second branch can still detect the attribute information of the specified attribute and output the corresponding confidence; if the specified attribute is not the attribute with the newly added classification, the new classification does not affect the confidence that the second branch outputs for the specified attribute. In this case, the attribute classification with the highest confidence in the attribute detection result output by the second branch is taken as the attribute information of the target vehicle, which ensures that the vehicle attribute detection model can still output accurate attribute information for the specified attribute.

Case 2: when the attribute detection result of the first branch indicates that the confidence of the m attributes is greater than the first confidence threshold, the attribute classification of the m attributes with the highest confidence output by the first branch is taken as the attribute information of the target vehicle.

When no new classification has been added for any vehicle attribute, the confidence of the attribute classification corresponding to the m vehicle attributes output by the first branch is unaffected; when that confidence is greater than the first confidence threshold, the m vehicle attributes output by the first branch are taken as the attribute information of the vehicle in the target image. Meanwhile, the second branch still outputs the confidence of each attribute classification of the specified attribute. Because the attribute detection result of the first branch contains several kinds of attribute information, whereas the second branch outputs only the attribute information of one specified attribute, directly using the vehicle attributes output by the first branch as the result whenever their confidence exceeds the first confidence threshold yields more accurate and complete vehicle attribute information.
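The two cases above amount to a threshold-based selection between the branch outputs. The sketch below illustrates that selection for n = 2, assuming each branch result is available as a mapping from attribute classification to confidence. The function name, the data layout, the example labels and the threshold values are all hypothetical, and returning `None` when neither case applies is an assumption, because the embodiment does not state what happens in that situation.

```python
from typing import Dict, Optional, Tuple, Union

Label = Union[str, Tuple[str, ...]]


def fuse_branch_results(
    first_branch: Dict[Tuple[str, ...], float],
    second_branch: Dict[str, float],
    first_threshold: float,
    second_threshold: float,
) -> Optional[Tuple[str, Label, float]]:
    """Pick the final attribute information from the two branch outputs.

    first_branch maps a combined classification of the m attributes to its
    combined confidence; second_branch maps each classification of the
    specified attribute to its confidence.
    """
    combo, combo_conf = max(first_branch.items(), key=lambda item: item[1])
    # Case 2: the combined classification is confident enough on its own.
    if combo_conf > first_threshold:
        return ("first", combo, combo_conf)
    label, label_conf = max(second_branch.items(), key=lambda item: item[1])
    # Case 1: fall back to the specified attribute from the second branch.
    if combo_conf < first_threshold and label_conf > second_threshold:
        return ("second", label, label_conf)
    # Neither case applies (not covered by the text; assumed here).
    return None


if __name__ == "__main__":
    first = {("BrandA", "ModelX", "2019"): 0.35, ("BrandB", "ModelY", "2020"): 0.20}
    second = {"BrandA": 0.92, "BrandB": 0.05}
    print(fuse_branch_results(first, second, 0.6, 0.8))
```

With the example values, the combined confidence of 0.35 is below the first threshold, so the large brand from the second branch is returned rather than the full combination.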

In summary, in the vehicle attribute detection method provided in this embodiment, a pre-trained vehicle attribute detection model is obtained, where the vehicle attribute detection model includes n network branches; a first branch of the n network branches is used for simultaneously detecting m attributes of the vehicle, and each second branch of the n network branches, which is different from the first branch, is used for detecting a specified attribute of the m attributes. A target image to be subjected to vehicle attribute detection is input into the vehicle attribute detection model to obtain the attribute detection results respectively output by the n network branches, and the attribute information of the target vehicle in the target image is determined by combining the attribute detection results of the n network branches. This solves the problem that, when a network model trained on a fusion of several attribute labels is used to detect vehicle attributes, a newly added attribute classification prevents the other attributes from being identified. When the first branch cannot accurately detect the m attributes of the vehicle simultaneously, the attribute information of the target vehicle can be determined by combining the attribute detection result of the second branch, which ensures that the vehicle attribute detection model can still output the vehicle attributes for which no classification has been added.

In addition, when the specified attribute is set as the small brand attribute and the small brand attribute is identified from the vehicle face information, the second branch can reliably identify the small brand attribute of the vehicle, because the faces of vehicles of the same small brand do not change greatly. Moreover, the position of the vehicle face can also be output, which enriches the detection result of the model and can improve the accuracy of model training.

To make the vehicle attribute detection method proposed in the present application easier to understand, the method is described below by way of an example. In this example, the model includes two network branches, i.e., n = 2, and the second branch detects the large brand attribute from the vehicle logo information of the vehicle. Referring to fig. 3, the detection process of the vehicle attribute includes at least the following steps:

Step 1, the target image is taken as the input of the vehicle attribute detection model, and the vehicle attribute detection model extracts the feature information of the whole image through a feature extraction layer.

Step 2, the vehicle attribute detection model, through the first branch, obtains the probability (also called confidence) of each attribute classification of the m vehicle attributes from the features extracted by the feature extraction layer, and obtains the attribute detection result of the first branch from these probabilities; the first branch serves as one classification unit.

Step 3, the vehicle attribute detection model, through the second branch, detects the position information of the vehicle logo using multi-scale feature maps based on the features extracted by the feature extraction layer, and at the same time extracts the feature information of the vehicle logo region from those features; it then obtains the probability of each attribute classification of the specified attribute from the feature information of the vehicle logo region, and obtains the attribute detection result of the second branch from these probabilities; the second branch serves as another classification unit.

Steps 2 and 3 are executed in parallel; that is, there is no fixed order between them.

In this embodiment, the dividing manner of the vehicle attribute and the dividing manner of the attribute classification are preset during model training, for example: the manner of dividing the vehicle attributes and the manner of dividing the attribute classifications are shown with reference to fig. 2.

Step 4, the attribute detection result of the first branch and the attribute detection result of the second branch are combined to obtain the attribute information of the target vehicle and the position information of the vehicle logo.
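The data flow of steps 1 to 4 can be sketched as a single function in which both branches consume the same extracted features. All callables here are hypothetical stand-ins for the feature extraction layer, the two branches and the combination rule; the stub values in the example are made up purely to show the flow.

```python
from typing import Any, Callable, Dict, Tuple

Box = Tuple[float, float, float, float]


def detect_vehicle_attributes(
    image: Any,
    extract_features: Callable[[Any], Any],
    first_branch: Callable[[Any], Dict[Any, float]],
    second_branch: Callable[[Any], Tuple[Dict[str, float], Box]],
    combine: Callable[[Dict[Any, float], Dict[str, float]], Any],
) -> Tuple[Any, Box]:
    """Run the two-branch detection flow on one target image."""
    features = extract_features(image)              # step 1: shared features
    first_result = first_branch(features)           # step 2: m attributes jointly
    second_result, logo_box = second_branch(features)  # step 3: logo class and position
    attributes = combine(first_result, second_result)  # step 4: merge the results
    return attributes, logo_box


if __name__ == "__main__":
    attrs, box = detect_vehicle_attributes(
        image="target.jpg",  # placeholder input
        extract_features=lambda img: [0.1, 0.2, 0.3],
        first_branch=lambda f: {("BrandA", "ModelX", "2019"): 0.35},
        second_branch=lambda f: ({"BrandA": 0.92}, (12.0, 30.0, 40.0, 20.0)),
        combine=lambda first, second: max(second, key=second.get),
    )
    print(attrs, box)
```

Steps 2 and 3 are written sequentially here only for simplicity; since they depend only on the shared features, they could equally be evaluated in parallel.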

In this embodiment, when the first branch cannot accurately detect m types of attributes of the vehicle at the same time, the attribute information of the target vehicle can be determined by combining the attribute detection result of the second branch, so that the vehicle attribute detection model can accurately output the vehicle attributes without adding a classification.

Fig. 4 is a block diagram of a vehicle attribute detection device according to an embodiment of the present application. The device at least comprises the following modules: an image acquisition module 410, a model acquisition module 420, an attribute detection module 430, and an attribute determination module 440.

An image obtaining module 410, configured to obtain a target image to be subjected to vehicle attribute detection;

a model obtaining module 420, configured to obtain a pre-trained vehicle attribute detection model, where the vehicle attribute detection model includes n network branches; a first branch of the n network branches is used for simultaneously detecting m attributes of the vehicle, and each second branch of the n branches, which is different from the first branch, is used for detecting a specified attribute of the m attributes; n and m are integers greater than 1;

the attribute detection module 430 is configured to input the target image into the vehicle attribute detection model, so as to obtain attribute detection results output by the n network branches respectively;

an attribute determining module 440, configured to determine attribute information of the target vehicle in the target image according to the attribute detection results of the n network branches.

For relevant details reference is made to the above-described method embodiments.

It should be noted that: in the vehicle attribute detection device provided in the foregoing embodiment, when performing vehicle attribute detection, only the division of the above functional modules is exemplified, and in practical applications, the function distribution may be completed by different functional modules as needed, that is, the internal structure of the vehicle attribute detection device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the vehicle attribute detection device provided by the embodiment and the vehicle attribute detection method embodiment belong to the same concept, and specific implementation processes thereof are detailed in the method embodiment and are not described herein again.

Fig. 5 is a block diagram of an electronic device provided by an embodiment of the application. The device comprises at least a processor 501 and a memory 502.

Processor 501 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 501 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 501 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 501 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, processor 501 may also include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.

Memory 502 may include one or more computer-readable storage media, which may be non-transitory. Memory 502 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in memory 502 is used to store at least one instruction for execution by processor 501 to implement the vehicle attribute detection method provided by the method embodiments herein.

In some embodiments, the electronic device may further include: a peripheral interface and at least one peripheral. The processor 501, memory 502 and peripheral interface may be connected by bus or signal lines. Each peripheral may be connected to the peripheral interface via a bus, signal line, or circuit board. Illustratively, peripheral devices include, but are not limited to: radio frequency circuit, touch display screen, audio circuit, power supply, etc.

Of course, the electronic device may include fewer or more components, which is not limited by the embodiment.

Optionally, the present application further provides a computer-readable storage medium, in which a program is stored, and the program is loaded and executed by a processor to implement the vehicle attribute detection method of the above-mentioned method embodiment.

Optionally, the present application further provides a computer product, which includes a computer-readable storage medium, in which a program is stored, and the program is loaded and executed by a processor to implement the vehicle attribute detection method of the above-mentioned method embodiment.

The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A vehicle property detection method, characterized in that the method comprises:

acquiring a target image to be subjected to vehicle attribute detection;

obtaining a pre-trained vehicle attribute detection model, wherein the vehicle attribute detection model comprises n network branches; a first branch of the n network branches is used for simultaneously detecting m attributes of the vehicle, and each second branch of the n branches, which is different from the first branch, is used for detecting a specified attribute of the m attributes; n and m are integers greater than 1;

determining attribute information of a target vehicle in the target image by combining the attribute detection results of the n network branches;

the determining the attribute information of the target vehicle in the target image by combining the attribute detection results of the n network branches includes:

2. The method of claim 1, wherein the m attributes indicate brand attributes of a vehicle; the m brand attributes are obtained by progressively dividing vehicle brands according to a preset hierarchical division scheme;

3. The method of claim 2,

the brand attribute at the designated level is a large brand attribute, and the second branch identifies the large brand attribute through the vehicle logo information in the target image; the second branch is also used for positioning a car logo position in the target image;

and/or,

4. The method of claim 1, wherein the determining attribute information of a target vehicle in the target image in combination with the attribute detection results of the n network branches comprises:

5. The method of claim 1, wherein the first branch comprises a fully connected layer and a classification layer connected to the fully connected layer, the output of the classification layer being a confidence level for each vehicle attribute.

6. The method according to claim 1, wherein the second branch comprises a convolutional layer, a pooling layer, and a classification subbranch and a positioning subbranch connected to the pooling layer, wherein the output of the classification subbranch is the confidence of the specified attribute, and the output of the positioning subbranch is the position information of the specified attribute in the target image;

7. The method of claim 6,

when the second branch identifies a large brand attribute through the car logo information in the target image, the convolution layer and the pooling layer are used for extracting feature maps of different scales of a car logo area in the target image;

when the second branch identifies a small brand attribute through the vehicle face information in the target image, the convolutional layer and the pooling layer are used for extracting feature maps of different scales of the vehicle face area in the target image.

8. An electronic device, wherein the device comprises a processor and a memory; the memory stores therein a program that is loaded and executed by the processor to implement the vehicle property detection method according to any one of claims 1 to 7.

9. A computer-readable storage medium, characterized in that a program is stored in the storage medium, which when executed by a processor, is configured to implement the vehicle property detection method according to any one of claims 1 to 7.