CN112132033A - Vehicle type recognition method and device, electronic equipment and storage medium

Info

Publication number: CN112132033A
Application number: CN202011009452.8A
Authority: CN (China)
Prior art keywords: feature map, matrix, vehicle, vehicle type, feature
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN112132033B (en)
Inventor: 吴晓东
Current Assignee: Ping An International Smart City Technology Co Ltd
Original Assignee: Ping An International Smart City Technology Co Ltd
Application filed by Ping An International Smart City Technology Co Ltd
Priority to CN202011009452.8A
Publication of CN112132033A; application granted and published as CN112132033B

Classifications

    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/23213 Non-hierarchical clustering techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06F18/253 Fusion techniques of extracted features
    • G06N3/045 Combinations of networks
    • G06V2201/07 Target detection
    • G06V2201/08 Detecting or categorising vehicles
    • Y02T10/40 Engine management systems


Abstract

The invention relates to the technical field of artificial intelligence and provides a vehicle type recognition method and device, an electronic device and a storage medium. The method comprises the following steps: acquiring an image to be recognized; resizing the image to be recognized and inputting the processed image into a Cascade-YOLOv3 model; outputting, through EfficientNet, a vehicle feature matrix corresponding to the processed image; operating on the vehicle feature matrix to obtain a first feature map, a second feature map and a third feature map; and performing detection and recognition on the first, second and third feature maps using a plurality of anchor boxes obtained by pre-clustering, to obtain a first, a second and a third vehicle type recognition result. The invention can be applied to fields such as intelligent traffic that require vehicle type recognition, thereby promoting the development of smart cities.

Description

Vehicle type recognition method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a vehicle type identification method and device, electronic equipment and a storage medium.
Background
With the increasing number of vehicle violations, rapidly locating and identifying vehicles at traffic checkpoints has become an extremely important and challenging task in urban traffic management. In the prior art, vehicle type recognition can be performed by a vehicle type recognition algorithm.
In practice, however, existing vehicle type recognition algorithms perform poorly in complex scenes such as haze, rain and night, and their recognition accuracy is low.
Therefore, how to recognize the vehicle type so as to improve the accuracy of vehicle type recognition is a technical problem to be solved urgently.
Disclosure of Invention
In view of the above, it is desirable to provide a vehicle type recognition method, apparatus, electronic device and storage medium, which can improve the accuracy of vehicle type recognition.
A first aspect of the present invention provides a vehicle type recognition method, including:
acquiring an image to be identified;
performing size processing on the image to be recognized, and inputting the processed image into an improved target detection model Cascade-YOLOv3, wherein the Cascade-YOLOv3 model is a multi-scale serial cascade network structure under multiple intersection-over-union (IOU) thresholds, and the Cascade-YOLOv3 model comprises the composite network EfficientNet;
outputting a vehicle feature matrix corresponding to the processed image through the EfficientNet;
performing a first preset operation on the vehicle feature matrix to obtain a first feature map;
performing the first preset operation and a fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map;
performing the first preset operation and the fusion operation on the vehicle feature matrix based on the second feature map to obtain a third feature map;
and performing detection and identification on the first feature map, the second feature map and the third feature map respectively by using a plurality of anchor frames obtained by pre-clustering, to obtain a first vehicle type identification result corresponding to the first feature map, a second vehicle type identification result corresponding to the second feature map and a third vehicle type identification result corresponding to the third feature map.
In a possible implementation manner, the performing, based on the first feature map, the first preset operation and the fusion operation on the vehicle feature matrix to obtain a second feature map includes:
sequentially carrying out blocking operation and maximum pooling operation on the vehicle feature matrix to obtain a first output matrix;
summing the first feature map and the first output matrix, and performing up-sampling operation on the summed matrix to obtain a first fusion matrix;
and performing the first preset operation on the first fusion matrix to obtain a second feature map.
In one possible implementation manner, the first preset operation includes a convolutional layer operation, a convolutional block operation, and a convolution operation, and the performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map includes:
performing the convolutional layer operation on the vehicle feature matrix to obtain a first intermediate matrix;
sequentially carrying out blocking operation and maximum pooling operation on the first intermediate matrix to obtain a second intermediate matrix;
performing upsampling operation on the first feature map to obtain a first intermediate feature map;
summing the first intermediate feature map and the second intermediate matrix to obtain a second fusion matrix;
and performing the convolution block operation and the convolution operation on the second fusion matrix to obtain a second feature map.
In one possible implementation manner, the first preset operation includes a convolutional layer operation, a convolutional block operation, and a convolution operation, and the performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map includes:
performing the convolution layer operation and the convolution block operation on the vehicle feature matrix to obtain a third intermediate matrix;
sequentially carrying out blocking operation and maximum pooling operation on the third intermediate matrix to obtain a fourth intermediate matrix;
performing upsampling operation on the first feature map to obtain a first intermediate feature map;
summing the first intermediate feature map and the fourth intermediate matrix to obtain a third fusion matrix;
and performing the convolution operation on the third fusion matrix to obtain a second feature map.
In one possible implementation manner, the first preset operation includes a convolutional layer operation, a convolutional block operation, and a convolution operation, and the performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map includes:
performing the convolution layer operation, the convolution block operation and the convolution operation on the vehicle feature matrix to obtain a fifth intermediate matrix;
sequentially performing blocking operation and maximum pooling operation on the fifth intermediate matrix to obtain a sixth intermediate matrix;
performing upsampling operation on the first feature map to obtain a first intermediate feature map;
and summing the first intermediate feature map and the sixth intermediate matrix to obtain a second feature map.
In a possible implementation manner, the performing detection and identification on the first feature map, the second feature map, and the third feature map respectively by using a plurality of anchor frames obtained by pre-clustering to obtain a first vehicle type identification result corresponding to the first feature map, a second vehicle type identification result corresponding to the second feature map, and a third vehicle type identification result corresponding to the third feature map includes:
obtaining a plurality of anchor frames obtained by pre-clustering;
sorting the anchor frames in descending order of area, and dividing the sorted anchor frames equally to obtain a first group of anchor frames matched with the first feature map, a second group of anchor frames matched with the second feature map and a third group of anchor frames matched with the third feature map;
detecting and identifying on the first feature map by using the first group of anchor frames to obtain a first vehicle type identification result carrying vehicle position coordinates and a vehicle category;
detecting and identifying on the second feature map by using the second group of anchor frames to obtain a second vehicle type identification result carrying vehicle position coordinates and a vehicle category;
and detecting and identifying on the third feature map by using the third group of anchor frames to obtain a third vehicle type identification result carrying vehicle position coordinates and a vehicle category.
In one possible implementation manner, the vehicle type identification method further includes:
obtaining a plurality of vehicle image samples;
after the size processing is performed on the plurality of vehicle image samples, an intermediate image is obtained, and the intermediate image is input into the composite network EfficientNet of a preset framework YOLOv3, so that an initial feature matrix is obtained;
training the initial feature matrix to obtain a first training feature map, and sliding a first preset anchor frame over the first training feature map to obtain a first prediction frame; calculating a first prediction intersection-over-union (IOU) based on the first prediction frame and a standard frame, and calculating a first loss between the first prediction frame and the standard frame according to the first prediction IOU and a first preset IOU;
training the initial feature matrix based on the first training feature map to obtain a second training feature map, and sliding a second preset anchor frame over the second training feature map to obtain a second prediction frame; calculating a second prediction IOU based on the second prediction frame and the standard frame, and calculating a second loss between the second prediction frame and the standard frame according to the second prediction IOU and a second preset IOU;
training the initial feature matrix based on the second training feature map to obtain a third training feature map, and sliding a third preset anchor frame over the third training feature map to obtain a third prediction frame; calculating a third prediction IOU based on the third prediction frame and the standard frame, and calculating a third loss between the third prediction frame and the standard frame according to the third prediction IOU and a third preset IOU;
calculating a sum of losses of the first loss, the second loss, and the third loss;
adjusting model parameters of the YOLOv3 to minimize the loss sum, and determining the adjusted YOLOv3 as the improved target detection model Cascade-YOLOv3.
A second aspect of the present invention provides a vehicle type recognition apparatus including:
the acquisition module is used for acquiring an image to be identified;
the processing input module is used for performing size processing on the image to be recognized and inputting the processed image into an improved target detection model Cascade-YOLOv3, wherein the Cascade-YOLOv3 model is a multi-scale serial cascade network structure under multiple intersection-over-union (IOU) thresholds, and the Cascade-YOLOv3 model comprises the composite network EfficientNet;
the acquisition module is further used for outputting a vehicle feature matrix corresponding to the processed image through the EfficientNet;
the execution module is used for performing a first preset operation on the vehicle feature matrix to obtain a first feature map;
the execution module is further configured to perform the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map;
the execution module is further configured to perform the first preset operation and the fusion operation on the vehicle feature matrix based on the second feature map to obtain a third feature map;
and the identification module is used for performing detection and identification on the first feature map, the second feature map and the third feature map respectively by using a plurality of anchor frames obtained by pre-clustering, to obtain a first vehicle type identification result corresponding to the first feature map, a second vehicle type identification result corresponding to the second feature map and a third vehicle type identification result corresponding to the third feature map.
A third aspect of the present invention provides an electronic device comprising a processor and a memory, the processor being configured to implement the vehicle type identification method when executing a computer program stored in the memory.
A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the vehicle type identification method.
According to the technical scheme, the vehicle type identification method can be applied to fields such as intelligent traffic that require vehicle type identification, thereby promoting the development of smart cities. In the invention, the Cascade-YOLOv3 model formed by a multi-scale serial cascade network structure under multiple IOU thresholds can effectively alleviate the sample imbalance problem between simple and complex scenes and significantly improve the vehicle type identification accuracy in complex scenes; meanwhile, the EfficientNet network greatly enhances the effectiveness of vehicle feature extraction, thereby further improving the overall accuracy of vehicle type identification.
Drawings
Fig. 1 is a flowchart of a vehicle type recognition method according to a preferred embodiment of the present invention.
Fig. 2 is a functional block diagram of a preferred embodiment of a vehicle type recognition apparatus disclosed in the present invention.
Fig. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first" and "second" in the description and claims of the present application and the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
The electronic device is a device capable of automatically performing numerical calculation and/or information processing according to a preset or stored instruction, and the hardware thereof includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like. The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network servers. The user device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), or the like.
Referring to fig. 1, fig. 1 is a flowchart illustrating a vehicle type recognition method according to a preferred embodiment of the present invention. The order of the steps in the flowchart may be changed, and some steps may be omitted.
And S11, acquiring the image to be recognized.
The image to be recognized is an image that contains a vehicle whose type needs to be recognized.
S12, performing size processing on the image to be recognized, and inputting the processed image into an improved target detection model Cascade-YOLOv3, wherein the Cascade-YOLOv3 model is a multi-scale serial cascade network structure under multiple intersection-over-union (IOU) thresholds, and the Cascade-YOLOv3 model comprises the composite network EfficientNet.
The size processing, i.e., a resize operation, scales the image to be recognized to a fixed size of 512 × 512.
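By way of illustration, this step can be sketched as follows; OpenCV is assumed, and the scaling of pixel values to [0, 1] is an added, common assumption, since the description only specifies the 512 × 512 resize:

```python
# A minimal sketch of the size processing, assuming OpenCV (cv2).
import cv2
import numpy as np

def preprocess(image_path: str) -> np.ndarray:
    image = cv2.imread(image_path)            # H x W x 3, uint8
    resized = cv2.resize(image, (512, 512))   # fixed network input size
    # Assumed normalization; the description fixes only the resize.
    return resized.astype(np.float32) / 255.0
```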
The improved target detection algorithm is a Cascade-YOLOv3 model obtained by improving the traditional YOLOv3 algorithm. Compared with the traditional YOLOv3, the Cascade-YOLOv3 model replaces the original multi-scale parallel network structure under a single intersection-over-union (IOU) threshold with a multi-scale serial cascade network structure under multiple IOU thresholds (for example, 3 IOU thresholds), which can effectively alleviate the sample imbalance problem between simple and complex scenes, significantly improving the vehicle type recognition accuracy in complex scenes and the overall recognition accuracy. In addition, the original DarkNet53 backbone is replaced by the better-performing EfficientNet network, which greatly enhances the effectiveness of vehicle feature extraction and further improves the overall accuracy of vehicle type recognition.
And S13, outputting a vehicle feature matrix corresponding to the processed image through the EfficientNet.
And S14, performing a first preset operation on the vehicle feature matrix to obtain a first feature map.
The first preset operation comprises a convolutional layer operation, a convolutional block operation and a convolution operation, wherein the convolutional layer operation comprises five convolution layers, one normalization layer and one activation layer, the convolutional block operation comprises one convolution layer, one normalization layer and one activation layer, and the convolution operation comprises one convolution layer.
Specifically, the first preset operation sequentially comprises three operation steps: conv_layer (5-layer convolution + 1-layer normalization + 1-layer activation), conv_block (1-layer convolution + 1-layer normalization + 1-layer activation), and conv (1-layer convolution).
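By way of illustration, these three steps can be sketched in PyTorch as below; only the layer counts come from the description, while the channel counts, kernel sizes and LeakyReLU activation are assumptions:

```python
# A sketch of the first preset operation (conv_layer -> conv_block -> conv).
import torch.nn as nn

def conv_layer(ch: int) -> nn.Sequential:
    # 5 convolutions, then 1 normalization and 1 activation, per the text.
    convs = [nn.Conv2d(ch, ch, 3, padding=1) for _ in range(5)]
    return nn.Sequential(*convs, nn.BatchNorm2d(ch), nn.LeakyReLU(0.1))

def conv_block(ch: int) -> nn.Sequential:
    # 1 convolution + 1 normalization + 1 activation.
    return nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1),
                         nn.BatchNorm2d(ch), nn.LeakyReLU(0.1))

def first_preset_operation(ch: int) -> nn.Sequential:
    # The three steps executed sequentially; conv is a single convolution.
    return nn.Sequential(conv_layer(ch), conv_block(ch),
                         nn.Conv2d(ch, ch, 1))
```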
And S15, performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map.
Wherein the second feature map fuses the features of the first feature map.
Specifically, the performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map includes:
sequentially carrying out blocking operation and maximum pooling operation on the vehicle feature matrix to obtain a first output matrix;
summing the first feature map and the first output matrix, and performing up-sampling operation on the summed matrix to obtain a first fusion matrix;
and performing the first preset operation on the first fusion matrix to obtain a second feature map.
The blocking operation (i.e., a block operation) divides the matrix into blocks, and the maximum pooling operation (i.e., a max_pooling operation) max-pools each of the resulting sub-matrices. The first feature map and the first output matrix are summed, and the summed matrix is upsampled (i.e., an upsampling operation) so that the scale of the resulting second feature map is twice that of the first feature map; the first feature map and the second feature map thus have different prediction scales, realizing multi-scale prediction.
The fusion operation includes a blocking operation, a max-pooling operation, a summing operation, an upsampling operation, and the like.
In this alternative embodiment, the upsampling operation may occur after the summing operation, and the fusion operation may occur before the first preset operation; that is, after the first fusion matrix is obtained, the first preset operation is performed on it to obtain the second feature map.
For example, assuming that the size of the vehicle feature matrix output by the EfficientNet network is 4 × 128 × 64 and the size of the first feature map is 4 × 8 × 64, the blocking operation divides the 128-sized dimension of the 4 × 128 × 64 matrix into 8 blocks of 16, i.e., 8 × 64 sub-matrices of size 4 × 16; the max_pooling operation then takes the maximum over each block to obtain a first output matrix of size 4 × 8 × 64, which is summed with the first feature map and upsampled to obtain the first fusion matrix.
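The dimension arithmetic of this example can be checked with a short sketch; which axis is blocked is an assumption read off from the sizes above:

```python
# A sketch of the fusion on the worked example: block the 128-sized axis
# into 8 blocks of 16, max-pool within each block, sum with the first
# feature map, then upsample.
import torch
import torch.nn.functional as F

feature_matrix = torch.randn(4, 128, 64)    # vehicle feature matrix
first_feature_map = torch.randn(4, 8, 64)   # first feature map

first_output = feature_matrix.view(4, 8, 16, 64).amax(dim=2)  # 4 x 8 x 64
summed = first_feature_map + first_output                     # 4 x 8 x 64

# Upsampling doubles the scale so the second feature map is finer.
first_fusion = F.interpolate(summed.unsqueeze(0), scale_factor=2,
                             mode="nearest").squeeze(0)
print(first_fusion.shape)  # torch.Size([4, 16, 128])
```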
Specifically, the first preset operation includes a convolutional layer operation, a convolutional block operation, and a convolutional operation, and the performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map includes:
performing the convolutional layer operation on the vehicle feature matrix to obtain a first intermediate matrix; sequentially carrying out blocking operation and maximum pooling operation on the first intermediate matrix to obtain a second intermediate matrix; performing upsampling operation on the first feature map to obtain a first intermediate feature map; summing the first intermediate feature map and the second intermediate matrix to obtain a second fusion matrix; performing the convolution block operation and the convolution operation on the second fusion matrix to obtain a second feature map; or
performing the convolution layer operation and the convolution block operation on the vehicle feature matrix to obtain a third intermediate matrix; sequentially carrying out blocking operation and maximum pooling operation on the third intermediate matrix to obtain a fourth intermediate matrix; performing upsampling operation on the first feature map to obtain a first intermediate feature map; summing the first intermediate feature map and the fourth intermediate matrix to obtain a third fusion matrix; performing the convolution operation on the third fusion matrix to obtain a second feature map; or
performing the convolution layer operation, the convolution block operation and the convolution operation on the vehicle feature matrix to obtain a fifth intermediate matrix; sequentially performing blocking operation and maximum pooling operation on the fifth intermediate matrix to obtain a sixth intermediate matrix; performing upsampling operation on the first feature map to obtain a first intermediate feature map; and summing the first intermediate feature map and the sixth intermediate matrix to obtain a second feature map.
In this alternative embodiment, the fusion operation (i.e., the blocking operation, the maximum pooling operation, and the upsampling operation) may be inserted in the middle of the first preset operation (i.e., after conv_layer is executed, or after conv_layer and conv_block are executed) to obtain the second feature map; alternatively, the fusion operation may be inserted after all steps of the first preset operation (i.e., conv_layer, conv_block and conv) have been executed. The upsampling operation may occur before the summing operation.
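The three placements can be summarized in a short sketch; `fuse` stands for the blocking + max-pooling + upsampling + summing routine above, and all names are illustrative:

```python
# A sketch of the three placement variants of the fusion within the first
# preset operation; layer, block, conv and fuse are callables standing for
# conv_layer, conv_block, conv and the fusion routine.
def second_feature_map(x, first_map, variant, layer, block, conv, fuse):
    if variant == 1:                 # fuse after conv_layer
        return conv(block(fuse(layer(x), first_map)))
    if variant == 2:                 # fuse after conv_layer + conv_block
        return conv(fuse(block(layer(x)), first_map))
    return fuse(conv(block(layer(x))), first_map)  # fuse after all three
```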
And S16, based on the second feature map, performing the first preset operation and the fusion operation on the vehicle feature matrix to obtain a third feature map.
The first feature map, the second feature map and the third feature map belong to three feature maps with different scales, the three feature maps are in a serial structure as a whole, and the third feature map fuses the features of the second feature map.
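This serial relation can be sketched as follows; the function names are illustrative:

```python
# A sketch of the serial chain of the three feature maps: each later map
# is produced from the vehicle feature matrix while fusing the previous
# map, so the three scales form a series rather than parallel branches.
def three_feature_maps(x, preset, fused_preset):
    # preset: the first preset operation; fused_preset(x, prev): the first
    # preset operation combined with the fusion of the previous map.
    first = preset(x)
    second = fused_preset(x, first)    # fuses features of the first map
    third = fused_preset(x, second)    # fuses features of the second map
    return first, second, third
```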
And S17, performing detection and identification on the first feature map, the second feature map and the third feature map respectively by using a plurality of anchor frames obtained by pre-clustering, to obtain a first vehicle type identification result corresponding to the first feature map, a second vehicle type identification result corresponding to the second feature map and a third vehicle type identification result corresponding to the third feature map.
Here, 9 anchor boxes obtained in advance by clustering with a k-means algorithm can be used.
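As an illustrative sketch of the pre-clustering (assuming scikit-learn and an array of labelled box width/height pairs; the description names k-means without further detail):

```python
# A sketch of the anchor pre-clustering with plain k-means.
import numpy as np
from sklearn.cluster import KMeans

def cluster_anchors(box_wh: np.ndarray, k: int = 9) -> np.ndarray:
    """box_wh: (N, 2) array of ground-truth box widths and heights."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(box_wh)
    return km.cluster_centers_      # k anchor (width, height) pairs
```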
The vehicle type recognition result may include the coordinates of 3 different anchor boxes (i.e., vehicle position coordinates) and a vehicle category (i.e., one of 20 vehicle types, such as car, bus, truck, and the like).
Specifically, the using of the plurality of anchor frames obtained by pre-clustering to respectively detect and identify the first feature map, the second feature map, and the third feature map, and obtaining the first vehicle type identification result corresponding to the first feature map, the second vehicle type identification result corresponding to the second feature map, and the third vehicle type identification result corresponding to the third feature map includes:
obtaining a plurality of anchor frames obtained by pre-clustering;
sorting the anchor frames in descending order of area, and dividing the sorted anchor frames equally to obtain a first group of anchor frames matched with the first feature map, a second group of anchor frames matched with the second feature map and a third group of anchor frames matched with the third feature map;
detecting and identifying on the first feature map by using the first group of anchor frames to obtain a first vehicle type identification result carrying vehicle position coordinates and a vehicle category;
detecting and identifying on the second feature map by using the second group of anchor frames to obtain a second vehicle type identification result carrying vehicle position coordinates and a vehicle category;
and detecting and identifying on the third feature map by using the third group of anchor frames to obtain a third vehicle type identification result carrying vehicle position coordinates and a vehicle category.
In this optional embodiment, after the plurality of anchor frames are obtained, they may be sorted in descending order of area and divided equally into three groups. The scale of the first feature map is smaller than that of the second feature map, and the scale of the second feature map is smaller than that of the third feature map. When matching anchor frames to feature maps, the feature map with the smaller scale is matched with the anchor frames with the larger areas, so that large-area anchor boxes are predicted on small-scale feature maps and small-area anchor boxes are predicted on large-scale feature maps.
Thus, the first feature map matches a first set of anchor boxes, the second feature map matches a second set of anchor boxes, and the third feature map matches a third set of anchor boxes. Then, detection and identification can be carried out, and the coordinates (namely, the vehicle position) and the vehicle type of 3 different anchor boxes are respectively predicted on each feature map.
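The grouping and matching logic can be sketched as follows; the groups of 3 follow from the 9 anchors being divided equally, and all names are illustrative:

```python
# A sketch of the anchor grouping: sort by area, largest first, split into
# three equal groups; the largest anchors go to the first (smallest-scale)
# feature map, per the matching rule above.
import numpy as np

def group_anchors(anchors: np.ndarray):
    order = np.argsort(-anchors.prod(axis=1))   # descending area
    a = anchors[order]
    return a[0:3], a[3:6], a[6:9]   # for first, second, third feature map
```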
Because the Cascade-YOLOv3 model is a multi-scale serial cascade network structure, the obtained vehicle type recognition results span multiple different scales, which can meet the multi-scale requirements of different complex scenes and improve the accuracy of vehicle type recognition.
S18, mapping the first vehicle type recognition result, the second vehicle type recognition result and the third vehicle type recognition result to the image to be recognized.
The first vehicle type recognition result, the second vehicle type recognition result and the third vehicle type recognition result are mapped to the image to be recognized, and the vehicle types of the image to be recognized can be seen more intuitively.
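A minimal sketch of this mapping step (assuming OpenCV, boxes predicted in (x1, y1, x2, y2) form on the 512 × 512 network input, and a label string per box; all names are illustrative):

```python
# A sketch of mapping recognition results back onto the original image.
import cv2

def draw_results(original, boxes, labels, input_size=512):
    h, w = original.shape[:2]
    sx, sy = w / input_size, h / input_size   # undo the 512 x 512 resize
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        p1 = (int(x1 * sx), int(y1 * sy))
        p2 = (int(x2 * sx), int(y2 * sy))
        cv2.rectangle(original, p1, p2, (0, 255, 0), 2)
        cv2.putText(original, label, (p1[0], p1[1] - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    return original
```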
Optionally, the method further includes:
obtaining a plurality of vehicle image samples;
after the size processing is performed on the plurality of vehicle image samples, an intermediate image is obtained, and the intermediate image is input into the composite network EfficientNet of a preset framework YOLOv3, so that an initial feature matrix is obtained;
training the initial feature matrix to obtain a first training feature map, and sliding a first preset anchor frame over the first training feature map to obtain a first prediction frame; calculating a first prediction intersection-over-union (IOU) based on the first prediction frame and a standard frame, and calculating a first loss between the first prediction frame and the standard frame according to the first prediction IOU and a first preset IOU;
training the initial feature matrix based on the first training feature map to obtain a second training feature map, and sliding a second preset anchor frame over the second training feature map to obtain a second prediction frame; calculating a second prediction IOU based on the second prediction frame and the standard frame, and calculating a second loss between the second prediction frame and the standard frame according to the second prediction IOU and a second preset IOU;
training the initial feature matrix based on the second training feature map to obtain a third training feature map, and sliding a third preset anchor frame over the third training feature map to obtain a third prediction frame; calculating a third prediction IOU based on the third prediction frame and the standard frame, and calculating a third loss between the third prediction frame and the standard frame according to the third prediction IOU and a third preset IOU;
calculating a sum of losses of the first loss, the second loss, and the third loss;
adjusting model parameters of the YOLOv3 to minimize the loss sum, and determining the adjusted YOLOv3 as the improved target detection model Cascade-YOLOv3.
The training of the initial feature matrix mainly comprises sequentially executing the three operation steps conv_layer (5-layer convolution + 1-layer normalization + 1-layer activation), conv_block (1-layer convolution + 1-layer normalization + 1-layer activation) and conv (1-layer convolution) on the initial feature matrix. When the second training feature map is obtained, the first training feature map needs to be fused in, and when the third training feature map is obtained, the second training feature map needs to be fused in. For the specific fusion process, reference may be made to the above description.
The first preset anchor frame corresponds to the first group of anchor frames, the second preset anchor frame corresponds to the second group of anchor frames, and the third preset anchor frame corresponds to the third group of anchor frames.
The first preset IOU, the second preset IOU and the third preset IOU are empirical values obtained in advance through repeated tests; for example, the first preset IOU is 0.5, the second preset IOU is 0.6, and the third preset IOU is 0.7.
Compared with the first training feature map, the second training feature map and the third training feature map each mainly add one fusion operation (i.e., the previous training feature map is fused in) and adopt a different preset IOU (i.e., the first preset IOU, the second preset IOU and the third preset IOU differ). Different preset IOUs produce different proportions of positive and negative samples when the loss is calculated, so the trained model generalizes better and is less prone to overfitting, which improves the performance of the model.
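The role of the three preset IOUs in forming positive and negative samples can be illustrated with a simplified sketch; a binary cross-entropy objectness loss stands in for the full YOLOv3 loss, and all signatures are assumptions:

```python
# A simplified sketch of the three-stage loss: each stage labels a
# prediction positive only if its IOU with the standard (ground-truth)
# box clears that stage's preset threshold; the three losses are summed.
import torch
import torch.nn.functional as F

def box_iou(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Elementwise IOU of two (N, 4) box tensors in (x1, y1, x2, y2) form."""
    lt = torch.max(a[:, :2], b[:, :2])
    rb = torch.min(a[:, 2:], b[:, 2:])
    inter = (rb - lt).clamp(min=0).prod(dim=1)
    area = (a[:, 2:] - a[:, :2]).prod(dim=1) + (b[:, 2:] - b[:, :2]).prod(dim=1)
    return inter / (area - inter + 1e-9)

def stage_loss(pred_boxes, gt_boxes, conf, preset_iou):
    # conf: predicted objectness in [0, 1]; BCE stands in for the full loss.
    target = (box_iou(pred_boxes, gt_boxes) >= preset_iou).float()
    return F.binary_cross_entropy(conf, target)

def total_loss(stages):
    # stages: iterable of (pred_boxes, gt_boxes, conf, preset_iou), with
    # preset IOUs such as 0.5, 0.6, 0.7 for the three stages.
    return sum(stage_loss(*s) for s in stages)
```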
In addition, compared with the traditional YOLOv3 algorithm, the Cascade-YOLOv3 model has an essentially equivalent processing speed, but the model size is reduced by half, making actual deployment more convenient.
In the method flow described in fig. 1, the Cascade-YOLOv3 model formed by a multi-scale serial cascade network structure under multiple IOU thresholds can effectively alleviate the sample imbalance problem between simple and complex scenes and significantly improve the accuracy of vehicle type recognition in complex scenes; meanwhile, the EfficientNet network greatly enhances the effectiveness of vehicle feature extraction, thereby further improving the overall accuracy of vehicle type recognition.
According to the embodiment, the method and the device can be applied to the fields of intelligent traffic and the like which need vehicle type identification, and therefore development of intelligent cities is promoted.
The above description is only a specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and it will be apparent to those skilled in the art that modifications may be made without departing from the inventive concept of the present invention, and these modifications are within the scope of the present invention.
Referring to fig. 2, fig. 2 is a functional block diagram of a vehicle type recognition apparatus according to a preferred embodiment of the present invention.
In some embodiments, the vehicle type recognition apparatus runs in an electronic device. The vehicle type recognition apparatus may include a plurality of functional modules composed of program code segments. The program code of these segments may be stored in the memory and executed by the at least one processor to perform some or all of the steps of the vehicle type recognition method described in fig. 1; refer to the related description of fig. 1, which is not repeated here.
In this embodiment, the vehicle type recognition apparatus may be divided into a plurality of functional modules according to the functions performed by the vehicle type recognition apparatus. The functional module may include: the system comprises an acquisition module 201, a processing input module 202, an execution module 203, a recognition module 204 and a mapping module 205. The module referred to herein is a series of computer program segments capable of being executed by at least one processor and capable of performing a fixed function and is stored in memory.
An obtaining module 201, configured to obtain an image to be identified;
a processing input module 202, configured to perform size processing on the image to be recognized and input the processed image into an improved target detection model Cascade-YOLOv3, where the Cascade-YOLOv3 model is a multi-scale serial cascade network structure under multiple intersection-over-union (IOU) thresholds, and the Cascade-YOLOv3 model includes the composite network EfficientNet;
the obtaining module 201 is further configured to output a vehicle feature matrix corresponding to the processed image through the EfficientNet;
the execution module 203 is configured to perform a first preset operation on the vehicle feature matrix to obtain a first feature map;
the execution module 203 is further configured to perform the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map;
the executing module 203 is further configured to perform the first preset operation and the fusion operation on the vehicle feature matrix based on the second feature map to obtain a third feature map;
the identification module 204 is configured to use a plurality of anchor frames obtained by pre-clustering to respectively detect and identify the first feature map, the second feature map, and the third feature map, so as to obtain a first vehicle type identification result corresponding to the first feature map, a second vehicle type identification result corresponding to the second feature map, and a third vehicle type identification result corresponding to the third feature map;
a mapping module 205, configured to map the first vehicle type identification result, the second vehicle type identification result, and the third vehicle type identification result onto the image to be identified.
In the vehicle type recognition device described in fig. 2, the Cascade-YOLOv3 model formed by a multi-scale serial cascade network structure under multiple IOU thresholds can effectively alleviate the sample imbalance problem between simple and complex scenes and significantly improve the vehicle type recognition accuracy in complex scenes; meanwhile, the EfficientNet network greatly enhances the effectiveness of vehicle feature extraction, further improving the overall accuracy of vehicle type recognition.
As shown in fig. 3, fig. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the method for recognizing a vehicle type of the present invention. The electronic device 3 comprises a memory 31, at least one processor 32, a computer program 33 stored in the memory 31 and executable on the at least one processor 32, and at least one communication bus 34.
Those skilled in the art will appreciate that the schematic diagram shown in fig. 3 is merely an example of the electronic device 3, and does not constitute a limitation of the electronic device 3, and may include more or less components than those shown, or combine some components, or different components, for example, the electronic device 3 may further include an input/output device, a network access device, and the like.
The at least one Processor 32 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. The processor 32 may be a microprocessor or the processor 32 may be any conventional processor or the like, and the processor 32 is a control center of the electronic device 3 and connects various parts of the whole electronic device 3 by various interfaces and lines.
The memory 31 may be used to store the computer program 33 and/or the module/unit, and the processor 32 may implement various functions of the electronic device 3 by running or executing the computer program and/or the module/unit stored in the memory 31 and calling data stored in the memory 31. The memory 31 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data) created according to the use of the electronic device 3, and the like. In addition, the memory 31 may include non-volatile and volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other storage devices.
With reference to fig. 1, the memory 31 in the electronic device 3 stores a plurality of instructions to implement a vehicle type recognition method, and the processor 32 can execute the plurality of instructions to implement:
acquiring an image to be identified;
performing size processing on the image to be recognized, and inputting the processed image into an improved target detection model Cascade-YOLOv3, wherein the Cascade-YOLOv3 model is a multi-scale serial cascade network structure under multiple intersection-over-union (IOU) thresholds, and the Cascade-YOLOv3 model comprises the composite network EfficientNet;
outputting a vehicle feature matrix corresponding to the processed image through the EfficientNet;
performing a first preset operation on the vehicle feature matrix to obtain a first feature map;
performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map;
performing the first preset operation and the fusion operation on the vehicle feature matrix based on the second feature map to obtain a third feature map;
performing detection and identification on the first feature map, the second feature map and the third feature map respectively by using a plurality of anchor frames obtained by pre-clustering, to obtain a first vehicle type identification result corresponding to the first feature map, a second vehicle type identification result corresponding to the second feature map and a third vehicle type identification result corresponding to the third feature map;
and mapping the first vehicle type identification result, the second vehicle type identification result and the third vehicle type identification result to the image to be identified.
Specifically, the processor 32 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
In the electronic device 3 described in fig. 3, the Cascade-YOLOv3 model formed by a multi-scale serial cascade network structure under multiple IOU thresholds can effectively alleviate the sample imbalance problem between simple and complex scenes and significantly improve the accuracy of vehicle type recognition in complex scenes; meanwhile, the EfficientNet network greatly enhances the effectiveness of vehicle feature extraction, thereby further improving the overall accuracy of vehicle type recognition.
The integrated modules/units of the electronic device 3 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying said computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM).
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned. The units or means recited in the system claims may also be implemented by software or hardware.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A vehicle type recognition method is characterized by comprising the following steps:
acquiring an image to be identified;
performing size processing on the image to be recognized, and inputting the processed image into an improved target detection model Cascade-YOLOv3, wherein the Cascade-YOLOv3 model is a multi-scale serial cascade network structure under multiple intersection-over-union (IOU) thresholds, and the Cascade-YOLOv3 model comprises the composite network EfficientNet;
outputting a vehicle feature matrix corresponding to the processed image through the EfficientNet;
performing a first preset operation on the vehicle feature matrix to obtain a first feature map;
performing the first preset operation and a fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map;
performing the first preset operation and the fusion operation on the vehicle feature matrix based on the second feature map to obtain a third feature map;
and performing detection and identification on the first feature map, the second feature map and the third feature map respectively by using a plurality of anchor frames obtained by pre-clustering, to obtain a first vehicle type identification result corresponding to the first feature map, a second vehicle type identification result corresponding to the second feature map and a third vehicle type identification result corresponding to the third feature map.
2. The vehicle type identification method according to claim 1, wherein the performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map comprises:
sequentially carrying out blocking operation and maximum pooling operation on the vehicle feature matrix to obtain a first output matrix;
summing the first feature map and the first output matrix, and performing up-sampling operation on the summed matrix to obtain a first fusion matrix;
and performing the first preset operation on the first fusion matrix to obtain a second feature map.
3. The vehicle type identification method according to claim 1, wherein the first preset operation comprises a convolutional layer operation, a convolutional block operation and a convolutional operation, and the performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map comprises:
performing the convolutional layer operation on the vehicle feature matrix to obtain a first intermediate matrix;
sequentially carrying out blocking operation and maximum pooling operation on the first intermediate matrix to obtain a second intermediate matrix;
performing upsampling operation on the first feature map to obtain a first intermediate feature map;
summing the first intermediate feature map and the second intermediate matrix to obtain a second fusion matrix;
and performing the convolution block operation and the convolution operation on the second fusion matrix to obtain a second feature map.
4. The vehicle type recognition method according to claim 1, wherein the first preset operation comprises a convolution layer operation, a convolution block operation and a convolution operation, and the performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map comprises:
performing the convolution layer operation and the convolution block operation on the vehicle feature matrix to obtain a third intermediate matrix;
sequentially performing a blocking operation and a max-pooling operation on the third intermediate matrix to obtain a fourth intermediate matrix;
performing an up-sampling operation on the first feature map to obtain a first intermediate feature map;
summing the first intermediate feature map and the fourth intermediate matrix to obtain a third fusion matrix;
and performing the convolution operation on the third fusion matrix to obtain the second feature map.
5. The vehicle type recognition method according to claim 1, wherein the first preset operation comprises a convolution layer operation, a convolution block operation and a convolution operation, and the performing the first preset operation and the fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map comprises:
performing the convolution layer operation, the convolution block operation and the convolution operation on the vehicle feature matrix to obtain a fifth intermediate matrix;
sequentially performing a blocking operation and a max-pooling operation on the fifth intermediate matrix to obtain a sixth intermediate matrix;
performing an up-sampling operation on the first feature map to obtain a first intermediate feature map;
and summing the first intermediate feature map and the sixth intermediate matrix to obtain the second feature map.
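Claims 3 to 5 split the first preset operation into three stages (a convolution layer operation, a convolution block operation, a convolution operation) and differ only in how many stages run before the fusion and how many after. A hedged sketch, with each stage modelled as a single 3x3 convolution and the blocking operation folded into the preceding stage:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Three stand-in stages: conv layer op, conv block op, conv op.
stages = nn.ModuleList([nn.Conv2d(128, 128, 3, padding=1) for _ in range(3)])

def second_feature_map(feat, fmap1, split):
    """split=1 -> claim 3, split=2 -> claim 4, split=3 -> claim 5."""
    x = feat
    for s in stages[:split]:                      # stages applied before the fusion
        x = s(x)
    x = F.max_pool2d(x, 2)                        # blocking + max pooling
    up = F.interpolate(fmap1, size=x.shape[-2:])  # up-sampled first feature map
    x = up + x                                    # fusion by element-wise sum
    for s in stages[split:]:                      # remaining stages after the fusion
        x = s(x)
    return x

feat = torch.randn(1, 128, 64, 64)
fmap1 = torch.randn(1, 128, 16, 16)
print(tuple(second_feature_map(feat, fmap1, split=2).shape))  # (1, 128, 32, 32)
```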
6. The vehicle type recognition method according to claim 1, wherein the performing detection and recognition on the first feature map, the second feature map and the third feature map respectively by using a plurality of anchor frames obtained by pre-clustering, to obtain a first vehicle type recognition result corresponding to the first feature map, a second vehicle type recognition result corresponding to the second feature map and a third vehicle type recognition result corresponding to the third feature map, comprises:
acquiring a plurality of anchor frames obtained by pre-clustering;
sorting the anchor frames in descending order of area, and dividing the sorted anchor frames into equal groups to obtain a first group of anchor frames matched with the first feature map, a second group of anchor frames matched with the second feature map and a third group of anchor frames matched with the third feature map;
performing detection and recognition on the first feature map by using the first group of anchor frames to obtain a first vehicle type recognition result carrying vehicle position coordinates and a vehicle category;
performing detection and recognition on the second feature map by using the second group of anchor frames to obtain a second vehicle type recognition result carrying vehicle position coordinates and a vehicle category;
and performing detection and recognition on the third feature map by using the third group of anchor frames to obtain a third vehicle type recognition result carrying vehicle position coordinates and a vehicle category.
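The anchor handling in claim 6 reduces to sorting the pre-clustered anchors by area and splitting them into equal groups, the largest anchors going to the coarsest feature map. A sketch under stated assumptions: k-means over labelled box sizes is one common way to realise the pre-clustering (this publication's classification codes mention K-means clustering; YOLOv3 itself clusters with a 1-IOU distance rather than the Euclidean distance used here), and the box data below is synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
boxes_wh = rng.uniform(10, 300, size=(500, 2))  # (w, h) of labelled boxes (synthetic)

# Pre-clustering: 9 anchors, i.e. 3 per feature map.
anchors = KMeans(n_clusters=9, n_init=10, random_state=0).fit(boxes_wh).cluster_centers_

# Sort by area in descending order, then split into three equal groups.
anchors = anchors[np.argsort(-anchors.prod(axis=1))]
group1, group2, group3 = np.split(anchors, 3)  # matched to the 1st/2nd/3rd feature map
print(group1, group2, group3, sep="\n")
```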
7. The vehicle type recognition method according to claim 1, characterized in that the vehicle type recognition method further comprises:
obtaining a plurality of vehicle image samples;
performing size processing on the plurality of vehicle image samples to obtain intermediate images, and inputting the intermediate images into the composite network EfficientNet of a preset YOLOv3 framework to obtain an initial feature matrix;
training the initial feature matrix to obtain a first training feature map, and sliding a first preset anchor frame over the first training feature map to obtain a first prediction frame; calculating a first predicted intersection-over-union (IOU) based on the first prediction frame and a standard frame, and calculating a first loss between the first prediction frame and the standard frame according to the first predicted IOU and a first preset IOU;
training the initial feature matrix based on the first training feature map to obtain a second training feature map, and sliding a second preset anchor frame over the second training feature map to obtain a second prediction frame; calculating a second predicted IOU based on the second prediction frame and the standard frame, and calculating a second loss between the second prediction frame and the standard frame according to the second predicted IOU and a second preset IOU;
training the initial feature matrix based on the second training feature map to obtain a third training feature map, and sliding a third preset anchor frame over the third training feature map to obtain a third prediction frame; calculating a third predicted IOU based on the third prediction frame and the standard frame, and calculating a third loss between the third prediction frame and the standard frame according to the third predicted IOU and a third preset IOU;
calculating the loss sum of the first loss, the second loss and the third loss;
and adjusting model parameters of the YOLOv3 to minimize the loss sum, and determining the adjusted YOLOv3 as the improved target detection algorithm Cascade-YOLOv3 model.
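The training objective of claim 7 is the sum of three per-scale losses, each tied to its own preset IOU threshold. The sketch below is illustrative only: the rising thresholds (0.5/0.6/0.7) follow the usual cascade convention, and the hinge-style loss that penalises a predicted IOU falling short of its preset IOU is an assumption, since the claim does not fix the loss form.

```python
import torch

def iou(a, b):
    """IOU of two boxes in (x1, y1, x2, y2) form."""
    x1, y1 = torch.max(a[0], b[0]), torch.max(a[1], b[1])
    x2, y2 = torch.min(a[2], b[2]), torch.min(a[3], b[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

gt = torch.tensor([50.0, 50.0, 150.0, 150.0])  # standard frame (ground truth)
preds = [torch.tensor([55.0, 48.0, 160.0, 145.0], requires_grad=True),  # first prediction frame
         torch.tensor([20.0, 70.0, 120.0, 160.0], requires_grad=True),  # second prediction frame
         torch.tensor([60.0, 60.0, 140.0, 140.0], requires_grad=True)]  # third prediction frame
preset_ious = [0.5, 0.6, 0.7]  # first/second/third preset IOU (cascade convention, assumed)

# Per-scale loss: shortfall of the predicted IOU against the preset IOU.
losses = [(t - iou(p, gt)).clamp(min=0) for p, t in zip(preds, preset_ious)]
total = sum(losses)  # loss sum over the three scales
total.backward()     # gradient step would adjust the model parameters
print(total.item())
```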
8. A vehicle type recognition apparatus characterized by comprising:
the acquisition module is used for acquiring an image to be recognized;
the processing input module is used for performing size processing on the image to be recognized and inputting the processed image into an improved target detection algorithm Cascade-YOLOv3 model, wherein the Cascade-YOLOv3 model is a multi-scale serial cascade network structure under multiple intersection-over-union (IOU) thresholds, and the Cascade-YOLOv3 model comprises a composite network EfficientNet;
the acquisition module is further used for outputting, through the EfficientNet, a vehicle feature matrix corresponding to the processed image;
the execution module is used for performing a first preset operation on the vehicle feature matrix to obtain a first feature map;
the execution module is further used for performing the first preset operation and a fusion operation on the vehicle feature matrix based on the first feature map to obtain a second feature map;
the execution module is further used for performing the first preset operation and the fusion operation on the vehicle feature matrix based on the second feature map to obtain a third feature map;
and the recognition module is used for performing detection and recognition on the first feature map, the second feature map and the third feature map respectively by using a plurality of anchor frames obtained by pre-clustering, to obtain a first vehicle type recognition result corresponding to the first feature map, a second vehicle type recognition result corresponding to the second feature map and a third vehicle type recognition result corresponding to the third feature map.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the vehicle type recognition method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing at least one instruction which, when executed by a processor, implements the vehicle type recognition method according to any one of claims 1 to 7.
CN202011009452.8A 2020-09-23 2020-09-23 Vehicle type recognition method and device, electronic equipment and storage medium Active CN112132033B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011009452.8A CN112132033B (en) 2020-09-23 2020-09-23 Vehicle type recognition method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112132033A 2020-12-25
CN112132033B 2023-10-10

Family

ID=73841190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011009452.8A Active CN112132033B (en) 2020-09-23 2020-09-23 Vehicle type recognition method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112132033B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401148A * 2020-02-27 2020-07-10 Jiangsu University Road multi-target detection method based on improved multilevel YOLOv3
CN111553212A * 2020-04-16 2020-08-18 Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences Remote sensing image target detection method based on smooth frame regression function
CN111680556A * 2020-04-29 2020-09-18 Ping An International Smart City Technology Co Ltd Method, device and equipment for identifying vehicle type at traffic gate and storage medium
CN111695448A * 2020-05-27 2020-09-22 Southeast University Roadside vehicle identification method based on visual sensor
CN111476225A * 2020-06-28 2020-07-31 Ping An International Smart City Technology Co Ltd In-vehicle human face identification method, device, equipment and medium based on artificial intelligence

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JU MORAN ET AL: "The Application of Improved YOLO V3 in Multi-Scale Target Detection", Applied Sciences-Basel, vol. 9, no. 18, 9 September 2019 (2019-09-09), pages 1-14 *
WANG GUOWEN: "Pedestrian and Vehicle Detection with a Multi-Scale Feature Fusion Improved YOLOv3 Network", China Master's Theses Full-text Database (Information Science and Technology), no. 02, 15 February 2020 (2020-02-15), pages 138-1785 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112580665A * 2020-12-18 2021-03-30 Shenzhen Saiante Technology Service Co., Ltd. Vehicle style identification method and device, electronic equipment and storage medium
CN112580665B * 2020-12-18 2024-04-19 Shenzhen Saiante Technology Service Co., Ltd. Vehicle style identification method and device, electronic equipment and storage medium
CN113052006A * 2021-02-19 2021-06-29 Central South University Image target detection method and system based on convolutional neural network and readable storage medium
CN113052006B * 2021-02-19 2024-05-28 Central South University Image target detection method, system and readable storage medium based on convolutional neural network
CN113128563A * 2021-03-23 2021-07-16 Wuhan Taiwozi Information Technology Co., Ltd. High-speed engineering vehicle detection method, device, equipment and storage medium
CN113128563B * 2021-03-23 2023-11-17 Wuhan Taiwozi Information Technology Co., Ltd. Method, device, equipment and storage medium for detecting high-speed engineering vehicle

Also Published As

Publication number Publication date
CN112132033B (en) 2023-10-10

Similar Documents

Publication Publication Date Title
CN112132033B (en) Vehicle type recognition method and device, electronic equipment and storage medium
CN112132032B (en) Traffic sign board detection method and device, electronic equipment and storage medium
CN110097068B (en) Similar vehicle identification method and device
CN109740420A (en) Vehicle illegal recognition methods and Related product
CN111476225B (en) In-vehicle human face identification method, device, equipment and medium based on artificial intelligence
CN107832794A (en) A kind of convolutional neural networks generation method, the recognition methods of car system and computing device
CN111597309A (en) Similar enterprise recommendation method and device, electronic equipment and medium
CN111931729B (en) Pedestrian detection method, device, equipment and medium based on artificial intelligence
CN112232203B (en) Pedestrian recognition method and device, electronic equipment and storage medium
CN107944381A (en) Face tracking method, device, terminal and storage medium
CN111831894A (en) Information matching method and device
CN115331048A (en) Image classification method, device, equipment and storage medium
CN116310656A (en) Training sample determining method and device and computer equipment
CN114168768A (en) Image retrieval method and related equipment
CN117218622A (en) Road condition detection method, electronic equipment and storage medium
Bhattacharyya et al. JUVDsi v1: developing and benchmarking a new still image database in Indian scenario for automatic vehicle detection
CN107729944B (en) Identification method and device of popular pictures, server and storage medium
CN111563504B (en) License plate recognition method and related equipment
CN110766938B (en) Road network topological structure construction method and device, computer equipment and storage medium
CN112132031B (en) Vehicle style identification method and device, electronic equipment and storage medium
CN116152637A (en) Evaluation method of automatic driving perception model, computer equipment and storage medium
Ojala et al. Motion detection and classification: ultra-fast road user detection
CN115409070A (en) Method, device and equipment for determining critical point of discrete data sequence
CN114724128A (en) License plate recognition method, device, equipment and medium
CN112419249B (en) Special clothing picture conversion method, terminal device and storage medium

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant