CN113255395A - Driver region positioning method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113255395A
CN113255395A (application CN202010082800.8A)
Authority
CN
China
Prior art keywords
target
vehicle
image
convolution
license plate
Prior art date
Legal status
Granted
Application number
CN202010082800.8A
Other languages
Chinese (zh)
Other versions
CN113255395B (en
Inventor
吴天舒
Current Assignee
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd filed Critical Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202010082800.8A priority Critical patent/CN113255395B/en
Publication of CN113255395A publication Critical patent/CN113255395A/en
Application granted granted Critical
Publication of CN113255395B publication Critical patent/CN113255395B/en
Legal status: Active

Classifications

    • G06V20/54 — Surveillance or monitoring of activities of traffic, e.g. cars on the road, trains or boats
    • G06N3/045 — Neural network architectures; Combinations of networks
    • G06V10/22 — Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V20/59 — Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/625 — License plates

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

An embodiment of the invention provides a driver region positioning method and device, an electronic device, and a storage medium. The method includes: obtaining a vehicle image to be positioned; performing license plate number recognition on the vehicle image through a pre-trained license plate number recognition network to obtain a target license plate number; performing vehicle front face detection on the vehicle image through a pre-trained detection network to obtain the position of the target vehicle front face; and locating the target driver region based on the target license plate number and the position of the target vehicle front face. Because the target driver region is located by combining the license plate number with the vehicle front face position, rather than by predicting the driver region directly, the license plate number is added as prior information for the target driver region, and the driver region can be located within the vehicle front face position more accurately.

Description

Driver region positioning method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for positioning a driver region, electronic equipment and a storage medium.
Background
Image recognition is one of the techniques commonly used in traffic management, for example license plate number recognition and driver behavior detection. Driver behavior detection is an important task in monitoring scenes such as checkpoints and intersections; it includes detecting whether the driver is not wearing a seat belt, is holding a phone, is driving while fatigued, and so on. Driver behavior detection extracts the driver's behavior features through image recognition technology and analyzes them. In the existing detection process, driver behavior detection is performed on the image of the whole vehicle, so the behaviors of both the driver and the front passenger are detected. As a result, the driver's behavior features are difficult to isolate, and the passenger's behavior features may even be mistakenly detected as the driver's, causing false detections. Existing driver behavior detection therefore suffers from low accuracy in locating the driver region, which in turn lowers the accuracy of the detection result.
Disclosure of Invention
An embodiment of the invention provides a driver region positioning method that can improve the accuracy of driver region positioning, and thereby the detection accuracy of driver behavior detection.
In a first aspect, an embodiment of the present invention provides a driver area positioning method, including:
obtaining an image of a vehicle to be positioned;
carrying out license plate number recognition on the vehicle image to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and positioning a target driver area based on the target license plate number and the position of the front face of the target vehicle.
Optionally, the vehicle front face includes a windshield, and the detecting of the vehicle front face on the vehicle image to be positioned through the pre-trained detection network to obtain the position of the target vehicle front face includes:
detecting the windshield of the vehicle image to be positioned through a pre-trained windshield detection network to obtain the position of a target windshield;
and the locating of the target driver region based on the target license plate number and the position of the target vehicle front face includes:
locating the target driver region based on the target license plate number and the position of the target windshield.
Optionally, the pre-trained windshield detection network includes a first convolution region, a plurality of second convolution regions, and a fully connected layer, and the detecting of the windshield on the vehicle image to be positioned through the pre-trained windshield detection network to obtain the position of the target windshield includes:
inputting the vehicle image to be positioned into the first convolution region for convolution calculation to obtain an input feature map with a preset dimensionality;
sequentially inputting the input feature map with the preset dimensionality into the plurality of second convolution regions for convolution calculation to obtain a target feature map;
and inputting the target feature map into the fully connected layer for calculation to obtain the position of the target windshield.
Optionally, the second convolution region includes a first convolutional layer, a second convolutional layer, a third convolutional layer, and a fourth convolutional layer, and the sequentially inputting the input feature map with the preset dimensionality into the plurality of second convolution regions for convolution calculation to obtain the target feature map includes:
inputting the input feature map with the preset dimensionality into the first convolutional layer for calculation to obtain a first feature map, wherein the channel dimension of the first feature map is smaller than the preset dimensionality;
inputting the first feature map into the second convolutional layer for calculation to obtain a second feature map; and
sequentially inputting the first feature map into the third convolutional layer and the fourth convolutional layer for calculation to obtain a third feature map, wherein the number of channels of the second feature map is the same as that of the third feature map;
and splicing the second feature map and the third feature map in the channel dimension to obtain the target feature map.
Optionally, the locating of the target driver region based on the target license plate number and the position of the target windshield includes:
judging the region to which the target license plate number belongs;
acquiring setting information of the vehicle driving position for the region of the target license plate number, wherein the setting information indicates whether the driving position is on the left or the right;
and locating the target driver region according to the setting information and the target windshield.
Optionally, the obtaining an image of a vehicle to be positioned includes:
acquiring an image to be detected;
inputting the image to be detected into a pre-trained vehicle detection network to detect the target vehicle;
and extracting an image of the target vehicle as an image of the vehicle to be positioned.
In a second aspect, an embodiment of the present invention provides a driver area locating device, including:
the acquisition module is used for acquiring an image of the vehicle to be positioned;
the first processing module is used for carrying out license plate number recognition on the image of the vehicle to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
the second processing module is used for detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and the positioning module is used for locating the target driver region based on the target license plate number and the position of the target vehicle front face.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the driver region positioning method provided by the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps in the method for locating a driver region provided in the embodiment of the present invention.
In the embodiment of the invention, an image of the vehicle to be positioned is obtained; license plate number recognition is performed on the vehicle image through a pre-trained license plate number recognition network to obtain a target license plate number; vehicle front face detection is performed on the vehicle image through a pre-trained detection network to obtain the position of the target vehicle front face; and the target driver region is located based on the target license plate number and the position of the target vehicle front face. Because the target driver region is located by combining the license plate number with the vehicle front face position, rather than by predicting the driver region directly, the license plate number is added as prior information for the target driver region, and the driver region can be located within the vehicle front face position more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention and the technical solutions in the prior art, the drawings used in their description are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a driver region locating method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a windshield inspection method provided by an embodiment of the present invention;
FIG. 3 is a diagram illustrating a second convolution region according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a driver area locating device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of another driver area locating device provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another driver area locating device provided by an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of another driver area locating device provided by an embodiment of the invention;
FIG. 8 is a schematic structural diagram of another driver area locating device provided in an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart of a driver region positioning method according to an embodiment of the present invention, as shown in fig. 1, including the following steps:
101. Obtain an image of the vehicle to be positioned.
In this step, the vehicle image to be positioned may be a still image or video frame of the target vehicle uploaded by a user, or a still image or video captured by a camera deployed on a traffic road, at the entrance of a residential community, or at the entrance of a parking lot. The vehicle image to be positioned includes the vehicle's front face and its license plate.
The front face of the vehicle includes at least one of a windshield, a hood, an intake grill, and a bumper.
Further, the vehicle image to be positioned may be obtained by first performing vehicle detection with a pre-trained vehicle detection network and then extracting the image of the target vehicle as the vehicle image to be positioned.
Specifically, an image to be detected is obtained through user upload or camera capture, and one or more vehicles may be present in it. The image to be detected is input into the vehicle detection network, which detects the position of each target vehicle in the image; the image of each target vehicle is then extracted, according to its position, as a vehicle image to be positioned.
It should be noted that the image to be detected can be understood as a large map containing one or more target vehicles, while a vehicle image to be positioned can be understood as a small map, extracted from the large map, containing a single target vehicle.
The vehicle detection network may be any existing vehicle detection network, a trained vehicle detection network obtained from an open source website, or a vehicle detection network obtained by self-training of a user.
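As a sketch of the extraction step above, the "small maps" can be cropped from the "large map" once the detector has returned boxes. The box format, helper name, and clamping behavior here are illustrative assumptions, not part of the patent:

```python
# Hypothetical sketch: cropping per-vehicle images ("small maps") from a
# detected frame ("large map"). The (x1, y1, x2, y2) box format and the
# detector producing it are assumptions.
import numpy as np

def extract_vehicle_images(frame, boxes):
    """frame: H x W x 3 array; boxes: pixel boxes (x1, y1, x2, y2)
    returned by some pre-trained vehicle detection network."""
    crops = []
    h, w = frame.shape[:2]
    for x1, y1, x2, y2 in boxes:
        # Clamp to image bounds before slicing, skip degenerate boxes.
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(w, int(x2)), min(h, int(y2))
        if x2 > x1 and y2 > y1:
            crops.append(frame[y1:y2, x1:x2].copy())
    return crops

frame = np.zeros((100, 200, 3), dtype=np.uint8)
crops = extract_vehicle_images(frame, [(10, 20, 60, 80), (-5, 0, 50, 50)])
print([c.shape for c in crops])  # [(60, 50, 3), (50, 50, 3)]
```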
102. Perform license plate number recognition on the vehicle image to be positioned through a pre-trained license plate number recognition network to obtain the target license plate number.
In this step, the vehicle image to be positioned includes the license plate number of the target vehicle. It can be understood that when the small map of a target vehicle is extracted from the large map, it must be ensured that the small map contains the target vehicle's license plate number. The small map, i.e., the vehicle image to be positioned, is input into the pre-trained license plate number recognition network, which outputs the target license plate number of the target vehicle in the image.
The license plate number recognition network can be any existing license plate number recognition network, a trained license plate number recognition network obtained from an open source website, or a license plate number recognition network obtained by self-training of a user.
The target license plate number can be used to determine the region to which the target vehicle belongs and, according to that region's driving rules, whether the driver region is on the left or the right side. The target license plate number can also serve as prior information for checking the final positioning result.
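To make this prior-information idea concrete, a toy lookup might map the plate's region to the driving-seat side. The region codes, the table, and the fallback default here are invented placeholders, not part of the patent:

```python
# Illustrative sketch only: mapping a recognized plate's region to the
# driving-seat side. The region table below is a placeholder assumption.
DRIVER_SIDE_BY_REGION = {
    "CN": "left",   # mainland China: driving seat on the left
    "HK": "right",  # Hong Kong: driving seat on the right
}

def driver_side(region_code):
    # Default to "left" for unknown regions (an arbitrary choice here).
    return DRIVER_SIDE_BY_REGION.get(region_code, "left")

print(driver_side("HK"))  # right
```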
103. Detect the vehicle front face in the vehicle image to be positioned through a pre-trained detection network to obtain the position of the target vehicle front face.
In this step, the vehicle front face includes at least one of a windshield, a hood, an air intake grille, and a bumper; in some vehicle models, such as trucks, the front face does not include a hood. The windshield, hood, air intake grille, and bumper all span the left and right sides of the vehicle front face and correspond to the driving and front passenger positions of the vehicle's front row. For example, the left and right areas of the windshield correspond to the driver's seat and the passenger seat, and the upper-left and upper-right areas of the hood, air intake grille, and bumper correspond to the driver's seat and the passenger seat respectively.
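The left/right correspondence can be sketched by splitting a detected front face box down the middle. The box format and midpoint split are illustrative assumptions, not from the patent:

```python
# Illustrative only: splitting a detected windshield box into the
# driver-side and passenger-side halves. (x1, y1, x2, y2) is assumed.
def split_windshield(box, driver_side="left"):
    x1, y1, x2, y2 = box
    xm = (x1 + x2) / 2  # vertical midline of the windshield
    left_half = (x1, y1, xm, y2)
    right_half = (xm, y1, x2, y2)
    return left_half if driver_side == "left" else right_half

print(split_windshield((100, 50, 300, 150), "left"))  # (100, 50, 200.0, 150)
```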
The detection network can correspondingly detect the position of at least one of the windshield, hood, air intake grille, and bumper. It can be obtained by constructing a general-purpose detection network and training it on a data set of at least one of the windshield, hood, air intake grille, and bumper.
In order to speed up the detection of the vehicle front face and thereby the positioning of the driver region, an embodiment of the invention provides a detection network including: a first convolution region, a plurality of second convolution regions connected in series after the first convolution region, and a fully connected layer. The vehicle image to be positioned is input into the first convolution region, and the plurality of second convolution regions are connected in series with one another. A convolution region may also be referred to as a feature extraction module.
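A minimal PyTorch sketch of this layout, under stated assumptions: the channel sizes, activations, the toy 8 × 8 input, and the simplification of each second convolution region to a plain 3 × 3 block are all choices made here for illustration, not taken from the patent:

```python
# Sketch of the described layout: one first convolution region, several
# second convolution regions in series, then a fully connected layer
# regressing a windshield box (x1, y1, x2, y2). Sizes are assumptions.
import torch
import torch.nn as nn

class WindshieldNet(nn.Module):
    def __init__(self, num_second_regions=6):
        super().__init__()
        # First convolution region: RGB in, 32-channel feature map out.
        self.first = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1, stride=1),
            nn.ReLU(inplace=True),
        )
        # Second convolution regions in series (placeholder blocks; the
        # patent's two-branch internals are simplified to plain 3x3 here).
        self.second = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True))
            for _ in range(num_second_regions)
        ])
        # Fully connected layer predicting the box coordinates.
        self.fc = nn.Linear(32 * 8 * 8, 4)

    def forward(self, x):  # x: (B, 3, 8, 8) for this toy size
        x = self.second(self.first(x))
        return self.fc(x.flatten(1))

net = WindshieldNet()
out = net(torch.zeros(2, 3, 8, 8))
print(out.shape)  # torch.Size([2, 4])
```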
The second convolution region includes a first convolutional layer, a second convolutional layer, a third convolutional layer, and a fourth convolutional layer. The third and fourth convolutional layers are connected in series to form a first branch network, the second convolutional layer alone forms a second branch network, and the output of the first convolutional layer is connected to the inputs of the first branch network and the second branch network respectively, as shown in fig. 3.
Specifically, since the left and right side regions of the windshield correspond to the driver region and the passenger region respectively, in the embodiment of the invention the detection of the vehicle front face is preferably the detection of the windshield. When detecting the windshield, the detection network above may also be referred to as a windshield detection network.
In order to increase the detection speed of the windshield, as shown in fig. 2, the detection method of the windshield includes:
201. Input the vehicle image to be positioned into the first convolution region for convolution calculation to obtain an input feature map with a preset dimensionality.
The vehicle image to be positioned is input into the first convolution region as three RGB channels, and each convolution kernel in the first convolution region extracts features from the three-channel image, giving an input feature map with a preset dimensionality. The preset dimensionality is the channel dimension, which equals the number of convolution kernels in the first convolution region.
In a possible embodiment, the first convolution region uses depth-separable convolution to perform the convolution calculation on the vehicle image to be positioned.
For example, the convolution over the three RGB channels may couple the channel dimension and the spatial dimension, or decouple them using depth-separable convolution. For an 8 × 8 × 3 vehicle image to be positioned, where 8 × 8 is the spatial size and 3 is the number of RGB channels, a coupled convolution uses 3 × 3 × 3 × 32 kernels: 3 × 3 is the kernel size, the next 3 is the number of kernel channels (one per RGB channel), and 32 is the number of kernels, with the channels of each kernel sharing a spatial weight pattern. Convolution yields an input feature map of 32 channels, and the total calculation amount is 3 × 3 × 3 × 32 = 864.
In the depth-separable convolution, the three RGB channels are first convolved spatially with 3 × 3 kernels, one per channel, giving three intermediate feature maps; these are then convolved across channels with 32 kernels of size 1 × 1 × 3, giving the 32-channel input feature map. Here 3 × 3 is the kernel size for the spatial step, 1 × 1 is the kernel size for the channel step, and 32 is the number of 1 × 1 kernels, each producing the feature map of one output channel. The total calculation amount is 3 × 3 × 3 + 1 × 1 × 3 × 32 = 123. Since the amount of calculation is reduced, the overall calculation speed is improved.
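The two weight counts above can be checked directly (this only verifies the arithmetic for the 8 × 8 × 3 example; it is not part of the patent):

```python
# Standard (coupled) convolution: 32 kernels of size 3 x 3 x 3.
standard = 3 * 3 * 3 * 32
# Depth-separable: one 3 x 3 kernel per input channel (spatial step),
# then 32 pointwise 1 x 1 x 3 kernels (channel step).
depthwise = 3 * 3 * 3
pointwise = 1 * 1 * 3 * 32
print(standard, depthwise + pointwise)  # 864 123
```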
Since the vehicle image to be positioned is a small image extracted by the vehicle detection network, the convolution in the first convolution region does not need multi-resolution calculation, i.e., up- or down-sampling: the input is padded with 1, and the sliding stride of the convolution kernel is 1.
202. Sequentially input the input feature map with the preset dimensionality into the plurality of second convolution regions for convolution calculation to obtain the target feature map.
In this step, the input feature map with the preset dimensionality is output by the first convolution region. The number of second convolution regions can be chosen to trade off detection accuracy against detection speed. Both also depend on the software and hardware environment and may differ across environments, so the specific number of second convolution regions should be chosen according to the actual conditions.
In the embodiment of the invention, the operating system of the experimental software environment is Ubuntu 14.04, the CPU of the experimental hardware environment is an Intel(R) Core(TM) i7-6700 at 3.4 GHz × 8 with 8 GB of memory, and the GPU is an NVIDIA(R) GTX 1080 Ti. The data set consists of traffic monitoring images: vehicle images are extracted by the vehicle detection network and automatically labeled with open-source labeling software, yielding a driver region positioning sample set with 3000 training images and 500 test images. The overall driver region positioning network is trained on the training set with different numbers of second convolution regions, and the accuracy and detection speed of each resulting network are given in Table 1:
| Number of second convolution regions | Average accuracy / % | Detection speed / (frames per second) |
| --- | --- | --- |
| 9 | 99.97 | 68 |
| 8 | 99.97 | 69 |
| 7 | 99.96 | 71 |
| 6 | 99.96 | 73 |
| 5 | 81.82 | 74 |

TABLE 1
As can be seen from the data in Table 1, when the number of second convolution regions is 6 or more, the average accuracy is already high and essentially saturated, while the detection speed decreases as the number of lightweight second convolution regions increases. When the number of second convolution regions is reduced to 5, the detection speed barely improves but the average accuracy drops sharply. Based on the data in Table 1, 6 second convolution regions connected in series are therefore preferred in the software and hardware environment provided in the embodiment of the invention.
Further, the second convolution region includes: a first convolutional layer, a second convolutional layer, a third convolutional layer and a fourth convolutional layer, the third convolutional layer and the fourth convolutional layer are connected in series to form a first branch network, the second convolutional layer is a single second branch network, and the output of the first convolutional layer is connected with the input of the first branch network and the input of the second branch network respectively, as shown in fig. 3.
Specifically, the first convolutional layer uses M convolution kernels of size 1 × 1 × N, where N is the number of channels of the input feature map, the channels of each kernel share a weight, and M is the number of 1 × 1 kernels; M determines the number of channels of the first feature map obtained through the first convolutional layer. For example, if the input feature map with the preset dimensionality is an 8 × 8 × N tensor, where the preset dimensionality N is the number of channels, then the convolution computes (8 × 8 × N) ∗ (1 × 1 × N × M) = 8 × 8 × M, i.e., the first feature map output by the first convolutional layer is an 8 × 8 × M tensor. Since N is greater than M, the output of the first convolutional layer is a dimension-reduced result: the number of channels of the first feature map is smaller than that of the input feature map. It should be noted that in the embodiment of the invention the preset dimensionality refers to a preset number of channels. Because the channel count of the first feature map is reduced relative to the input feature map, features are extracted while the subsequent calculation amount is cut down, further improving the running speed of the whole detection network.
After the 8 × 8 × M first feature map is obtained, it is input into the first branch network and the second branch network respectively for convolution calculation. In the first branch network, the first feature map passes through the third convolutional layer and then the fourth convolutional layer. The third convolutional layer uses single-channel 3 × 3 convolution kernels, M of them, matching the number of channels of the first feature map; each 3 × 3 kernel is convolved with one channel of the first feature map, separating the spatial information of the feature map from the channel information and producing M channels of 8 × 8 feature maps, which are then input into the fourth convolutional layer. The fourth convolutional layer uses K convolution kernels of size 1 × 1 × M, whose M channels share a weight. Its input is the output of the third convolutional layer, an 8 × 8 × M tensor, and the convolution computes (8 × 8 × M) ∗ (1 × 1 × M × K) = 8 × 8 × K; this 8 × 8 × K tensor is the third feature map output by the fourth convolutional layer. Since M is greater than K, the fourth convolutional layer outputs a dimension-reduced result: the number of channels of the third feature map is smaller than that of the first feature map.
In the second branch network, the convolution kernels in the second convolution layer are M × 1 × 1 × K kernels, where the M channels of each 1 × 1 convolution kernel share a weight and K is the number of 1 × 1 convolution kernels. The input feature map of the second convolution layer is the output of the first convolution layer, i.e., the first feature map, a tensor of 8 × 8 × M, and the convolution calculation is 8 × 8 × M · M × 1 × 1 × K = 8 × 8 × K, i.e., the second feature map output by the second convolution layer is a tensor of 8 × 8 × K. Since M is greater than K, the second convolution layer outputs a dimension-reduction result, that is, the number of channels of the second feature map is smaller than that of the first feature map. The number of channels of the second feature map is the same as that of the third feature map, that is, the number of 1 × 1 convolution kernels in the second convolution layer is the same as the number of 1 × 1 convolution kernels in the fourth convolution layer.
After the second feature map and the third feature map are obtained, they may be spliced in the channel dimension; for example, the third feature map (a tensor of 8 × 8 × K) and the second feature map (a tensor of 8 × 8 × K) are spliced in the channel dimension to obtain a target feature map that is a tensor of 8 × 8 × 2K. In this way, the input of the next second convolution region contains the outputs of the current second convolution region, realizing the reuse of each feature map.
In a possible embodiment, after the second feature map and the third feature map are obtained, the input feature map, the second feature map, and the third feature map may be spliced in the channel dimension. Alternatively, the input feature map, the first feature map, the second feature map, and the third feature map may be spliced in the channel dimension, thereby realizing the reuse of the feature maps.
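Either splicing variant is simply a concatenation along the channel axis; a NumPy sketch with illustrative sizes (N = 32, K = 8):

```python
import numpy as np

f_in = np.random.rand(8, 8, 32)  # input feature map, N = 32 (illustrative)
f2 = np.random.rand(8, 8, 8)     # second feature map, K = 8
f3 = np.random.rand(8, 8, 8)     # third feature map, K = 8

# Basic variant: second + third feature maps -> 8x8x2K target feature map
target = np.concatenate([f2, f3], axis=-1)

# Reuse variant: also carry the input feature map forward -> 8x8x(N + 2K)
target_reuse = np.concatenate([f_in, f2, f3], axis=-1)
print(target.shape, target_reuse.shape)  # (8, 8, 16) (8, 8, 48)
```

Carrying earlier maps forward in the concatenation is what lets later second convolution regions reuse features computed upstream without recomputing them.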
In the embodiment of the present invention, an input feature map with a preset dimension is input into the first convolution layer for calculation to obtain a first feature map, where the channel dimension of the first feature map is smaller than the preset dimension; the first feature map is input into the second convolution layer for calculation to obtain a second feature map; the first feature map is also input into the third convolution layer and the fourth convolution layer in sequence for calculation to obtain a third feature map, where the number of channels of the second feature map is the same as that of the third feature map; and the input feature map with the preset dimension, the second feature map, and the third feature map are spliced in the channel dimension to obtain a target feature map. Feature extraction is thus performed through the plurality of second convolution regions while the feature maps are dimension-reduced and reused. Furthermore, spatial convolution and channel convolution are performed in sequence by the third and fourth convolution layers, decoupling spatial information from channel information in the convolution process, which reduces the number of parameters and improves the calculation speed.
203. And inputting the target characteristic diagram into the full-connection layer for calculation to obtain the position of the target windshield.
In this step, because the output feature maps of the plurality of second convolution regions are spliced in the target feature map, the third feature map output by the last second convolution region may be selected from the target feature map as the final target feature map and input into the full connection layer plus Softmax for classification output, so as to obtain the position of the target windshield.
Steps 201, 202, and 203 describe only the embodiment for detecting the windshield; the other structures of the vehicle front face can be detected with the same detection network structure and steps by changing the sample data set and the number of second convolution regions, so the detection of those structures likewise benefits from the reduced parameter count and improved calculation speed.
104. And positioning the target driver area based on the number of the target license plate and the position of the front face of the target vehicle.
In this step, the front face of the target vehicle includes at least one of a windshield, an engine hood, an air intake grille, and a bumper; in some vehicle models, such as trucks, the vehicle front face does not have an engine hood.
Different regions have different driving rules; for example, the driver area of a mainland Chinese vehicle is on the right side as seen facing the front of the vehicle, while the driver area of a vehicle from Hong Kong, Macau, or certain other countries is on the left side. Therefore, the region to which the target vehicle belongs can be determined from the target license plate number, so as to determine whether the driver area of the target vehicle is on the right or the left. For example, when the target license plate number does not belong to Hong Kong, Macau, or such other countries, the driver area of the target vehicle may be determined to be on the right side; when the target license plate number belongs to Hong Kong or Macau, the driver area of the target vehicle is determined to be on the left side; and when the target license plate number belongs to another country, whether the driver area is on the left or the right is determined according to that country's driving rules.
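As a sketch of this region check: mainland-issued cross-border plates for Hong Kong/Macau vehicles use the 粤Z prefix and a trailing 港 or 澳 character, so a simple heuristic might look like the following (illustrative only — this is a stand-in for the patent's recognition network, and it ignores plates from other countries):

```python
def driver_side_from_plate(plate_number):
    """Return the driver area's side as seen by a camera facing the vehicle.

    Heuristic sketch: Hong Kong/Macau vehicles are right-hand drive, so
    the driver appears on the LEFT of a front-view image; mainland Chinese
    vehicles are left-hand drive, so the driver appears on the RIGHT.
    """
    if plate_number.startswith("粤Z") or plate_number.endswith(("港", "澳")):
        return "left"
    return "right"

print(driver_side_from_plate("粤B12345"))   # right (mainland plate)
print(driver_side_from_plate("粤Z1234港"))  # left (Hong Kong cross-border)
```

This is exactly the "prior information" role the license plate number plays here: a cheap lookup that halves the search space before any driver detection runs.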
The embodiment of the present invention takes the windshield as the front face of the target vehicle as an example: the position of the windshield is obtained after prediction by the windshield detection network, at which point only the left or right region of the windshield needs to be selected as the driver area, and the corresponding driver area is then extracted for subsequent driver behavior detection.
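Given the detected windshield bounding box and the side inferred from the license plate, extracting the driver area reduces to taking one half of the box. A minimal sketch — the coordinate convention and function name are assumptions for illustration, not the patent's API:

```python
def driver_region(windshield_box, side):
    """Take the left or right half of a windshield bounding box.

    windshield_box: (x1, y1, x2, y2) in image pixel coordinates;
    side: 'left' or 'right' in the image, as inferred from the plate.
    """
    x1, y1, x2, y2 = windshield_box
    mid = (x1 + x2) // 2
    if side == "left":
        return (x1, y1, mid, y2)
    return (mid, y1, x2, y2)

box = (100, 50, 300, 150)
print(driver_region(box, "right"))  # (200, 50, 300, 150)
```

The returned box can then be cropped from the original image and passed to a downstream driver behavior detector.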
Of course, when the positions of the engine hood, the air intake grille, or the bumper are detected by the detection network, the upper-left or upper-right area of the engine hood, air intake grille, or bumper can be selected as the target driver area according to the license plate number, and the corresponding driver area is then extracted for subsequent driver behavior detection. It should be noted that the upper-left and upper-right areas should cover the windshield area on the corresponding side so that the driver's behavior can be detected.
In the embodiment of the present invention, an image of a vehicle to be positioned is obtained; license plate number recognition is performed on the vehicle image through a pre-trained license plate number recognition network to obtain a target license plate number; vehicle front face detection is performed on the vehicle image through a pre-trained detection network to obtain the position of the front face of the target vehicle; and the target driver area is positioned based on the target license plate number and the position of the front face of the target vehicle. Because the target driver area is positioned by combining the license plate number with the position of the vehicle front face, rather than predicted directly, the license plate number serves as prior information for the target driver area, allowing the driver area to be located more accurately within the vehicle front face.
It should be noted that the method for locating a driver area provided by the embodiment of the present invention may be applied to a device such as a mobile phone, a monitor, a computer, and a server that needs to locate the driver area.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a driver area positioning device according to an embodiment of the present invention, and as shown in fig. 4, the device includes:
the acquisition module 401 is used for acquiring an image of a vehicle to be positioned;
the first processing module 402 is configured to perform license plate number recognition on the to-be-positioned vehicle image through a pre-trained license plate number recognition network to obtain a target license plate number;
the second processing module 403 is configured to perform vehicle front face detection on the to-be-positioned vehicle image through a pre-trained detection network, so as to obtain a position of a front face of a target vehicle;
and a positioning module 404, configured to position a target driver area based on the target license plate number and the position of the front face of the target vehicle.
Optionally, the front face of the vehicle includes a windshield, and the second processing module 403 is further configured to perform windshield detection on the vehicle image to be positioned through a pre-trained windshield detection network, so as to obtain a position of a target windshield;
the positioning module 404 is further configured to position a target driver area based on the target license plate number and the position of the target windshield.
Optionally, as shown in fig. 5, the pre-trained windshield detection network includes: a first convolution region, a plurality of second convolution regions, and a full connection layer, wherein the second processing module 403 includes:
the first calculation unit 4031 is used for inputting the vehicle image to be positioned into a first convolution area for convolution calculation to obtain an input feature map with preset dimensions;
a second calculation unit 4032, configured to sequentially input the input feature maps with the preset dimensions into the plurality of second convolution regions for convolution calculation, so as to obtain a target feature map;
and a third calculation unit 4033, configured to input the target feature map into the full-link layer for calculation, so as to obtain a position of the target windshield.
Optionally, as shown in fig. 6, the second convolution region includes: a first convolution layer, a second convolution layer, a third convolution layer, and a fourth convolution layer, and the second calculation unit 4032 includes:
a first calculating subunit 40321, configured to input the input feature map with the preset dimension into the first convolution layer for calculation to obtain a first feature map, where a channel dimension of the first feature map is smaller than the preset dimension;
a second calculation subunit 40322, configured to input the first feature map into the second convolution layer for calculation to obtain a second feature map; and
a third calculation subunit 40323, configured to sequentially input the first feature map into the third convolutional layer and a fourth convolutional layer for calculation, so as to obtain a third feature map, where the number of channels in the second feature map is the same as that in the third feature map;
and a splicing subunit 40324, configured to splice the second feature map and the third feature map in a channel dimension to obtain a target feature map.
Optionally, as shown in fig. 7, the positioning module 404 includes:
the judging unit 4041 is configured to judge a region to which the target license plate number belongs;
a first obtaining unit 4042, configured to obtain setting information of a vehicle driving seat in a region to which the target license plate number belongs, where the setting information includes setting information on the left or the right;
a positioning unit 4043, configured to position a target driver area according to the setting information and the position of the target windshield.
As shown in fig. 8, the obtaining module 401 includes:
a second obtaining unit 4011, configured to obtain an image to be detected;
the detection unit 4012 is configured to input the image to be detected to a vehicle detection network trained in advance, and detect to obtain a target vehicle;
and the extraction unit 4013 is configured to extract an image of the target vehicle as an image of the vehicle to be positioned.
It should be noted that the device for locating a driver area according to the embodiment of the present invention may be applied to a mobile phone, a monitor, a computer, a server, and other devices that need to locate a driver area.
The driver region positioning device provided by the embodiment of the invention can realize each process realized by the driver region positioning method in the method embodiment, and can achieve the same beneficial effects. To avoid repetition, further description is omitted here.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, as shown in fig. 9, including: a memory 902, a processor 901 and a computer program stored on the memory 902 and executable on the processor 901, wherein:
the processor 901 is used for calling the computer program stored in the memory 902 and executing the following steps:
obtaining an image of a vehicle to be positioned;
carrying out license plate number recognition on the vehicle image to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and positioning a target driver area based on the target license plate number and the position of the front face of the target vehicle.
Optionally, the front face of the vehicle includes a windshield, and the detecting, performed by the processor 901, of the front face of the vehicle of the to-be-positioned vehicle image through a pre-trained detection network to obtain the position of the front face of the target vehicle includes:
detecting the windshield of the vehicle image to be positioned through a pre-trained windshield detection network to obtain the position of a target windshield;
the positioning of a target driver area based on the target license plate number and the position of the front face of the target vehicle performed by the processor 901 includes:
and positioning a target driver area based on the target license plate number and the position of the target windshield.
Optionally, the pre-trained windshield detection network includes: a first convolution area, a plurality of second convolution areas, and a full connection layer, and the detecting, performed by the processor 901, of the windshield of the to-be-positioned vehicle image through the pre-trained windshield detection network to obtain the position of the target windshield includes:
inputting the vehicle image to be positioned into a first convolution area for convolution calculation to obtain an input feature map with preset dimensionality;
sequentially inputting the input feature maps with the preset dimensionality into the plurality of second convolution areas for convolution calculation to obtain target feature maps;
and inputting the target characteristic diagram into a full connection layer for calculation to obtain the position of the target windshield.
Optionally, the second convolution region includes: a first convolution layer, a second convolution layer, a third convolution layer, and a fourth convolution layer, and the sequentially inputting, performed by the processor 901, of the input feature map with the preset dimension into the plurality of second convolution areas for convolution calculation to obtain the target feature map includes:
inputting the input feature map with the preset dimension into the first convolution layer for calculation to obtain a first feature map, wherein the channel dimension of the first feature map is smaller than the preset dimension;
inputting the first characteristic diagram into the second convolution layer for calculation to obtain a second characteristic diagram; and
sequentially inputting the first characteristic diagram into the third convolutional layer and the fourth convolutional layer for calculation to obtain a third characteristic diagram, wherein the number of channels of the second characteristic diagram is the same as that of the channels of the third characteristic diagram;
and splicing the second characteristic diagram and the third characteristic diagram on the channel dimension to obtain a target characteristic diagram.
Optionally, the locating the target driver area based on the target license plate number and the position of the target windshield executed by the processor 901 includes:
judging the region of the target license plate number;
acquiring setting information of a vehicle driving position in the region of the target license plate number, wherein the setting information comprises setting information on the left or the right;
and positioning a target driver area according to the setting information and the position of the target windshield.
Optionally, the obtaining of the image of the vehicle to be positioned by the processor 901 includes:
acquiring an image to be detected;
inputting the image to be detected into a vehicle detection network trained in advance, and detecting to obtain a target vehicle;
and extracting an image of the target vehicle as an image of the vehicle to be positioned.
The electronic device may be a mobile phone, a monitor, a computer, a server, or another device that needs to locate a driver area.
The electronic equipment provided by the embodiment of the invention can realize each process realized by the method for positioning the driver region in the method embodiment, can achieve the same beneficial effects, and is not repeated here for avoiding repetition.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the method for locating a driver region according to the embodiment of the present invention, and can achieve the same technical effect, and is not described herein again to avoid repetition.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure is only a preferred embodiment of the present invention and is not intended to limit the scope of the claims; equivalent variations made according to the claims of the present invention still fall within the scope of the present invention.

Claims (10)

1. A driver area positioning method is characterized by comprising the following steps:
obtaining an image of a vehicle to be positioned;
carrying out license plate number recognition on the vehicle image to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and positioning a target driver area based on the target license plate number and the position of the front face of the target vehicle.
2. The method of claim 1, wherein the vehicle front face comprises a windshield, and the detecting the vehicle front face of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the target vehicle front face comprises:
detecting the windshield of the vehicle image to be positioned through a pre-trained windshield detection network to obtain the position of a target windshield;
the positioning a target driver area based on the target license plate number and the position of the front face of the target vehicle comprises:
and positioning a target driver area based on the target license plate number and the position of the target windshield.
3. The method of claim 2, wherein the pre-trained windshield detection network comprises: a first convolution area, a plurality of second convolution areas, and a full connection layer, and the detecting the windshield of the vehicle image to be positioned through the pre-trained windshield detection network to obtain the position of the target windshield comprises the following steps:
inputting the vehicle image to be positioned into a first convolution area for convolution calculation to obtain an input feature map with preset dimensionality;
sequentially inputting the input feature maps with the preset dimensionality into the plurality of second convolution areas for convolution calculation to obtain target feature maps;
and inputting the target characteristic diagram into a full connection layer for calculation to obtain the position of the target windshield.
4. The method of claim 3, wherein the second convolution area comprises: a first convolution layer, a second convolution layer, a third convolution layer, and a fourth convolution layer, and the sequentially inputting the input feature map with the preset dimension into the plurality of second convolution areas for convolution calculation to obtain the target feature map comprises the following steps:
inputting the input feature map with the preset dimension into the first convolution layer for calculation to obtain a first feature map, wherein the channel dimension of the first feature map is smaller than the preset dimension;
inputting the first characteristic diagram into the second convolution layer for calculation to obtain a second characteristic diagram; and
sequentially inputting the first characteristic diagram into the third convolutional layer and the fourth convolutional layer for calculation to obtain a third characteristic diagram, wherein the number of channels of the second characteristic diagram is the same as that of the channels of the third characteristic diagram;
and splicing the second characteristic diagram and the third characteristic diagram on the channel dimension to obtain a target characteristic diagram.
5. The method of claim 2, wherein locating a target driver zone based on the target license plate number and the position of the target windshield comprises:
judging the region of the target license plate number;
acquiring setting information of a vehicle driving position in the region of the target license plate number, wherein the setting information comprises setting information on the left or the right;
and positioning a target driver area according to the setting information and the position of the target windshield.
6. The method of claim 1, wherein said obtaining an image of a vehicle to be positioned comprises:
acquiring an image to be detected;
inputting the image to be detected into a vehicle detection network trained in advance, and detecting to obtain a target vehicle;
and extracting an image of the target vehicle as an image of the vehicle to be positioned.
7. A driver zone locating device, said device comprising:
the acquisition module is used for acquiring an image of the vehicle to be positioned;
the first processing module is used for carrying out license plate number recognition on the image of the vehicle to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
the second processing module is used for detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and the positioning module is used for positioning the target driver area based on the target license plate number and the position of the front face of the target vehicle.
8. The apparatus of claim 7, wherein the front face of the vehicle comprises a windshield, and the second processing module is further configured to perform windshield detection on the image of the vehicle to be positioned through a pre-trained windshield detection network to obtain a position of a target windshield;
the positioning module is further configured to position a target driver area based on the target license plate number and the position of the target windshield.
9. An electronic device, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the driver zone location method according to any of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the steps in the method for locating a driver's zone according to any one of claims 1 to 6.
CN202010082800.8A 2020-02-07 2020-02-07 Driver region positioning method and device, electronic equipment and storage medium Active CN113255395B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010082800.8A CN113255395B (en) 2020-02-07 2020-02-07 Driver region positioning method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010082800.8A CN113255395B (en) 2020-02-07 2020-02-07 Driver region positioning method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113255395A true CN113255395A (en) 2021-08-13
CN113255395B CN113255395B (en) 2024-06-11

Family

ID=77219293

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010082800.8A Active CN113255395B (en) 2020-02-07 2020-02-07 Driver region positioning method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113255395B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104680156A (en) * 2015-03-23 2015-06-03 山东农业大学 System and method for identifying unfastened state of safety belt in front row of motor vehicle based on machine vision
CN104966066A (en) * 2015-06-26 2015-10-07 武汉大学 Traffic block port monitoring oriented in-car human face detection method and system
CN106485224A (en) * 2016-10-13 2017-03-08 北京智芯原动科技有限公司 A kind of seatbelt wearing recognition methodss and device
CN106503673A (en) * 2016-11-03 2017-03-15 北京文安智能技术股份有限公司 A kind of recognition methodss of traffic driving behavior, device and a kind of video acquisition device
CN106682600A (en) * 2016-12-15 2017-05-17 深圳市华尊科技股份有限公司 Method and terminal for detecting targets
CN106778659A (en) * 2016-12-28 2017-05-31 深圳市捷顺科技实业股份有限公司 A kind of licence plate recognition method and device
CN107977643A (en) * 2017-12-18 2018-05-01 浙江工业大学 A kind of officer's car monitoring method based on road camera
WO2018112900A1 (en) * 2016-12-23 2018-06-28 深圳先进技术研究院 License plate recognition method and apparatus, and user equipment
CN110414451A (en) * 2019-07-31 2019-11-05 深圳市捷顺科技实业股份有限公司 It is a kind of based on end-to-end licence plate recognition method, device, equipment and storage medium
CN110427937A (en) * 2019-07-18 2019-11-08 浙江大学 A kind of correction of inclination license plate and random length licence plate recognition method based on deep learning


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHING-TANG HSIEH ET AL: "A real-time mobile vehicle license plate detection and recognition for vehicle monitoring and management", 《2009 JOINT CONFERENCES ON PERVASIVE COMPUTING》, 25 February 2010 (2010-02-25), pages 197 - 202 *
WU Tianshu et al.: "Driver seat belt detection combining YOLO detection and semantic segmentation", Journal of Computer-Aided Design & Computer Graphics, vol. 31, no. 1, pages 126 - 131 *
WU Tianshu: "Driver seat belt detection based on deep learning", China Master's Theses Full-text Database, Information Science and Technology, vol. 2019, no. 8, pages 138 - 818 *

Also Published As

Publication number Publication date
CN113255395B (en) 2024-06-11

Similar Documents

Publication Publication Date Title
CN113033604B (en) Vehicle detection method, system and storage medium based on SF-YOLOv4 network model
CN108986465B (en) Method, system and terminal equipment for detecting traffic flow
CN111709416B (en) License plate positioning method, device, system and storage medium
CN110222764A (en) Shelter target detection method, system, equipment and storage medium
CN110517261A (en) Seat belt status detection method, device, computer equipment and storage medium
CN106815574A (en) Set up detection model, detect the method and apparatus for taking mobile phone behavior
CN110889421A (en) Target detection method and device
CN115019279A (en) Context feature fusion method based on MobileNet lightweight network
CN111611918A (en) Traffic flow data set acquisition and construction method based on aerial photography data and deep learning
US11120308B2 (en) Vehicle damage detection method based on image analysis, electronic device and storage medium
CN111079634B (en) Method, device and system for detecting obstacle in running process of vehicle and vehicle
CN116597413A (en) Real-time traffic sign detection method based on improved YOLOv5
CN113191270B (en) Method and device for detecting throwing event, electronic equipment and storage medium
CN112348011B (en) Vehicle damage assessment method and device and storage medium
CN116843691A (en) Photovoltaic panel hot spot detection method, storage medium and electronic equipment
CN110059544B (en) Pedestrian detection method and system based on road scene
CN111931721A (en) Method and device for detecting color and number of annual inspection label and electronic equipment
CN113255395A (en) Driver region positioning method and device, electronic equipment and storage medium
KR102416714B1 (en) System and method for city-scale tree mapping using 3-channel images and multiple deep learning
CN103473567A (en) Vehicle detection method based on partial models
CN106446837A (en) Hand waving detection method based on motion historical images
CN112614156A (en) Training method and device for multi-target tracking network model and related equipment
CN103065332B (en) The detection method of greasy weather pedestrian rapid movement behavior and device
CN116989818B (en) Track generation method and device, electronic equipment and readable storage medium
CN112926414B (en) Image processing method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant