CN113255395A - Driver region positioning method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113255395A
CN113255395A (application CN202010082800.8A)
Authority
CN
China
Prior art keywords
target
vehicle
image
convolution
license plate
Prior art date
Legal status
Granted
Application number
CN202010082800.8A
Other languages
Chinese (zh)
Other versions
CN113255395B (en
Inventor
吴天舒
Current Assignee
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd filed Critical Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202010082800.8A priority Critical patent/CN113255395B/en
Publication of CN113255395A publication Critical patent/CN113255395A/en
Application granted granted Critical
Publication of CN113255395B publication Critical patent/CN113255395B/en
Legal status: Active

Classifications

    • G06V20/54 — Surveillance or monitoring of activities of traffic, e.g. cars on the road, trains or boats
    • G06N3/045 — Neural network architectures; Combinations of networks
    • G06V10/22 — Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V20/59 — Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/625 — License plates

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

An embodiment of the invention provides a driver region positioning method and device, an electronic device, and a storage medium. The method includes: obtaining a vehicle image to be positioned; performing license plate number recognition on the vehicle image through a pre-trained license plate number recognition network to obtain a target license plate number; performing vehicle front face detection on the vehicle image through a pre-trained detection network to obtain the position of the target vehicle front face; and locating the target driver region based on the target license plate number and the position of the target vehicle front face. Because the target driver region is located by combining the license plate number with the vehicle front face position, rather than by predicting the driver region directly, the license plate number is added as prior information for the target driver region, and the driver region can be located within the vehicle front face position more accurately.

Description

Driver region positioning method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for positioning a driver region, electronic equipment and a storage medium.
Background
Image recognition is one of the techniques commonly used in traffic management, for example license plate number recognition and driver behavior detection. Driver behavior detection is an important task in monitoring scenes such as checkpoints and intersections; it includes detecting whether the driver is not wearing a seat belt, is holding a phone, is driving while fatigued, and so on. Driver behavior detection extracts the driver's behavior features through image recognition technology and analyzes them. In the existing detection process, driver behavior detection is performed on the image of the whole vehicle, so the behaviors of both the driver and the front passenger are detected. As a result, the driver's behavior features are difficult to isolate, and the passenger's behavior features may even be mistakenly detected as the driver's, causing false detections. Existing driver behavior detection therefore suffers from low accuracy in locating the driver region, which in turn lowers the accuracy of the detection result.
Disclosure of Invention
An embodiment of the invention provides a driver region positioning method that can improve the accuracy of driver region positioning, and thereby the detection accuracy of driver behavior detection.
In a first aspect, an embodiment of the present invention provides a driver area positioning method, including:
obtaining an image of a vehicle to be positioned;
carrying out license plate number recognition on the vehicle image to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and positioning a target driver area based on the target license plate number and the position of the front face of the target vehicle.
Optionally, the vehicle front face includes a windshield, and the detecting of the vehicle front face on the vehicle image to be positioned through the pre-trained detection network to obtain the position of the target vehicle front face includes:
detecting the windshield of the vehicle image to be positioned through a pre-trained windshield detection network to obtain the position of a target windshield;
and the locating of the target driver region based on the target license plate number and the position of the target vehicle front face includes:
locating the target driver region based on the target license plate number and the position of the target windshield.
Optionally, the pre-trained windshield detection network includes a first convolution region, a plurality of second convolution regions, and a fully connected layer, and the detecting of the windshield on the vehicle image to be positioned through the pre-trained windshield detection network to obtain the position of the target windshield includes:
inputting the vehicle image to be positioned into the first convolution region for convolution calculation to obtain an input feature map with a preset dimensionality;
sequentially inputting the input feature map with the preset dimensionality into the plurality of second convolution regions for convolution calculation to obtain a target feature map;
and inputting the target feature map into the fully connected layer for calculation to obtain the position of the target windshield.
Optionally, the second convolution region includes a first convolutional layer, a second convolutional layer, a third convolutional layer, and a fourth convolutional layer, and the sequentially inputting the input feature map with the preset dimensionality into the plurality of second convolution regions for convolution calculation to obtain the target feature map includes:
inputting the input feature map with the preset dimensionality into the first convolutional layer for calculation to obtain a first feature map, wherein the channel dimension of the first feature map is smaller than the preset dimensionality;
inputting the first feature map into the second convolutional layer for calculation to obtain a second feature map; and
sequentially inputting the first feature map into the third convolutional layer and the fourth convolutional layer for calculation to obtain a third feature map, wherein the number of channels of the second feature map is the same as that of the third feature map;
and splicing the second feature map and the third feature map in the channel dimension to obtain the target feature map.
Optionally, the locating of the target driver region based on the target license plate number and the position of the target windshield includes:
judging the region to which the target license plate number belongs;
acquiring setting information of the vehicle driving position for the region of the target license plate number, wherein the setting information indicates whether the driving position is on the left or the right;
and locating the target driver region according to the setting information and the target windshield.
Optionally, the obtaining an image of a vehicle to be positioned includes:
acquiring an image to be detected;
inputting the image to be detected into a pre-trained vehicle detection network to detect the target vehicle;
and extracting an image of the target vehicle as an image of the vehicle to be positioned.
In a second aspect, an embodiment of the present invention provides a driver area locating device, including:
the acquisition module is used for acquiring an image of the vehicle to be positioned;
the first processing module is used for carrying out license plate number recognition on the image of the vehicle to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
the second processing module is used for detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and the positioning module is used for locating the target driver region based on the target license plate number and the position of the target vehicle front face.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the driver region positioning method provided by the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps in the method for locating a driver region provided in the embodiment of the present invention.
In the embodiment of the invention, an image of the vehicle to be positioned is obtained; license plate number recognition is performed on the vehicle image through a pre-trained license plate number recognition network to obtain a target license plate number; vehicle front face detection is performed on the vehicle image through a pre-trained detection network to obtain the position of the target vehicle front face; and the target driver region is located based on the target license plate number and the position of the target vehicle front face. Because the target driver region is located by combining the license plate number with the vehicle front face position, rather than by predicting the driver region directly, the license plate number is added as prior information for the target driver region, and the driver region can be located within the vehicle front face position more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention and the technical solutions in the prior art, the drawings used in their description are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a driver region locating method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a windshield inspection method provided by an embodiment of the present invention;
FIG. 3 is a diagram illustrating a second convolution region according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a driver area locating device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of another driver area locating device provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another driver area locating device provided by an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of another driver area locating device provided by an embodiment of the invention;
FIG. 8 is a schematic structural diagram of another driver area locating device provided in an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart of a driver region positioning method according to an embodiment of the present invention, as shown in fig. 1, including the following steps:
101. Obtain an image of the vehicle to be positioned.
In this step, the vehicle image to be positioned may be a still image or video frame of the target vehicle uploaded by a user, or a still image or video captured by a camera deployed on a traffic road, at the entrance of a residential community, or at the entrance of a parking lot. The vehicle image to be positioned includes the vehicle's front face and its license plate.
The front face of the vehicle includes at least one of a windshield, a hood, an intake grill, and a bumper.
Further, the vehicle image to be positioned may be obtained by first performing vehicle detection with a pre-trained vehicle detection network and then extracting the image of the target vehicle as the vehicle image to be positioned.
Specifically, an image to be detected is obtained through user upload or camera capture, and one or more vehicles may be present in it. The image to be detected is input into the vehicle detection network, which detects the position of each target vehicle in the image; the image of each target vehicle is then extracted, according to its position, as a vehicle image to be positioned.
It should be noted that the image to be detected can be understood as a large map containing one or more target vehicles, while a vehicle image to be positioned can be understood as a small map, extracted from the large map, containing a single target vehicle.
The vehicle detection network may be any existing vehicle detection network, a trained vehicle detection network obtained from an open source website, or a vehicle detection network obtained by self-training of a user.
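As a sketch of the extraction step above, the "small maps" can be cropped from the "large map" once the detector has returned boxes. The box format, helper name, and clamping behavior here are illustrative assumptions, not part of the patent:

```python
# Hypothetical sketch: cropping per-vehicle images ("small maps") from a
# detected frame ("large map"). The (x1, y1, x2, y2) box format and the
# detector producing it are assumptions.
import numpy as np

def extract_vehicle_images(frame, boxes):
    """frame: H x W x 3 array; boxes: pixel boxes (x1, y1, x2, y2)
    returned by some pre-trained vehicle detection network."""
    crops = []
    h, w = frame.shape[:2]
    for x1, y1, x2, y2 in boxes:
        # Clamp to image bounds before slicing, skip degenerate boxes.
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(w, int(x2)), min(h, int(y2))
        if x2 > x1 and y2 > y1:
            crops.append(frame[y1:y2, x1:x2].copy())
    return crops

frame = np.zeros((100, 200, 3), dtype=np.uint8)
crops = extract_vehicle_images(frame, [(10, 20, 60, 80), (-5, 0, 50, 50)])
print([c.shape for c in crops])  # [(60, 50, 3), (50, 50, 3)]
```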
102. Perform license plate number recognition on the vehicle image to be positioned through a pre-trained license plate number recognition network to obtain the target license plate number.
In this step, the vehicle image to be positioned includes the license plate number of the target vehicle. It can be understood that when the small map of a target vehicle is extracted from the large map, it must be ensured that the small map contains the target vehicle's license plate number. The small map, i.e., the vehicle image to be positioned, is input into the pre-trained license plate number recognition network, which outputs the target license plate number of the target vehicle in the image.
The license plate number recognition network can be any existing license plate number recognition network, a trained license plate number recognition network obtained from an open source website, or a license plate number recognition network obtained by self-training of a user.
The target license plate number can be used to determine the region to which the target vehicle belongs and, according to that region's driving rules, whether the driver region is on the left or the right side. The target license plate number can also serve as prior information for checking the final positioning result.
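To make this prior-information idea concrete, a toy lookup might map the plate's region to the driving-seat side. The region codes, the table, and the fallback default here are invented placeholders, not part of the patent:

```python
# Illustrative sketch only: mapping a recognized plate's region to the
# driving-seat side. The region table below is a placeholder assumption.
DRIVER_SIDE_BY_REGION = {
    "CN": "left",   # mainland China: driving seat on the left
    "HK": "right",  # Hong Kong: driving seat on the right
}

def driver_side(region_code):
    # Default to "left" for unknown regions (an arbitrary choice here).
    return DRIVER_SIDE_BY_REGION.get(region_code, "left")

print(driver_side("HK"))  # right
```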
103. Detect the vehicle front face in the vehicle image to be positioned through a pre-trained detection network to obtain the position of the target vehicle front face.
In this step, the vehicle front face includes at least one of a windshield, a hood, an air intake grille, and a bumper; in some vehicle models, such as trucks, the front face does not include a hood. The windshield, hood, air intake grille, and bumper all span the left and right sides of the vehicle front face and correspond to the driving and front passenger positions of the vehicle's front row. For example, the left and right areas of the windshield correspond to the driver's seat and the passenger seat, and the upper-left and upper-right areas of the hood, air intake grille, and bumper correspond to the driver's seat and the passenger seat respectively.
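The left/right correspondence can be sketched by splitting a detected front face box down the middle. The box format and midpoint split are illustrative assumptions, not from the patent:

```python
# Illustrative only: splitting a detected windshield box into the
# driver-side and passenger-side halves. (x1, y1, x2, y2) is assumed.
def split_windshield(box, driver_side="left"):
    x1, y1, x2, y2 = box
    xm = (x1 + x2) / 2  # vertical midline of the windshield
    left_half = (x1, y1, xm, y2)
    right_half = (xm, y1, x2, y2)
    return left_half if driver_side == "left" else right_half

print(split_windshield((100, 50, 300, 150), "left"))  # (100, 50, 200.0, 150)
```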
The detection network can correspondingly detect the position of at least one of the windshield, hood, air intake grille, and bumper. It can be obtained by constructing a general-purpose detection network and training it on a data set of at least one of the windshield, hood, air intake grille, and bumper.
In order to speed up the detection of the vehicle front face and thereby the positioning of the driver region, an embodiment of the invention provides a detection network including: a first convolution region, a plurality of second convolution regions connected in series after the first convolution region, and a fully connected layer. The vehicle image to be positioned is input into the first convolution region, and the plurality of second convolution regions are connected in series with one another. A convolution region may also be referred to as a feature extraction module.
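A minimal PyTorch sketch of this layout, under stated assumptions: the channel sizes, activations, the toy 8 × 8 input, and the simplification of each second convolution region to a plain 3 × 3 block are all choices made here for illustration, not taken from the patent:

```python
# Sketch of the described layout: one first convolution region, several
# second convolution regions in series, then a fully connected layer
# regressing a windshield box (x1, y1, x2, y2). Sizes are assumptions.
import torch
import torch.nn as nn

class WindshieldNet(nn.Module):
    def __init__(self, num_second_regions=6):
        super().__init__()
        # First convolution region: RGB in, 32-channel feature map out.
        self.first = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1, stride=1),
            nn.ReLU(inplace=True),
        )
        # Second convolution regions in series (placeholder blocks; the
        # patent's two-branch internals are simplified to plain 3x3 here).
        self.second = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True))
            for _ in range(num_second_regions)
        ])
        # Fully connected layer predicting the box coordinates.
        self.fc = nn.Linear(32 * 8 * 8, 4)

    def forward(self, x):  # x: (B, 3, 8, 8) for this toy size
        x = self.second(self.first(x))
        return self.fc(x.flatten(1))

net = WindshieldNet()
out = net(torch.zeros(2, 3, 8, 8))
print(out.shape)  # torch.Size([2, 4])
```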
The second convolution region includes a first convolutional layer, a second convolutional layer, a third convolutional layer, and a fourth convolutional layer. The third and fourth convolutional layers are connected in series to form a first branch network, the second convolutional layer alone forms a second branch network, and the output of the first convolutional layer is connected to the inputs of the first branch network and the second branch network respectively, as shown in fig. 3.
Specifically, since the left and right side regions of the windshield correspond to the driver region and the passenger region respectively, in the embodiment of the invention the detection of the vehicle front face is preferably the detection of the windshield. When detecting the windshield, the detection network above may also be referred to as a windshield detection network.
In order to increase the detection speed of the windshield, as shown in fig. 2, the detection method of the windshield includes:
201. Input the vehicle image to be positioned into the first convolution region for convolution calculation to obtain an input feature map with a preset dimensionality.
The vehicle image to be positioned is input into the first convolution region as three RGB channels, and each convolution kernel in the first convolution region extracts features from the three-channel image, giving an input feature map with a preset dimensionality. The preset dimensionality is the channel dimension, which equals the number of convolution kernels in the first convolution region.
In a possible embodiment, the first convolution region uses depth-separable convolution to perform the convolution calculation on the vehicle image to be positioned.
For example, the convolution over the three RGB channels may couple the channel dimension and the spatial dimension, or decouple them using depth-separable convolution. For an 8 × 8 × 3 vehicle image to be positioned, where 8 × 8 is the spatial size and 3 is the number of RGB channels, a coupled convolution uses 3 × 3 × 3 × 32 kernels: 3 × 3 is the kernel size, the next 3 is the number of kernel channels (one per RGB channel), and 32 is the number of kernels, with the channels of each kernel sharing a spatial weight pattern. Convolution yields an input feature map of 32 channels, and the total calculation amount is 3 × 3 × 3 × 32 = 864.
In the depth-separable convolution, the three RGB channels are first convolved spatially with 3 × 3 kernels, one per channel, giving three intermediate feature maps; these are then convolved across channels with 32 kernels of size 1 × 1 × 3, giving the 32-channel input feature map. Here 3 × 3 is the kernel size for the spatial step, 1 × 1 is the kernel size for the channel step, and 32 is the number of 1 × 1 kernels, each producing the feature map of one output channel. The total calculation amount is 3 × 3 × 3 + 1 × 1 × 3 × 32 = 123. Since the amount of calculation is reduced, the overall calculation speed is improved.
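The two weight counts above can be checked directly (this only verifies the arithmetic for the 8 × 8 × 3 example; it is not part of the patent):

```python
# Standard (coupled) convolution: 32 kernels of size 3 x 3 x 3.
standard = 3 * 3 * 3 * 32
# Depth-separable: one 3 x 3 kernel per input channel (spatial step),
# then 32 pointwise 1 x 1 x 3 kernels (channel step).
depthwise = 3 * 3 * 3
pointwise = 1 * 1 * 3 * 32
print(standard, depthwise + pointwise)  # 864 123
```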
Since the vehicle image to be positioned is a small image extracted by the vehicle detection network, the convolution in the first convolution region does not need multi-resolution calculation, i.e., up- or down-sampling: the input is padded with 1, and the sliding stride of the convolution kernel is 1.
202. Sequentially input the input feature map with the preset dimensionality into the plurality of second convolution regions for convolution calculation to obtain the target feature map.
In this step, the input feature map with the preset dimensionality is output by the first convolution region. The number of second convolution regions can be chosen to trade off detection accuracy against detection speed. Both also depend on the software and hardware environment and may differ across environments, so the specific number of second convolution regions should be chosen according to the actual conditions.
In the embodiment of the invention, the operating system of the experimental software environment is Ubuntu 14.04, the CPU of the experimental hardware environment is an Intel(R) Core(TM) i7-6700 at 3.4 GHz × 8 with 8 GB of memory, and the GPU is an NVIDIA(R) GTX 1080 Ti. The data set consists of traffic monitoring images: vehicle images are extracted by the vehicle detection network and automatically labeled with open-source labeling software, yielding a driver region positioning sample set with 3000 training images and 500 test images. The overall driver region positioning network is trained on the training set with different numbers of second convolution regions, and the accuracy and detection speed of each resulting network are given in Table 1:
| Number of second convolution regions | Average accuracy / % | Detection speed / (frames per second) |
| --- | --- | --- |
| 9 | 99.97 | 68 |
| 8 | 99.97 | 69 |
| 7 | 99.96 | 71 |
| 6 | 99.96 | 73 |
| 5 | 81.82 | 74 |

TABLE 1
As can be seen from the data in Table 1, when the number of second convolution regions is 6 or more, the average accuracy is already high and essentially saturated, while the detection speed decreases as the number of lightweight second convolution regions increases. When the number of second convolution regions is reduced to 5, the detection speed barely improves but the average accuracy drops sharply. Based on the data in Table 1, 6 second convolution regions connected in series are therefore preferred in the software and hardware environment provided in the embodiment of the invention.
Further, the second convolution region includes: a first convolutional layer, a second convolutional layer, a third convolutional layer and a fourth convolutional layer, the third convolutional layer and the fourth convolutional layer are connected in series to form a first branch network, the second convolutional layer is a single second branch network, and the output of the first convolutional layer is connected with the input of the first branch network and the input of the second branch network respectively, as shown in fig. 3.
Specifically, the first convolutional layer uses M convolution kernels of size 1 × 1 × N, where N is the number of channels of the input feature map, the channels of each kernel share a weight, and M is the number of 1 × 1 kernels; M determines the number of channels of the first feature map obtained through the first convolutional layer. For example, if the input feature map with the preset dimensionality is an 8 × 8 × N tensor, where the preset dimensionality N is the number of channels, then the convolution computes (8 × 8 × N) ∗ (1 × 1 × N × M) = 8 × 8 × M, i.e., the first feature map output by the first convolutional layer is an 8 × 8 × M tensor. Since N is greater than M, the output of the first convolutional layer is a dimension-reduced result: the number of channels of the first feature map is smaller than that of the input feature map. It should be noted that in the embodiment of the invention the preset dimensionality refers to a preset number of channels. Because the channel count of the first feature map is reduced relative to the input feature map, features are extracted while the subsequent calculation amount is cut down, further improving the running speed of the whole detection network.
After the 8 × 8 × M first feature map is obtained, it is input into the first branch network and the second branch network respectively for convolution calculation. In the first branch network, the first feature map passes through the third convolutional layer and then the fourth convolutional layer. The third convolutional layer uses single-channel 3 × 3 convolution kernels, M of them, matching the number of channels of the first feature map; each 3 × 3 kernel is convolved with one channel of the first feature map, separating the spatial information of the feature map from the channel information and producing M channels of 8 × 8 feature maps, which are then input into the fourth convolutional layer. The fourth convolutional layer uses K convolution kernels of size 1 × 1 × M, whose M channels share a weight. Its input is the output of the third convolutional layer, an 8 × 8 × M tensor, and the convolution computes (8 × 8 × M) ∗ (1 × 1 × M × K) = 8 × 8 × K; this 8 × 8 × K tensor is the third feature map output by the fourth convolutional layer. Since M is greater than K, the fourth convolutional layer outputs a dimension-reduced result: the number of channels of the third feature map is smaller than that of the first feature map.
In the second branch network, the convolution kernels in the second convolution layer are M × 1 × 1 × K kernels, where the M channels of each 1 × 1 convolution kernel share a weight and K is the number of 1 × 1 convolution kernels. The input feature map of the second convolution layer is the output of the first convolution layer, i.e., the first feature map, a tensor of 8 × 8 × M, and the convolution calculation is 8 × 8 × M · M × 1 × 1 × K = 8 × 8 × K, i.e., the second feature map output by the second convolution layer is a tensor of 8 × 8 × K. Since M is greater than K, the second convolution layer outputs a dimension-reduction result, that is, the number of channels of the second feature map is smaller than that of the first feature map. The number of channels of the second feature map is the same as that of the third feature map, that is, the number of 1 × 1 convolution kernels in the second convolution layer is the same as the number of 1 × 1 convolution kernels in the fourth convolution layer.
After the second feature map and the third feature map are obtained, they may be spliced in the channel dimension; for example, the third feature map (a tensor of 8 × 8 × K) and the second feature map (a tensor of 8 × 8 × K) are spliced in the channel dimension to obtain a target feature map that is a tensor of 8 × 8 × 2K. In this way, the input of the next second convolution region contains the outputs of the current second convolution region, realizing the reuse of each feature map.
In a possible embodiment, after the second feature map and the third feature map are obtained, the input feature map, the second feature map, and the third feature map may be spliced in the channel dimension. Alternatively, the input feature map, the first feature map, the second feature map, and the third feature map may be spliced in the channel dimension, thereby realizing the reuse of the feature maps.
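Either splicing variant is simply a concatenation along the channel axis; a NumPy sketch with illustrative sizes (N = 32, K = 8):

```python
import numpy as np

f_in = np.random.rand(8, 8, 32)  # input feature map, N = 32 (illustrative)
f2 = np.random.rand(8, 8, 8)     # second feature map, K = 8
f3 = np.random.rand(8, 8, 8)     # third feature map, K = 8

# Basic variant: second + third feature maps -> 8x8x2K target feature map
target = np.concatenate([f2, f3], axis=-1)

# Reuse variant: also carry the input feature map forward -> 8x8x(N + 2K)
target_reuse = np.concatenate([f_in, f2, f3], axis=-1)
print(target.shape, target_reuse.shape)  # (8, 8, 16) (8, 8, 48)
```

Carrying earlier maps forward in the concatenation is what lets later second convolution regions reuse features computed upstream without recomputing them.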
In the embodiment of the present invention, an input feature map with a preset dimension is input into the first convolution layer for calculation to obtain a first feature map, where the channel dimension of the first feature map is smaller than the preset dimension; the first feature map is input into the second convolution layer for calculation to obtain a second feature map; the first feature map is also input into the third convolution layer and the fourth convolution layer in sequence for calculation to obtain a third feature map, where the number of channels of the second feature map is the same as that of the third feature map; and the input feature map with the preset dimension, the second feature map, and the third feature map are spliced in the channel dimension to obtain a target feature map. Feature extraction is thus performed through the plurality of second convolution regions while the feature maps are dimension-reduced and reused. Furthermore, spatial convolution and channel convolution are performed in sequence by the third and fourth convolution layers, decoupling spatial information from channel information in the convolution process, which reduces the number of parameters and improves the calculation speed.
203. And inputting the target characteristic diagram into the full-connection layer for calculation to obtain the position of the target windshield.
In this step, because the output feature maps of the plurality of second convolution regions are spliced in the target feature map, the third feature map output by the last second convolution region may be selected from the target feature map as the final target feature map and input into the full connection layer plus Softmax for classification output, so as to obtain the position of the target windshield.
Steps 201, 202, and 203 describe only the embodiment for detecting the windshield; the other structures of the vehicle front face can be detected with the same detection network structure and steps by changing the sample data set and the number of second convolution regions, so the detection of those structures likewise benefits from the reduced parameter count and improved calculation speed.
104. And positioning the target driver area based on the number of the target license plate and the position of the front face of the target vehicle.
In this step, the front face of the target vehicle includes at least one of a windshield, an engine hood, an air intake grille, and a bumper; in some vehicle models, such as trucks, the vehicle front face does not have an engine hood.
Different regions have different driving rules; for example, the driver area of a mainland Chinese vehicle is on the right side as seen facing the front of the vehicle, while the driver area of a vehicle from Hong Kong, Macau, or certain other countries is on the left side. Therefore, the region to which the target vehicle belongs can be determined from the target license plate number, so as to determine whether the driver area of the target vehicle is on the right or the left. For example, when the target license plate number does not belong to Hong Kong, Macau, or such other countries, the driver area of the target vehicle may be determined to be on the right side; when the target license plate number belongs to Hong Kong or Macau, the driver area of the target vehicle is determined to be on the left side; and when the target license plate number belongs to another country, whether the driver area is on the left or the right is determined according to that country's driving rules.
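As a sketch of this region check: mainland-issued cross-border plates for Hong Kong/Macau vehicles use the 粤Z prefix and a trailing 港 or 澳 character, so a simple heuristic might look like the following (illustrative only — this is a stand-in for the patent's recognition network, and it ignores plates from other countries):

```python
def driver_side_from_plate(plate_number):
    """Return the driver area's side as seen by a camera facing the vehicle.

    Heuristic sketch: Hong Kong/Macau vehicles are right-hand drive, so
    the driver appears on the LEFT of a front-view image; mainland Chinese
    vehicles are left-hand drive, so the driver appears on the RIGHT.
    """
    if plate_number.startswith("粤Z") or plate_number.endswith(("港", "澳")):
        return "left"
    return "right"

print(driver_side_from_plate("粤B12345"))   # right (mainland plate)
print(driver_side_from_plate("粤Z1234港"))  # left (Hong Kong cross-border)
```

This is exactly the "prior information" role the license plate number plays here: a cheap lookup that halves the search space before any driver detection runs.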
The embodiment of the present invention takes the windshield as the front face of the target vehicle as an example: the position of the windshield is obtained after prediction by the windshield detection network, at which point only the left or right region of the windshield needs to be selected as the driver area, and the corresponding driver area is then extracted for subsequent driver behavior detection.
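Given the detected windshield bounding box and the side inferred from the license plate, extracting the driver area reduces to taking one half of the box. A minimal sketch — the coordinate convention and function name are assumptions for illustration, not the patent's API:

```python
def driver_region(windshield_box, side):
    """Take the left or right half of a windshield bounding box.

    windshield_box: (x1, y1, x2, y2) in image pixel coordinates;
    side: 'left' or 'right' in the image, as inferred from the plate.
    """
    x1, y1, x2, y2 = windshield_box
    mid = (x1 + x2) // 2
    if side == "left":
        return (x1, y1, mid, y2)
    return (mid, y1, x2, y2)

box = (100, 50, 300, 150)
print(driver_region(box, "right"))  # (200, 50, 300, 150)
```

The returned box can then be cropped from the original image and passed to a downstream driver behavior detector.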
Of course, when the positions of the engine hood, the air intake grille, or the bumper are detected by the detection network, the upper-left or upper-right area of the engine hood, air intake grille, or bumper can be selected as the target driver area according to the license plate number, and the corresponding driver area is then extracted for subsequent driver behavior detection. It should be noted that the upper-left and upper-right areas should cover the windshield area on the corresponding side so that the driver's behavior can be detected.
In the embodiment of the present invention, an image of a vehicle to be positioned is obtained; license plate number recognition is performed on the vehicle image through a pre-trained license plate number recognition network to obtain a target license plate number; vehicle front face detection is performed on the vehicle image through a pre-trained detection network to obtain the position of the front face of the target vehicle; and the target driver area is positioned based on the target license plate number and the position of the front face of the target vehicle. Because the target driver area is positioned by combining the license plate number with the position of the vehicle front face, rather than predicted directly, the license plate number serves as prior information for the target driver area, allowing the driver area to be located more accurately within the vehicle front face.
It should be noted that the method for locating a driver area provided by the embodiment of the present invention may be applied to a device such as a mobile phone, a monitor, a computer, and a server that needs to locate the driver area.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a driver area positioning device according to an embodiment of the present invention, and as shown in fig. 4, the device includes:
the acquisition module 401 is used for acquiring an image of a vehicle to be positioned;
the first processing module 402 is configured to perform license plate number recognition on the to-be-positioned vehicle image through a pre-trained license plate number recognition network to obtain a target license plate number;
the second processing module 403 is configured to perform vehicle front face detection on the to-be-positioned vehicle image through a pre-trained detection network, so as to obtain a position of a front face of a target vehicle;
and a positioning module 404, configured to position a target driver area based on the target license plate number and the position of the front face of the target vehicle.
Optionally, the front face of the vehicle includes a windshield, and the second processing module 403 is further configured to perform windshield detection on the vehicle image to be positioned through a pre-trained windshield detection network, so as to obtain a position of a target windshield;
the positioning module 404 is further configured to position a target driver area based on the target license plate number and the position of the target windshield.
Optionally, as shown in fig. 5, the pre-trained windshield detection network includes: a first convolution region, a plurality of second convolution regions, and a full connection layer, wherein the second processing module 403 includes:
the first calculation unit 4031 is used for inputting the vehicle image to be positioned into a first convolution area for convolution calculation to obtain an input feature map with preset dimensions;
a second calculation unit 4032, configured to sequentially input the input feature maps with the preset dimensions into the plurality of second convolution regions for convolution calculation, so as to obtain a target feature map;
and a third calculation unit 4033, configured to input the target feature map into the full-link layer for calculation, so as to obtain a position of the target windshield.
Optionally, as shown in fig. 6, the second convolution region includes: a first convolution layer, a second convolution layer, a third convolution layer, and a fourth convolution layer, and the second calculation unit 4032 includes:
a first calculating subunit 40321, configured to input the input feature map with the preset dimension into the first convolution layer for calculation to obtain a first feature map, where a channel dimension of the first feature map is smaller than the preset dimension;
a second calculation subunit 40322, configured to input the first feature map into the second convolution layer for calculation to obtain a second feature map; and
a third calculation subunit 40323, configured to sequentially input the first feature map into the third convolutional layer and a fourth convolutional layer for calculation, so as to obtain a third feature map, where the number of channels in the second feature map is the same as that in the third feature map;
and a splicing subunit 40324, configured to splice the second feature map and the third feature map in a channel dimension to obtain a target feature map.
Optionally, as shown in fig. 7, the positioning module 404 includes:
the judging unit 4041 is configured to judge a region to which the target license plate number belongs;
a first obtaining unit 4042, configured to obtain setting information of a vehicle driving seat in a region to which the target license plate number belongs, where the setting information includes setting information on the left or the right;
a positioning unit 4043, configured to position a target driver area according to the setting information and the position of the target windshield.
As shown in fig. 8, the obtaining module 401 includes:
a second obtaining unit 4011, configured to obtain an image to be detected;
the detection unit 4012 is configured to input the image to be detected to a vehicle detection network trained in advance, and detect to obtain a target vehicle;
and the extraction unit 4013 is configured to extract an image of the target vehicle as an image of the vehicle to be positioned.
It should be noted that the device for locating a driver area according to the embodiment of the present invention may be applied to a mobile phone, a monitor, a computer, a server, and other devices that need to locate a driver area.
The driver region positioning device provided by the embodiment of the invention can realize each process realized by the driver region positioning method in the method embodiment, and can achieve the same beneficial effects. To avoid repetition, further description is omitted here.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, as shown in fig. 9, including: a memory 902, a processor 901 and a computer program stored on the memory 902 and executable on the processor 901, wherein:
the processor 901 is used for calling the computer program stored in the memory 902 and executing the following steps:
obtaining an image of a vehicle to be positioned;
carrying out license plate number recognition on the vehicle image to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and positioning a target driver area based on the target license plate number and the position of the front face of the target vehicle.
Optionally, the front face of the vehicle includes a windshield, and the detecting, performed by the processor 901, of the front face of the vehicle of the to-be-positioned vehicle image through a pre-trained detection network to obtain the position of the front face of the target vehicle includes:
detecting the windshield of the vehicle image to be positioned through a pre-trained windshield detection network to obtain the position of a target windshield;
the positioning of a target driver area based on the target license plate number and the position of the front face of the target vehicle performed by the processor 901 includes:
and positioning a target driver area based on the target license plate number and the position of the target windshield.
Optionally, the pre-trained windshield detection network includes: a first convolution area, a plurality of second convolution areas, and a full connection layer, and the detecting, performed by the processor 901, of the windshield of the to-be-positioned vehicle image through the pre-trained windshield detection network to obtain the position of the target windshield includes:
inputting the vehicle image to be positioned into a first convolution area for convolution calculation to obtain an input feature map with preset dimensionality;
sequentially inputting the input feature maps with the preset dimensionality into the plurality of second convolution areas for convolution calculation to obtain target feature maps;
and inputting the target characteristic diagram into a full connection layer for calculation to obtain the position of the target windshield.
Optionally, the second convolution region includes: a first convolution layer, a second convolution layer, a third convolution layer, and a fourth convolution layer, and the sequentially inputting, performed by the processor 901, of the input feature map with the preset dimension into the plurality of second convolution areas for convolution calculation to obtain the target feature map includes:
inputting the input feature map with the preset dimension into the first convolution layer for calculation to obtain a first feature map, wherein the channel dimension of the first feature map is smaller than the preset dimension;
inputting the first characteristic diagram into the second convolution layer for calculation to obtain a second characteristic diagram; and
sequentially inputting the first characteristic diagram into the third convolutional layer and the fourth convolutional layer for calculation to obtain a third characteristic diagram, wherein the number of channels of the second characteristic diagram is the same as that of the channels of the third characteristic diagram;
and splicing the second characteristic diagram and the third characteristic diagram on the channel dimension to obtain a target characteristic diagram.
Optionally, the locating the target driver area based on the target license plate number and the position of the target windshield executed by the processor 901 includes:
judging the region of the target license plate number;
acquiring setting information of a vehicle driving position in the region of the target license plate number, wherein the setting information comprises setting information on the left or the right;
and positioning a target driver area according to the setting information and the position of the target windshield.
Optionally, the obtaining of the image of the vehicle to be positioned by the processor 901 includes:
acquiring an image to be detected;
inputting the image to be detected into a vehicle detection network trained in advance, and detecting to obtain a target vehicle;
and extracting an image of the target vehicle as an image of the vehicle to be positioned.
The electronic device may be a mobile phone, a monitor, a computer, a server, or another device that needs to locate a driver area.
The electronic equipment provided by the embodiment of the invention can realize each process realized by the method for positioning the driver region in the method embodiment, can achieve the same beneficial effects, and is not repeated here for avoiding repetition.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the method for locating a driver region according to the embodiment of the present invention, and can achieve the same technical effect, and is not described herein again to avoid repetition.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure is only a preferred embodiment of the present invention and is not intended to limit the scope of the claims; equivalent variations made according to the claims of the present invention still fall within the scope of the present invention.

Claims (10)

1. A driver area positioning method is characterized by comprising the following steps:
obtaining an image of a vehicle to be positioned;
carrying out license plate number recognition on the vehicle image to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and positioning a target driver area based on the target license plate number and the position of the front face of the target vehicle.
2. The method of claim 1, wherein the vehicle front face comprises a windshield, and the detecting the vehicle front face of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the target vehicle front face comprises:
detecting the windshield of the vehicle image to be positioned through a pre-trained windshield detection network to obtain the position of a target windshield;
the positioning a target driver area based on the target license plate number and the position of the front face of the target vehicle comprises:
and positioning a target driver area based on the target license plate number and the position of the target windshield.
3. The method of claim 2, wherein the pre-trained windshield detection network comprises: a first convolution area, a plurality of second convolution areas, and a full connection layer, and the detecting the windshield of the vehicle image to be positioned through the pre-trained windshield detection network to obtain the position of the target windshield comprises the following steps:
inputting the vehicle image to be positioned into a first convolution area for convolution calculation to obtain an input feature map with preset dimensionality;
sequentially inputting the input feature maps with the preset dimensionality into the plurality of second convolution areas for convolution calculation to obtain target feature maps;
and inputting the target characteristic diagram into a full connection layer for calculation to obtain the position of the target windshield.
4. The method of claim 3, wherein the second convolution area comprises: a first convolution layer, a second convolution layer, a third convolution layer, and a fourth convolution layer, and the sequentially inputting the input feature map with the preset dimension into the plurality of second convolution areas for convolution calculation to obtain the target feature map comprises the following steps:
inputting the input feature map with the preset dimension into the first convolution layer for calculation to obtain a first feature map, wherein the channel dimension of the first feature map is smaller than the preset dimension;
inputting the first characteristic diagram into the second convolution layer for calculation to obtain a second characteristic diagram; and
sequentially inputting the first characteristic diagram into the third convolutional layer and the fourth convolutional layer for calculation to obtain a third characteristic diagram, wherein the number of channels of the second characteristic diagram is the same as that of the channels of the third characteristic diagram;
and splicing the second characteristic diagram and the third characteristic diagram on the channel dimension to obtain a target characteristic diagram.
5. The method of claim 2, wherein locating a target driver zone based on the target license plate number and the position of the target windshield comprises:
judging the region of the target license plate number;
acquiring setting information of a vehicle driving position in the region of the target license plate number, wherein the setting information comprises setting information on the left or the right;
and positioning a target driver area according to the setting information and the position of the target windshield.
6. The method of claim 1, wherein said obtaining an image of a vehicle to be positioned comprises:
acquiring an image to be detected;
inputting the image to be detected into a vehicle detection network trained in advance, and detecting to obtain a target vehicle;
and extracting an image of the target vehicle as an image of the vehicle to be positioned.
7. A driver zone locating device, said device comprising:
the acquisition module is used for acquiring an image of the vehicle to be positioned;
the first processing module is used for carrying out license plate number recognition on the image of the vehicle to be positioned through a pre-trained license plate number recognition network to obtain a target license plate number;
the second processing module is used for detecting the front face of the vehicle of the image of the vehicle to be positioned through a pre-trained detection network to obtain the position of the front face of the target vehicle;
and the positioning module is used for positioning the target driver area based on the target license plate number and the position of the front face of the target vehicle.
8. The apparatus of claim 7, wherein the front face of the vehicle comprises a windshield, and the second processing module is further configured to perform windshield detection on the image of the vehicle to be positioned through a pre-trained windshield detection network to obtain a position of a target windshield;
the positioning module is further configured to position a target driver area based on the target license plate number and the position of the target windshield.
9. An electronic device, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the driver zone location method according to any of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the steps in the method for locating a driver's zone according to any one of claims 1 to 6.
CN202010082800.8A 2020-02-07 2020-02-07 Driver region positioning method and device, electronic equipment and storage medium Active CN113255395B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010082800.8A CN113255395B (en) 2020-02-07 2020-02-07 Driver region positioning method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010082800.8A CN113255395B (en) 2020-02-07 2020-02-07 Driver region positioning method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113255395A true CN113255395A (en) 2021-08-13
CN113255395B CN113255395B (en) 2024-06-11

Family

ID=77219293

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010082800.8A Active CN113255395B (en) 2020-02-07 2020-02-07 Driver region positioning method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113255395B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104680156A (en) * 2015-03-23 2015-06-03 山东农业大学 System and method for identifying unfastened state of safety belt in front row of motor vehicle based on machine vision
CN104966066A (en) * 2015-06-26 2015-10-07 武汉大学 Traffic block port monitoring oriented in-car human face detection method and system
CN106485224A (en) * 2016-10-13 2017-03-08 北京智芯原动科技有限公司 A kind of seatbelt wearing recognition methodss and device
CN106503673A (en) * 2016-11-03 2017-03-15 北京文安智能技术股份有限公司 A kind of recognition methodss of traffic driving behavior, device and a kind of video acquisition device
CN106682600A (en) * 2016-12-15 2017-05-17 深圳市华尊科技股份有限公司 Method and terminal for detecting targets
CN106778659A (en) * 2016-12-28 2017-05-31 深圳市捷顺科技实业股份有限公司 A kind of licence plate recognition method and device
CN107977643A (en) * 2017-12-18 2018-05-01 浙江工业大学 A kind of officer's car monitoring method based on road camera
WO2018112900A1 (en) * 2016-12-23 2018-06-28 深圳先进技术研究院 License plate recognition method and apparatus, and user equipment
CN110414451A (en) * 2019-07-31 2019-11-05 深圳市捷顺科技实业股份有限公司 It is a kind of based on end-to-end licence plate recognition method, device, equipment and storage medium
CN110427937A (en) * 2019-07-18 2019-11-08 浙江大学 A kind of correction of inclination license plate and random length licence plate recognition method based on deep learning


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHING-TANG HSIEH ET AL: "A real-time mobile vehicle license plate detection and recognition for vehicle monitoring and management", 《2009 JOINT CONFERENCES ON PERVASIVE COMPUTING》, 25 February 2010 (2010-02-25), pages 197 - 202 *
WU Tianshu et al.: "Driver seat belt detection combining YOLO detection and semantic segmentation", Journal of Computer-Aided Design & Computer Graphics, vol. 31, no. 1, pages 126 - 131 *
WU Tianshu: "Driver seat belt detection based on deep learning", China Master's Theses Full-text Database, Information Science and Technology, vol. 2019, no. 8, pages 138 - 818 *

Also Published As

Publication number Publication date
CN113255395B (en) 2024-06-11

Similar Documents

Publication Publication Date Title
CN113033604B (en) Vehicle detection method, system and storage medium based on SF-YOLOv4 network model
CN108986465B (en) Method, system and terminal equipment for detecting traffic flow
CN111709416B (en) License plate positioning method, device, system and storage medium
CN110222764A (en) Shelter target detection method, system, equipment and storage medium
CN110517261A (en) Seat belt status detection method, device, computer equipment and storage medium
CN106815574A (en) Set up detection model, detect the method and apparatus for taking mobile phone behavior
CN110889421A (en) Target detection method and device
CN115019279A (en) Context feature fusion method based on MobileNet lightweight network
CN111611918A (en) Traffic flow data set acquisition and construction method based on aerial photography data and deep learning
US11120308B2 (en) Vehicle damage detection method based on image analysis, electronic device and storage medium
CN111079634B (en) Method, device and system for detecting obstacle in running process of vehicle and vehicle
CN116597413A (en) Real-time traffic sign detection method based on improved YOLOv5
CN113191270B (en) Method and device for detecting throwing event, electronic equipment and storage medium
CN112348011B (en) Vehicle damage assessment method and device and storage medium
CN116843691A (en) Photovoltaic panel hot spot detection method, storage medium and electronic equipment
CN110059544B (en) Pedestrian detection method and system based on road scene
CN111931721A (en) Method and device for detecting color and number of annual inspection label and electronic equipment
CN113255395A (en) Driver region positioning method and device, electronic equipment and storage medium
KR102416714B1 (en) System and method for city-scale tree mapping using 3-channel images and multiple deep learning
CN103473567A (en) Vehicle detection method based on partial models
CN106446837A (en) Hand waving detection method based on motion historical images
CN112614156A (en) Training method and device for multi-target tracking network model and related equipment
CN103065332B (en) The detection method of greasy weather pedestrian rapid movement behavior and device
CN116989818B (en) Track generation method and device, electronic equipment and readable storage medium
CN112926414B (en) Image processing method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant