CN111931639B - Driver behavior detection method and device, electronic equipment and storage medium - Google Patents

Driver behavior detection method and device, electronic equipment and storage medium

Info

Publication number
CN111931639B
CN111931639B (application CN202010790208.3A)
Authority
CN
China
Prior art keywords
steering wheel
driver
detection result
determining
human hand
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010790208.3A
Other languages
Chinese (zh)
Other versions
CN111931639A (en)
Inventor
Wang Fei (王飞)
Qian Chen (钱晨)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Lingang Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority to CN202010790208.3A
Publication of CN111931639A
Priority to KR1020227003906A
Priority to JP2022523602A
Priority to PCT/CN2020/135501 (published as WO2022027894A1)
Application granted
Publication of CN111931639B
Legal status: Active

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/08 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to drivers or passengers
    • B60W40/09 Driving style or behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597 Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W30/00 Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units
    • B60W30/08 Active safety systems predicting or avoiding probable or impending collision or attempting to minimise its consequences
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/08 Interaction between the driver and the control system
    • B60W50/14 Means for informing the driver, warning the driver or prompting a driver intervention
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W2050/0001 Details of the control system
    • B60W2050/0002 Automatic control, details of type of controller or control system architecture
    • B60W2050/0004 In digital systems, e.g. discrete-time systems involving sampling
    • B60W2050/0005 Processor details or data handling, e.g. memory registers or chip architecture
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/08 Interaction between the driver and the control system
    • B60W50/14 Means for informing the driver, warning the driver or prompting a driver intervention
    • B60W2050/143 Alarm means

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Automation & Control Theory (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Mechanical Engineering (AREA)
  • Transportation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a driver behavior detection method and apparatus, an electronic device, and a storage medium. The method includes: acquiring an image to be detected of a driving position area in a vehicle cabin; detecting the image to be detected to obtain a target detection result, where the target detection result includes a steering wheel detection result and a human hand detection result; determining the driving behavior category of the driver according to the target detection result; and sending out warning information when the driving behavior category of the driver is dangerous driving.

Description

Driver behavior detection method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the technical field of deep learning, and in particular to a driver behavior detection method and apparatus, an electronic device, and a storage medium.
Background
With the rapid development of the automobile industry, vehicles have become an important means of transportation for daily travel, making safe vehicle operation one of the central concerns of the industry. Safe operation is determined by many factors, such as the driver's behavior, road conditions, and weather.
In general, dangerous driving behavior is one of the main causes of traffic accidents. Therefore, to improve driving safety and ensure the safety of passengers and drivers, the driving behavior of drivers can be detected.
Disclosure of Invention
In view of this, embodiments of the present disclosure at least provide a driver behavior detection method, apparatus, electronic device, and storage medium.
In a first aspect, the present disclosure provides a driver behavior detection method, including:
acquiring an image to be detected of a driving position area in a vehicle cabin;
detecting the image to be detected to obtain a target detection result, wherein the target detection result comprises a steering wheel detection result and a human hand detection result;
determining the driving behavior category of the driver according to the target detection result; and
sending out warning information when the driving behavior category of the driver is dangerous driving.
With this method, a target detection result, comprising a steering wheel detection result and a human hand detection result, is obtained by detecting the image to be detected of the acquired driving position area; the driving behavior category of the driver is determined from the target detection result; and warning information is sent out when the driving behavior category is dangerous driving. The driver's behavior is thereby detected and the driver can be given a timely safety reminder, improving the driving safety of the vehicle.
In one possible implementation, when the steering wheel detection result includes a steering wheel and the human hand detection result includes a human hand, determining the driving behavior category of the driver according to the target detection result includes:
determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result; and
determining the driving behavior category of the driver according to the positional relationship.
In one possible implementation, determining the driving behavior category of the driver according to the positional relationship includes:
determining that the driving behavior category of the driver is safe driving in the case that the positional relationship indicates that the driver holds the steering wheel.
In one possible implementation, determining the driving behavior category of the driver according to the positional relationship includes:
determining that the driving behavior category of the driver is dangerous driving in the case that the positional relationship indicates that both of the driver's hands are off the steering wheel.
In one possible implementation, when the steering wheel detection result includes a steering wheel and the human hand detection result does not include a human hand, determining the driving behavior category of the driver according to the target detection result includes:
determining that the driving behavior category of the driver is dangerous driving according to the target detection result.
In a possible implementation, detecting the image to be detected to obtain the target detection result includes:
generating an intermediate feature map corresponding to the image to be detected based on the image to be detected;
performing target convolution processing on the intermediate feature map at least once to generate detection feature maps of a plurality of channels corresponding to the intermediate feature map;
performing feature value conversion processing, using an activation function, on each feature value of the target channel feature map representing position among the detection feature maps of the plurality of channels, to generate a converted target channel feature map;
performing max pooling processing on the converted target channel feature map according to a preset pooling size and pooling stride to obtain a plurality of pooling values and a position index corresponding to each of the plurality of pooling values, the position index being used to identify the position of the pooling value in the converted target channel feature map;
generating target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values; and
determining the target detection result according to the target detection frame information.
In the above implementation, max pooling of the target channel feature map yields a plurality of pooling values and a position index corresponding to each, from which the target detection frame information is generated, providing data support for generating the target detection result.
In a possible implementation, generating the target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values includes:
in the case that at least one of the plurality of pooling values is greater than a set pooling threshold, determining, from the plurality of pooling values, the target pooling values belonging to the center points of target detection frames based on the plurality of pooling values and the pooling threshold; and
generating the target detection frame information based on the position indexes corresponding to the target pooling values.
In the above implementation, the pooling values greater than the pooling threshold are determined as target pooling values belonging to the center points of the target detection frames of the steering wheel or the driver's hands, so that the target detection frame information of the steering wheel or the driver's hands can be generated more accurately based on the position indexes corresponding to the target pooling values.
In a possible implementation, generating the target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values includes:
determining that the target detection frame information is empty in the case that every one of the plurality of pooling values is less than or equal to the set pooling threshold.
In one possible implementation, determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result includes:
in the case that the human hand detection result includes one human hand: if the detection frame corresponding to the human hand in the human hand detection result overlaps the detection frame corresponding to the steering wheel in the steering wheel detection result, determining that the positional relationship between the steering wheel and the human hand is that the driver holds the steering wheel; if the two detection frames do not overlap, determining that the positional relationship is that the driver's hands are off the steering wheel.
In one possible implementation, determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result includes:
in the case that the human hand detection result includes two human hands: if neither of the detection frames corresponding to the two human hands overlaps the detection frame corresponding to the steering wheel, determining that the positional relationship between the steering wheel and the human hands is that the driver's hands are off the steering wheel; if the detection frame corresponding to at least one human hand overlaps the detection frame corresponding to the steering wheel, determining that the positional relationship is that the driver holds the steering wheel.
In one possible implementation, determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result includes:
generating an intermediate feature map corresponding to the image to be detected based on the image to be detected;
performing convolution processing on the intermediate feature map at least once to generate a two-channel classification feature map corresponding to the intermediate feature map, wherein each channel feature map of the two-channel classification feature map corresponds to one hand-state category;
extracting, from the classification feature map, the two feature values at the feature position matching the center point position information indicated by the detection frame information corresponding to the human hand in the human hand detection result; selecting the maximum of the two feature values, and determining the category of the channel feature map in which the maximum feature value lies as the category corresponding to the center point position information; and
determining the positional relationship between the steering wheel and the human hand based on the category corresponding to each piece of center point position information indicated by the detection frame information corresponding to the human hand.
In the above implementation, the classification feature map generated by convolving the intermediate feature map at least once is combined with the determined steering wheel detection result and the generated center point position information of the driver's hands, so that the positional relationship between the steering wheel and the human hand can be determined more accurately.
In one possible implementation, determining the positional relationship between the steering wheel and the human hand based on the category corresponding to each piece of center point position information indicated by the detection frame information corresponding to the human hand includes:
in the case that the detection frame information corresponding to the human hand includes one piece of center point position information, determining the category corresponding to that center point position information as the positional relationship between the steering wheel and the human hand;
in the case that the detection frame information corresponding to the driver's hands includes two pieces of center point position information: if the categories corresponding to both pieces are that the driver's hand is off the steering wheel, determining that the positional relationship between the steering wheel and the hands is that the driver's hands are off the steering wheel; if the category corresponding to at least one piece is that the driver holds the steering wheel, determining that the positional relationship is that the driver holds the steering wheel.
For the effects of the apparatus, the electronic device, and the like described below, reference is made to the description of the method above; details are not repeated here.
In a second aspect, the present disclosure provides a driver behavior detection apparatus including:
the acquisition module is used for acquiring an image to be detected of a driving position area in the vehicle cabin;
the detection module is used for detecting the image to be detected to obtain a target detection result, wherein the target detection result comprises a steering wheel detection result and a human hand detection result;
the determining module is used for determining the driving behavior category of the driver according to the target detection result; and
the warning module is used for sending out warning information when the driving behavior category of the driver is dangerous driving.
In a third aspect, the present disclosure provides an electronic device comprising: a processor, a memory and a bus, the memory storing machine readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is running, the machine readable instructions when executed by the processor performing the steps of the driver behavior detection method as described in the first aspect or any of the embodiments above.
In a fourth aspect, the present disclosure provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the driver behaviour detection method as described in the first aspect or any of the embodiments above.
The foregoing objects, features and advantages of the disclosure will be more readily apparent from the following detailed description of the preferred embodiments taken in conjunction with the accompanying drawings.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for the embodiments are briefly described below. The drawings, which are incorporated in and constitute a part of the specification, show embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only certain embodiments of the present disclosure and are therefore not to be considered limiting of its scope; for a person of ordinary skill in the art, other related drawings can be obtained from these drawings without inventive effort.
Fig. 1 shows a flowchart of a driver behavior detection method provided by an embodiment of the present disclosure;
Fig. 2 shows a flowchart of a specific method, in the driver behavior detection method provided by an embodiment of the present disclosure, for detecting the image to be detected to obtain a target detection result;
Fig. 3 shows a schematic structural diagram of a driver behavior detection apparatus provided by an embodiment of the present disclosure;
Fig. 4 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, not all embodiments. The components of the embodiments of the present disclosure, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be made by those skilled in the art based on the embodiments of this disclosure without making any inventive effort, are intended to be within the scope of this disclosure.
Dangerous driving behavior is one of the main causes of traffic accidents. Therefore, to improve driving safety and ensure the safety of passengers and drivers, the driving behavior of drivers can be detected. With this in mind, embodiments of the present disclosure provide a driver behavior detection method.
To facilitate understanding of the embodiments of the present disclosure, the driver behavior detection method disclosed in the embodiments of the present disclosure is first described in detail.
Referring to Fig. 1, a flowchart of a driver behavior detection method provided by an embodiment of the present disclosure is shown. The method includes steps S101 to S104:
S101, acquiring an image to be detected of a driving position area in a vehicle cabin.
S102, detecting the image to be detected to obtain a target detection result, wherein the target detection result comprises a steering wheel detection result and a human hand detection result.
S103, determining the driving behavior category of the driver according to the target detection result.
S104, sending out warning information when the driving behavior category of the driver is dangerous driving.
According to this method, a target detection result, comprising a steering wheel detection result and a human hand detection result, is obtained by detecting the acquired image to be detected of the driving position area; the driving behavior category of the driver is determined according to the target detection result; and warning information is sent out when the driving behavior category is dangerous driving. The driver's behavior is thereby detected and the driver can be reminded in time, improving the driving safety of the vehicle.
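For illustration only, the following minimal Python sketch wires steps S101 to S104 together. All names (capture, detector, the (cx, cy, w, h) box layout) are assumptions introduced here, not part of this disclosure.

```python
# Illustrative sketch of S101-S104; names and the (cx, cy, w, h) box layout are assumptions.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h) of a detection frame

@dataclass
class TargetDetectionResult:
    steering_wheel: Optional[Box] = None             # None: no steering wheel detected
    hands: List[Box] = field(default_factory=list)   # empty: no human hand detected

def overlaps(a: Box, b: Box) -> bool:
    # True if the two axis-aligned detection frames share any overlapping area
    return abs(a[0] - b[0]) * 2 < a[2] + b[2] and abs(a[1] - b[1]) * 2 < a[3] + b[3]

def classify_behavior(result: TargetDetectionResult) -> str:
    if result.steering_wheel is None:
        return "abnormal"      # abnormal image: no steering wheel detected
    if not result.hands:
        return "dangerous"     # steering wheel present but no hands detected
    if any(overlaps(hand, result.steering_wheel) for hand in result.hands):
        return "safe"          # at least one hand on the steering wheel
    return "dangerous"         # all detected hands off the steering wheel

def monitor_once(capture, detector) -> None:
    image = capture()                      # S101: image of the driving position area
    result = detector(image)               # S102: steering wheel + human hand detection
    behavior = classify_behavior(result)   # S103: driving behavior category
    if behavior == "dangerous":            # S104: send out warning information
        print("Danger: please hold the steering wheel")
```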
For S101:
Here, an image pickup device may be installed in the vehicle cabin, and the image to be detected of the driving position area may be acquired in real time by this device. The image pickup device may be mounted at a position from which the steering wheel and the driver's seat area within the driving position area can be captured.
For S102 and S103:
Here, the image to be detected may be input into the trained neural network for detection to obtain the target detection result, which includes a steering wheel detection result and a human hand detection result. The steering wheel detection result includes information on whether a steering wheel is present in the image to be detected, and, when a steering wheel is present, the detection frame information of the steering wheel; the human hand detection result includes information on whether a human hand is present in the image to be detected, and, when a human hand is present, the detection frame information of the human hand.
In an alternative implementation, referring to Fig. 2, detecting the image to be detected to obtain the target detection result may include:
S201, generating an intermediate feature map corresponding to the image to be detected based on the image to be detected.
S202, performing target convolution processing on the intermediate feature map at least once to generate detection feature maps of a plurality of channels corresponding to the intermediate feature map.
S203, performing feature value conversion processing, using the activation function, on each feature value of the target channel feature map representing position among the detection feature maps of the plurality of channels, to generate a converted target channel feature map.
S204, performing max pooling processing on the converted target channel feature map according to a preset pooling size and pooling stride to obtain a plurality of pooling values and a position index corresponding to each of the plurality of pooling values; the position index is used to identify the position of the pooling value in the converted target channel feature map.
S205, generating target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values.
S206, determining the target detection result according to the target detection frame information.
In this implementation, max pooling of the target channel feature map yields a plurality of pooling values and a position index corresponding to each, from which the target detection frame information is generated, providing data support for generating the target detection result.
The image to be detected can be input into the trained neural network, and a backbone network in the neural network performs multiple convolution operations on it to generate the intermediate feature map corresponding to the image to be detected. The backbone network structure can be set according to actual requirements.
Here, the intermediate feature map may be input to a steering wheel detection branch network and a human hand detection branch network of the neural network, respectively, to generate the steering wheel detection result and the human hand detection result. The generation of the steering wheel detection result is described in detail below.
Here, the intermediate feature map may first be subjected to first convolution processing (i.e., the target convolution processing) at least once to generate a detection feature map of a plurality of channels corresponding to the steering wheel; the number of channels may be three. The detection feature map includes a first channel feature map representing position (this first channel feature map is the target channel feature map), a second channel feature map representing the length of the detection frame, and a third channel feature map representing the width of the detection frame.
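As a hedged illustration of such a three-channel detection branch (layer widths and the input channel count are assumptions; only the three output channels and the sigmoid conversion described next come from the text), a PyTorch-style head might look like this:

```python
import torch
import torch.nn as nn

class WheelDetectionHead(nn.Module):
    """Sketch of the three-channel detection branch: channel 0 is the target
    channel (center-point heatmap), channels 1 and 2 regress the detection
    frame's length and width. Layer widths are illustrative assumptions."""

    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=1),   # -> [B, 3, H, W]
        )

    def forward(self, intermediate: torch.Tensor) -> torch.Tensor:
        out = self.body(intermediate)
        heat = torch.sigmoid(out[:, 0:1])      # converted target channel, values in (0, 1)
        size = out[:, 1:3]                     # length and width channels
        return torch.cat([heat, size], dim=1)
```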
Then, feature value conversion processing is performed, using an activation function, on the target channel feature map representing position among the detection feature maps of the plurality of channels, to generate a converted target channel feature map in which each feature value is a number between 0 and 1. The activation function may be a sigmoid function. For any feature point in the converted target channel feature map, the closer its feature value is to 1, the greater the probability that the feature point is the center point of the steering wheel's detection frame.
Then, max pooling processing is performed on the converted target channel feature map according to a preset pooling size and pooling stride to obtain a pooling value corresponding to each feature position in the target channel feature map and a position index corresponding to each pooling value; the position index identifies the position of the pooling value in the converted target channel feature map. Identical position indexes are then merged to obtain a plurality of pooling values of the target channel feature map and the position index corresponding to each. The preset pooling size and pooling stride can be set according to actual needs; for example, the pooling size may be 3×3 and the pooling stride may be 1.
Further, the first detection frame information (i.e., the target detection frame information) corresponding to the steering wheel may be generated based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values.
For example, the target channel feature map may be subjected to 3×3 max pooling with a stride of 1. In pooling, for the feature values of every 3×3 group of feature points in the target channel feature map, the maximum response value (i.e., the pooling value) among those feature points and the position index of that maximum on the target channel feature map are determined. The number of maximum response values is therefore related to the size of the target channel feature map; for example, if the target channel feature map is 80×60, a total of 80×60 maximum response values are obtained after max pooling, and for each maximum response value there may be at least one other maximum response value with the same position index. Merging the maximum response values that share a position index yields M maximum response values and the position index corresponding to each. Finally, the first detection frame information corresponding to the steering wheel is generated based on the M maximum response values (pooling values) and the position index corresponding to each.
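Under the assumption of a PyTorch tensor layout, the pooling-and-merging step above is equivalent to the standard local-maximum trick, sketched below; this is an illustration, not the literal implementation of this disclosure.

```python
import torch
import torch.nn.functional as F

def extract_peaks(heat: torch.Tensor, kernel: int = 3) -> torch.Tensor:
    """heat: converted target channel map of shape [B, 1, H, W].
    3x3 max pooling with stride 1 assigns each position the maximum response
    of its neighborhood; keeping only the positions whose own value equals
    that maximum merges all pooling values sharing a position index, leaving
    the M candidate center points as the non-zero entries."""
    pooled = F.max_pool2d(heat, kernel_size=kernel, stride=1, padding=kernel // 2)
    keep = (pooled == heat).to(heat.dtype)   # 1 at local maxima, 0 elsewhere
    return heat * keep
```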
The process of determining the second detection frame information corresponding to the human hand is the same as the process of determining the first detection frame information corresponding to the steering wheel, and is not repeated here.
After the first detection frame information corresponding to the steering wheel is obtained, it may be determined as the steering wheel detection result; when no first detection frame information is obtained, the steering wheel detection result is determined not to include a steering wheel. Likewise, after the second detection frame information corresponding to the human hand is obtained, it is determined as the human hand detection result; when no second detection frame information is obtained, the human hand detection result is determined not to include a human hand.
In an alternative implementation, generating the target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values may include:
A1, in the case that at least one of the plurality of pooling values is greater than a set pooling threshold, determining, from the plurality of pooling values, the target pooling values belonging to the center points of target detection frames based on the plurality of pooling values and the pooling threshold.
A2, generating the target detection frame information based on the position indexes corresponding to the target pooling values.
Continuing with the steering wheel example, a pooling threshold may be set. If at least one of the plurality of pooling values is greater than the set pooling threshold, the pooling values are filtered based on it to obtain the target pooling values, i.e., those pooling values greater than the threshold. If every one of the pooling values is less than or equal to the set pooling threshold, there is no target pooling value, i.e., there is no first detection frame information for the steering wheel.
Further, the center point position information of the first detection frame corresponding to the steering wheel may be generated based on the position indexes corresponding to the target pooling values. The pooling threshold corresponding to the steering wheel and the pooling threshold corresponding to the driver's hands may be the same or different, and each can be determined according to actual conditions. For example, multiple frames of sample images captured by the image pickup device corresponding to the image to be detected may be acquired, and the two pooling thresholds generated from the acquired sample images using an adaptive algorithm.
Continuing the above example, after the M maximum response values and the position index corresponding to each are obtained, each maximum response value is compared with the pooling threshold; when a maximum response value is greater than the pooling threshold, it is determined as a target pooling value. The position index corresponding to a target pooling value is then the center point position information of the steering wheel's first detection frame.
Alternatively, the max pooling processing may be applied directly to the target channel feature map before conversion to obtain the center point position information of the steering wheel's first detection frame.
After the center point position information of the steering wheel's first detection frame is obtained, the second feature value at the feature position matching the center point position information can be selected from the second channel feature map and determined as the length of the steering wheel's first detection frame; likewise, the third feature value at the matching feature position is selected from the third channel feature map and determined as the width of the first detection frame. This yields the size information of the steering wheel's first detection frame.
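Combining the threshold filtering with the size lookup from the second and third channels, a possible decoding routine is sketched below; the threshold value and tensor layout are assumptions, and extract_peaks is the sketch given earlier.

```python
import torch
from typing import List, Tuple

def decode_detection_frames(det_map: torch.Tensor,
                            pooling_threshold: float = 0.3
                            ) -> List[Tuple[int, int, float, float]]:
    """det_map: [3, H, W]; channel 0 is the converted target channel, channels
    1 and 2 hold the detection frame's length and width. Returns a list of
    (x, y, length, width) tuples; an empty list means the detection frame
    information is empty."""
    heat, length, width = det_map[0], det_map[1], det_map[2]
    peaks = extract_peaks(heat[None, None])[0, 0]        # candidate center points
    ys, xs = torch.nonzero(peaks > pooling_threshold, as_tuple=True)
    return [(int(x), int(y), float(length[y, x]), float(width[y, x]))
            for y, x in zip(ys.tolist(), xs.tolist())]
```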
For the driver's hands, one or two pieces of second detection frame information may be obtained, i.e., second detection frame information corresponding to the left hand and/or the right hand. Specifically, the process of determining the second detection frame information corresponding to the driver's hands may follow the process of determining the steering wheel's first detection frame information, and is not repeated here.
In the above implementation, the pooling values greater than the pooling threshold are determined as target pooling values belonging to the center points of the target detection frames of the steering wheel or the driver's hands, so that the target detection frame information of the steering wheel or the driver's hands can be generated more accurately based on the position indexes corresponding to the target pooling values.
In an alternative implementation, generating the target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values includes: determining that the target detection frame information is empty in the case that every one of the plurality of pooling values is less than or equal to the set pooling threshold.
Here, when every pooling value corresponding to the steering wheel is less than or equal to the set pooling threshold, the first detection frame information of the steering wheel is determined to be empty; when at least one pooling value corresponding to the steering wheel is greater than the set pooling threshold, the first detection frame information of the steering wheel is determined not to be empty.
After the steering wheel detection result and the human hand detection result are obtained, the driving behavior category of the driver can be determined based on them.
In one possible implementation, when the steering wheel detection result includes a steering wheel and the human hand detection result includes a human hand, determining the driving behavior category of the driver according to the target detection result includes:
determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result; and
determining the driving behavior category of the driver according to the positional relationship.
Here, the positional relationship between the steering wheel and the human hand may be determined based on the steering wheel detection result and the human hand detection result, and the driving behavior category of the driver, i.e., whether the driver is driving safely or dangerously, may be determined based on the determined positional relationship.
In one possible implementation, determining the driving behavior category of the driver according to the positional relationship includes: determining that the driving behavior category of the driver is safe driving in the case that the positional relationship indicates that the driver holds the steering wheel.
Here, when the positional relationship indicates that the driver holds the steering wheel, the driving behavior category of the driver is determined to be safe driving. Cases in which the driver holds the steering wheel include holding it with the left hand, with the right hand, or with both hands.
In one possible implementation, determining the driving behavior category of the driver according to the positional relationship includes: determining that the driving behavior category of the driver is dangerous driving in the case that the positional relationship indicates that both of the driver's hands are off the steering wheel.
Here, when the positional relationship indicates that both of the driver's hands are off the steering wheel, the driving behavior category of the driver is determined to be dangerous driving.
In one possible implementation, when the steering wheel detection result includes a steering wheel and the human hand detection result does not include a human hand, determining the driving behavior category of the driver according to the target detection result includes: determining that the driving behavior category of the driver is dangerous driving according to the target detection result.
Here, if a steering wheel is detected in the steering wheel detection result but no human hand is detected in the human hand detection result, both of the driver's hands are off the steering wheel, and the driving behavior category of the driver is determined to be dangerous driving.
As an example, if no steering wheel is detected in the steering wheel detection result, the image to be detected is determined to be an abnormal image, and the driving behavior category of the driver is accordingly determined to be abnormal.
In one possible implementation, determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result includes: in the case that the human hand detection result includes one human hand, if the detection frame corresponding to the human hand overlaps the detection frame corresponding to the steering wheel in the steering wheel detection result, determining that the positional relationship between the steering wheel and the human hand is that the driver holds the steering wheel; if the two detection frames do not overlap, determining that the positional relationship is that the driver's hands are off the steering wheel.
In one possible implementation, determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result includes: in the case that the human hand detection result includes two human hands, if neither of the detection frames corresponding to the two human hands overlaps the detection frame corresponding to the steering wheel, determining that the positional relationship is that the driver's hands are off the steering wheel; if the detection frame corresponding to at least one human hand overlaps the detection frame corresponding to the steering wheel, determining that the positional relationship is that the driver holds the steering wheel.
Here, the positional relationship between the steering wheel and the human hand may be determined using the detection frame corresponding to the steering wheel in the steering wheel detection result and the detection frame(s) corresponding to the human hand(s) in the human hand detection result.
When the human hand detection result includes one human hand: if the hand's detection frame overlaps the steering wheel's detection frame, the positional relationship between the steering wheel and the human hand is determined to be that the hand holds the steering wheel; if there is no overlapping area between the two detection frames, the positional relationship is determined to be that the hand is off the steering wheel.
When the human hand detection result includes two human hands: if the detection frame of at least one hand overlaps the steering wheel's detection frame, the positional relationship is determined to be that a hand holds the steering wheel; if neither hand's detection frame overlaps the steering wheel's detection frame, the positional relationship is determined to be that the hands are off the steering wheel.
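Both cases reduce to a single rule: the driver holds the steering wheel if and only if at least one hand's detection frame overlaps the steering wheel's detection frame. A hedged sketch, reusing the overlaps helper from the earlier pipeline sketch:

```python
def relationship_from_overlap(hand_boxes, wheel_box) -> str:
    """Covers both the one-hand and two-hand cases above: any overlap between
    a hand's detection frame and the steering wheel's detection frame means
    the driver holds the steering wheel; otherwise the hands are off it."""
    if any(overlaps(hand, wheel_box) for hand in hand_boxes):
        return "driver holds the steering wheel"
    return "driver's hands are off the steering wheel"
```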
In an alternative implementation, determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result may include:
generating an intermediate feature map corresponding to the image to be detected based on the image to be detected;
performing convolution processing on the intermediate feature map at least once to generate a two-channel classification feature map corresponding to the intermediate feature map, where each channel feature map of the two-channel classification feature map corresponds to one hand-state category;
extracting, from the classification feature map, the two feature values at the feature position matching the center point position information indicated by the detection frame information corresponding to the human hand in the human hand detection result; selecting the maximum of the two feature values, and determining the category of the channel feature map in which the maximum feature value lies as the category corresponding to the center point position information; and
determining the positional relationship between the steering wheel and the human hand based on the category corresponding to each piece of center point position information indicated by the detection frame information corresponding to the human hand.
Here, when the second detection frame information indicated by the human hand detection result is not empty, convolution processing may be performed on the intermediate feature map at least once to generate the two-channel classification feature map corresponding to the intermediate feature map. Each channel feature map of the two-channel classification feature map corresponds to one hand-state category; for example, the category corresponding to the channel feature map of channel 0 may be that the driver's hand is off the steering wheel, and the category corresponding to the channel feature map of channel 1 may be that the driver holds the steering wheel.
Further, based on the center point position information indicated by the detection frame information corresponding to the hand, the two feature values at the feature position matching the center point position information can be extracted from the classification feature map; the maximum of the two feature values is selected, and the category of the channel feature map in which that maximum lies is determined as the category corresponding to the center point position information.
When the detection frame information corresponding to the hands includes two pieces of center point position information (i.e., center point position information corresponding to the left hand and center point position information corresponding to the right hand), a category is determined for each piece of center point position information separately.
For example, suppose the category corresponding to the channel feature map of channel 0 is that the driver's hand is off the steering wheel, and the category corresponding to the channel feature map of channel 1 is that the driver holds the steering wheel. To determine the category of the center point position information corresponding to the left hand, the two feature values at the matching position, say 0.8 and 0.2, are extracted from the classification feature map; since the maximum value 0.8 lies in the channel-0 feature map, the category of the center point position information corresponding to the left hand is that the driver's hand is off the steering wheel. The category of the center point position information corresponding to the right hand can be obtained in the same way.
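A sketch of this per-hand category lookup follows; the tensor layout and the channel-to-category assignment mirror the example above and are otherwise assumptions.

```python
import torch
from typing import Tuple

def hand_state_category(cls_map: torch.Tensor, center: Tuple[int, int]) -> int:
    """cls_map: two-channel classification feature map [2, H, W]; channel 0 =
    hand off the steering wheel, channel 1 = hand holding the steering wheel.
    Returns the channel whose feature value at the hand's center point is
    larger, e.g. values (0.8, 0.2) -> category 0 (hand off the wheel)."""
    x, y = center
    v_off, v_hold = float(cls_map[0, y, x]), float(cls_map[1, y, x])
    return 0 if v_off >= v_hold else 1
```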
In the above implementation, the classification feature map generated by performing convolution processing on the intermediate feature map at least once is combined with the determined steering wheel detection result and the generated center point position information of the driver's hands, so that the positional relationship between the steering wheel and the human hand can be determined more accurately.
In an alternative implementation, determining the positional relationship between the steering wheel and the human hand based on the category corresponding to each piece of center point position information indicated by the detection frame information corresponding to the human hand includes the following two cases.
In the first case, when the detection frame information corresponding to the hand includes one piece of center point position information, the category corresponding to that center point position information is determined as the positional relationship between the steering wheel and the human hand.
In the second case, when the detection frame information corresponding to the hands includes two pieces of center point position information: if the categories corresponding to both pieces are that the driver's hand is off the steering wheel, the positional relationship between the steering wheel and the hands is determined to be that the driver's hands are off the steering wheel; if the category corresponding to at least one piece is that the driver holds the steering wheel, the positional relationship is determined to be that the driver holds the steering wheel.
For the first case, the detection frame information corresponding to the human hand includes one piece of center point position information, i.e., center point position information corresponding to either the left hand or the right hand; the category corresponding to that piece of center point position information is determined as the positional relationship between the steering wheel and the human hand. For example, if the detection frame information includes center point position information corresponding to the left hand and its category is that the driver holds the steering wheel, the positional relationship between the steering wheel and the human hand is that the driver holds the steering wheel.
For the second case, the detection frame information corresponding to the hands includes two pieces of center point position information, i.e., center point position information corresponding to the left hand and to the right hand. When the categories corresponding to both pieces are that the driver's hand is off the steering wheel, the positional relationship between the steering wheel and the hands is determined to be that the driver's hands are off the steering wheel; when the category of the center point position information corresponding to the left hand and/or the right hand is that the driver holds the steering wheel, the positional relationship is determined to be that the driver holds the steering wheel.
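Folding the two cases into code (category encoding as in the previous sketch; an illustration only): with one center point its category decides directly, and with two center points "holds" wins if either hand holds the wheel.

```python
from typing import List

OFF_WHEEL, HOLDS_WHEEL = 0, 1   # category encoding from the previous sketch

def relationship_from_categories(categories: List[int]) -> str:
    """categories: one or two per-hand categories, one per center point."""
    if any(category == HOLDS_WHEEL for category in categories):
        return "driver holds the steering wheel"
    return "driver's hands are off the steering wheel"
```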
For S104:
Here, when the driving behavior category of the driver is determined to be dangerous driving, warning information for the driver may be generated based on that category. The warning information may be played back as voice audio; for example, the generated warning information may be "Dangerous, please hold the steering wheel".
It will be appreciated by those skilled in the art that, in the methods of the above specific embodiments, the order in which the steps are written does not imply a strict order of execution; the actual order of execution should be determined by the functions of the steps and their possible internal logic.
Based on the same concept, an embodiment of the disclosure further provides a driver behavior detection device. Referring to fig. 3, which is a schematic structural diagram of the driver behavior detection device provided by an embodiment of the disclosure, the device includes an acquisition module 301, a detection module 302, a determining module 303, and a warning module 304, specifically:

The acquisition module 301 is configured to acquire an image to be detected corresponding to a driving position area in the vehicle cabin;
The detection module 302 is configured to detect the image to be detected to obtain a target detection result, where the target detection result includes a steering wheel detection result and a human hand detection result;
The determining module 303 is configured to determine a driving behavior category of the driver according to the target detection result;
The warning module 304 is configured to send out warning information when the driving behavior category of the driver is dangerous driving.
In a possible implementation manner, when the steering wheel detection result includes a steering wheel and the human hand detection result includes a human hand, the determining module 303 is configured to, when determining the driving behavior category of the driver according to the target detection result:
Determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result;

and determining the driving behavior category of the driver according to the positional relationship.
In a possible implementation manner, the determining module 303 is configured to, when determining the driving behavior category of the driver according to the positional relationship:

determining that the driving behavior category of the driver is safe driving when the positional relationship indicates that the driver holds the steering wheel.

In a possible implementation manner, the determining module 303 is configured to, when determining the driving behavior category of the driver according to the positional relationship:

determining that the driving behavior category of the driver is dangerous driving when the positional relationship indicates that the driver's hands are separated from the steering wheel.

In a possible implementation manner, when the steering wheel detection result includes a steering wheel and the human hand detection result includes no human hand, the determining module 303 is configured to, when determining the driving behavior category of the driver according to the target detection result:

determining that the driving behavior category of the driver is dangerous driving according to the target detection result.
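The mapping from these conditions to a behavior category is simple enough to state as a short sketch; the category strings below are illustrative assumptions consistent with the sketch given earlier.

```python
# Hedged sketch of the rules above: holding the wheel maps to safe
# driving; hands separated from the wheel, or no hand detected while a
# steering wheel is detected, maps to dangerous driving.
def driving_behavior(relationship):
    # relationship is None when the hand detection result contains no
    # hand although the steering wheel detection result contains one.
    if relationship == "driver_holds_steering_wheel":
        return "safe_driving"
    return "dangerous_driving"
```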
In a possible implementation manner, the detection module 302 is configured to, when detecting the image to be detected to obtain a target detection result:
Generating an intermediate feature map corresponding to the image to be detected based on the image to be detected;
Performing target convolution processing at least once on the intermediate feature map to generate detection feature maps of a plurality of channels corresponding to the intermediate feature map;

performing feature value conversion processing, using an activation function, on each feature value of the target channel feature map that represents position among the detection feature maps of the plurality of channels, to generate a converted target channel feature map;

performing max pooling processing on the converted target channel feature map according to a preset pooling size and pooling stride to obtain a plurality of pooling values and a position index corresponding to each of the plurality of pooling values, where the position index identifies the position of the pooling value in the converted target channel feature map;

generating target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values;

and determining the target detection result according to the target detection frame information.
In a possible implementation manner, the detection module 302 is configured to, when generating the target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values:

determining, when at least one of the plurality of pooling values is greater than a set pooling threshold, a target pooling value corresponding to the center point of the target detection frame from the plurality of pooling values based on the pooling threshold;

and generating the target detection frame information based on the position index corresponding to the target pooling value.
In a possible implementation manner, the detection module 302 is configured to, when generating the target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values:

determining that the target detection frame information is empty when each of the plurality of pooling values is less than or equal to the set pooling threshold.
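As a concrete reading of this pipeline, the following PyTorch sketch extracts candidate center points from the position channel by activation, max pooling with position indices, and thresholding. PyTorch itself, the sigmoid activation, and the default pooling parameters are assumptions for illustration; the disclosure does not prescribe them.

```python
import torch
import torch.nn.functional as F

def extract_center_points(target_channel_map, pool_size=3, pool_stride=1,
                          pool_threshold=0.3):
    """Sketch of the pooling-based extraction described above.
    target_channel_map: the position channel of the detection feature
    maps, shape [H, W]. Returns (row, col, score) candidate center
    points, or an empty list when no pooling value exceeds the
    threshold (the target detection frame information is empty)."""
    # Feature value conversion with an activation function (sigmoid
    # assumed), mapping responses into (0, 1) confidence scores.
    heatmap = torch.sigmoid(target_channel_map)[None, None]  # [1,1,H,W]

    # Max pooling that also returns, for each pooling value, the index
    # of its position in the converted target channel feature map.
    pooled, index = F.max_pool2d(
        heatmap, kernel_size=pool_size, stride=pool_stride,
        padding=pool_size // 2, return_indices=True)

    keep = pooled.flatten() > pool_threshold
    if not keep.any():
        return []  # no pooling value above the set threshold

    width = heatmap.shape[-1]
    centers, seen = [], set()
    for idx, score in zip(index.flatten()[keep].tolist(),
                          pooled.flatten()[keep].tolist()):
        if idx in seen:  # neighbouring windows can share one maximum
            continue
        seen.add(idx)
        row, col = divmod(idx, width)  # recover position from the index
        centers.append((row, col, score))
    return centers
```

The target detection frame information would then be assembled from these center points together with predicted box sizes, a detail this sketch omits.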
In a possible implementation manner, the determining module 303 is configured to, when determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result:
Under the condition that the human hand detection result includes one human hand: if the detection frame corresponding to the human hand in the human hand detection result and the detection frame corresponding to the steering wheel in the steering wheel detection result have an overlapping area, determining that the positional relationship between the steering wheel and the human hand is that the driver holds the steering wheel; if the two detection frames have no overlapping area, determining that the positional relationship between the steering wheel and the human hand is that the driver's hands are separated from the steering wheel.
In a possible implementation manner, the determining module 303 is configured to, when determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result:
under the condition that the human hand detection result includes two human hands: if neither of the detection frames corresponding to the two human hands in the human hand detection result overlaps the detection frame corresponding to the steering wheel in the steering wheel detection result, determining that the positional relationship between the steering wheel and the human hand is that the driver's hands are separated from the steering wheel; if at least one detection frame corresponding to a human hand has an overlapping area with the detection frame corresponding to the steering wheel, determining that the positional relationship between the steering wheel and the human hand is that the driver holds the steering wheel.
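The overlap test itself reduces to an axis-aligned rectangle intersection. The sketch below assumes (x1, y1, x2, y2) box coordinates, which the disclosure does not specify.

```python
# Hedged sketch of the overlap-based rule above: the driver holds the
# wheel if at least one hand detection frame overlaps the steering
# wheel detection frame. Box format (x1, y1, x2, y2) is assumed.
def boxes_overlap(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2

def relationship_from_boxes(hand_boxes, wheel_box):
    # hand_boxes: one or two detection frames from the hand detection
    # result; wheel_box: the steering wheel detection frame.
    if any(boxes_overlap(hand, wheel_box) for hand in hand_boxes):
        return "driver_holds_steering_wheel"
    return "driver_hands_separated_from_steering_wheel"
```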
In a possible implementation manner, the determining module 303 is configured to, when determining the positional relationship between the steering wheel and the human hand according to the steering wheel detection result and the human hand detection result:
Generating an intermediate feature map corresponding to the image to be detected based on the image to be detected;
Performing convolution processing on the intermediate feature map at least once to generate a two-channel classification feature map corresponding to the intermediate feature map, where each channel feature map in the two-channel classification feature map corresponds to one hand category;

extracting, based on the center point position information indicated by the detection frame information corresponding to the human hand in the human hand detection result, the two feature values at the feature position matching the center point position information from the classification feature map; selecting the maximum feature value from the two feature values, and determining the category of the channel feature map corresponding to the maximum feature value in the classification feature map as the category corresponding to the center point position information;

and determining the positional relationship between the steering wheel and the human hand based on the category corresponding to each piece of center point position information indicated by the detection frame information corresponding to the human hand.
In a possible implementation manner, the determining module 303 is configured to, when determining the positional relationship between the steering wheel and the human hand based on the category corresponding to each center point position information indicated by the detection frame information corresponding to the human hand:
When the detection frame information corresponding to the human hand includes one piece of center point position information, determining the category corresponding to that center point position information as the positional relationship between the steering wheel and the human hand;

when the detection frame information corresponding to the human hand includes two pieces of center point position information and the categories corresponding to both pieces are that the driver's hands are separated from the steering wheel, determining that the positional relationship between the steering wheel and the human hand is that the driver's hands are separated from the steering wheel; and when at least one of the two categories is that the driver holds the steering wheel, determining that the positional relationship between the steering wheel and the human hand is that the driver holds the steering wheel.
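Reading a category off the two-channel classification feature map is a per-location argmax. The following sketch assumes a channel ordering that the disclosure does not fix.

```python
import torch

# Assumed channel order of the two-channel classification feature map.
HAND_CATEGORIES = ("driver_hands_separated_from_steering_wheel",
                   "driver_holds_steering_wheel")

def category_at_center(classification_map, row, col):
    """classification_map: [2, H, W] two-channel classification feature
    map, one channel per hand category; (row, col) is the feature
    position matching the center point position information."""
    values = classification_map[:, row, col]  # the two feature values
    channel = int(torch.argmax(values))       # channel of the maximum
    return HAND_CATEGORIES[channel]
```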
In some embodiments, the functions or modules contained in the apparatus provided by the embodiments of the present disclosure may be used to perform the methods described in the foregoing method embodiments; for specific implementations, refer to the descriptions of those method embodiments, which are not repeated here for brevity.
Based on the same technical concept, an embodiment of the disclosure also provides an electronic device. Referring to fig. 4, which is a schematic structural diagram of an electronic device according to an embodiment of the disclosure, the device includes a processor 401, a memory 402, and a bus 403. The memory 402 is configured to store execution instructions and includes a memory 4021 and an external memory 4022; the memory 4021, also referred to as internal memory, temporarily stores operation data for the processor 401 and data exchanged with external memory 4022 such as a hard disk, and the processor 401 exchanges data with the external memory 4022 through the memory 4021. When the electronic device 400 runs, the processor 401 and the memory 402 communicate through the bus 403, so that the processor 401 executes the following instructions:
acquiring an image to be detected of a driving position area in a vehicle cabin;
Detecting the image to be detected to obtain a target detection result, wherein the target detection result comprises a steering wheel detection result and a human hand detection result;
determining the driving behavior category of the driver according to the target detection result;
and when the driving behavior category of the driver is dangerous driving, sending out warning information.
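Taken together, these four instructions amount to a capture–detect–classify–warn loop. The sketch below wires them up with injected callables; every name in it is a hypothetical stand-in, not an API defined by this disclosure.

```python
# Hedged end-to-end sketch of the instruction sequence above. The four
# injected callables correspond to the acquisition, detection,
# determination, and warning steps; all names are illustrative.
def run_driver_behavior_detection(capture_image, detect_targets,
                                  classify_behavior, warn):
    image = capture_image()               # image of the driving position area
    result = detect_targets(image)        # steering wheel + hand detection results
    behavior = classify_behavior(result)  # driving behavior category
    if behavior == "dangerous_driving":
        warn("Dangerous, please hold the steering wheel")  # e.g. voice playback
    return behavior
```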
Furthermore, the embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the driver behavior detection method described in the above method embodiments.
The computer program product of the driver behavior detection method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing program code; the program code includes instructions for executing the steps of the driver behavior detection method described in the above method embodiments. For details, refer to the above method embodiments, which are not repeated here.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working procedures of the system and apparatus described above may refer to the corresponding procedures in the foregoing method embodiments and are not described here again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. The apparatus embodiments described above are merely illustrative: for example, the division into units is merely a logical functional division, and there may be other divisions in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The foregoing is merely a specific embodiment of the disclosure, but the protection scope of the disclosure is not limited thereto; any changes or substitutions that can readily occur to a person skilled in the art within the technical scope of the disclosure shall be covered by the protection scope of the disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (13)

1. A driver behavior detection method, characterized by comprising:
acquiring an image to be detected of a driving position area in a vehicle cabin;
Generating an intermediate feature map corresponding to the image to be detected based on the image to be detected;
Processing the intermediate feature map to generate a converted target channel feature map corresponding to the image to be detected; performing max pooling processing on the converted target channel feature map according to a preset pooling size and pooling stride to obtain a plurality of pooling values and a position index corresponding to each of the plurality of pooling values; generating target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values; and determining a target detection result according to the target detection frame information; wherein the converted target channel feature map is the channel feature map representing position among a plurality of channel feature maps of the image to be detected, the position index is used to identify the position of the pooling value in the converted target channel feature map, and the target detection result includes a steering wheel detection result and a human hand detection result;
determining the driving behavior category of the driver according to the target detection result;
when the driving behavior category of the driver is dangerous driving, sending out warning information;
When the steering wheel detection result includes a steering wheel and the human hand detection result includes a human hand, determining the driving behavior category of the driver according to the target detection result includes:
Performing convolution processing on the intermediate feature map at least once to generate a two-channel classification feature map corresponding to the intermediate feature map; wherein each channel feature map in the two-channel classification feature map corresponds to one hand category, the hand category being that the driver's hands are separated from the steering wheel or that the driver holds the steering wheel;

extracting, based on the center point position information indicated by the detection frame information corresponding to the human hand in the human hand detection result, the two feature values at the feature position matching the center point position information from the classification feature map; selecting the maximum feature value from the two feature values, and determining the category of the channel feature map corresponding to the maximum feature value in the classification feature map as the category corresponding to the center point position information;
and determining the driving behavior category of the driver according to the category corresponding to the center point position information.
2. The method according to claim 1, wherein determining the driving behavior category of the driver based on the positional relationship comprises:

determining that the driving behavior category of the driver is safe driving when the positional relationship indicates that the driver holds the steering wheel.

3. The method according to claim 1, wherein determining the driving behavior category of the driver based on the positional relationship comprises:

determining that the driving behavior category of the driver is dangerous driving when the positional relationship indicates that the driver's hands are separated from the steering wheel.

4. The method according to claim 1, wherein, when the steering wheel detection result includes a steering wheel and the human hand detection result includes no human hand, determining the driving behavior category of the driver according to the target detection result comprises:

determining that the driving behavior category of the driver is dangerous driving according to the target detection result.
5. The method according to any one of claims 1 to 4, wherein processing the intermediate feature map to generate a converted target channel feature map corresponding to the image to be detected includes:
Performing target convolution processing at least once on the intermediate feature map to generate detection feature maps of a plurality of channels corresponding to the intermediate feature map;

and performing feature value conversion processing, using an activation function, on each feature value of the target channel feature map that represents position among the detection feature maps of the plurality of channels, to generate the converted target channel feature map.
6. The method of claim 1, wherein generating the target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values comprises:

determining, when at least one of the plurality of pooling values is greater than a set pooling threshold, a target pooling value corresponding to the center point of the target detection frame from the plurality of pooling values based on the pooling threshold;

and generating the target detection frame information based on the position index corresponding to the target pooling value.
7. The method of claim 1, wherein generating the target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values comprises:

determining that the target detection frame information is empty when each of the plurality of pooling values is less than or equal to a set pooling threshold.
8. The method of claim 1, wherein determining a positional relationship between a steering wheel and a human hand based on the steering wheel detection result and the human hand detection result comprises:
Under the condition that the human hand detection result includes one human hand: if the detection frame corresponding to the human hand in the human hand detection result and the detection frame corresponding to the steering wheel in the steering wheel detection result have an overlapping area, determining that the positional relationship between the steering wheel and the human hand is that the driver holds the steering wheel; if the two detection frames have no overlapping area, determining that the positional relationship between the steering wheel and the human hand is that the driver's hands are separated from the steering wheel.
9. The method of claim 1, wherein determining a positional relationship between a steering wheel and a human hand based on the steering wheel detection result and the human hand detection result comprises:
under the condition that the human hand detection result includes two human hands: if neither of the detection frames corresponding to the two human hands in the human hand detection result overlaps the detection frame corresponding to the steering wheel in the steering wheel detection result, determining that the positional relationship between the steering wheel and the human hand is that the driver's hands are separated from the steering wheel; if at least one detection frame corresponding to a human hand has an overlapping area with the detection frame corresponding to the steering wheel, determining that the positional relationship between the steering wheel and the human hand is that the driver holds the steering wheel.
10. The method according to claim 1, wherein determining the positional relationship between the steering wheel and the human hand based on the category corresponding to each piece of center point position information indicated by the detection frame information corresponding to the human hand includes:

when the detection frame information corresponding to the human hand includes one piece of center point position information, determining the category corresponding to that center point position information as the positional relationship between the steering wheel and the human hand;

when the detection frame information corresponding to the human hand includes two pieces of center point position information and the categories corresponding to both pieces are that the driver's hands are separated from the steering wheel, determining that the positional relationship between the steering wheel and the human hand is that the driver's hands are separated from the steering wheel; and when at least one of the two categories is that the driver holds the steering wheel, determining that the positional relationship between the steering wheel and the human hand is that the driver holds the steering wheel.
11. A driver behavior detection apparatus, characterized by comprising:
the acquisition module is used for acquiring an image to be detected of a driving position area in the vehicle cabin;
The detection module is used for generating an intermediate feature map corresponding to the image to be detected based on the image to be detected, processing the intermediate feature map, and generating a converted target channel feature map corresponding to the image to be detected; performing max pooling processing on the converted target channel feature map according to a preset pooling size and pooling stride to obtain a plurality of pooling values and a position index corresponding to each of the plurality of pooling values; generating target detection frame information based on the plurality of pooling values and the position index corresponding to each of the plurality of pooling values; and determining a target detection result according to the target detection frame information; wherein the converted target channel feature map is the channel feature map representing position among a plurality of channel feature maps of the image to be detected, the position index is used to identify the position of the pooling value in the converted target channel feature map, and the target detection result includes a steering wheel detection result and a human hand detection result;
the determining module is used for determining the driving behavior category of the driver according to the target detection result;
The warning module is used for sending out warning information when the driving behavior category of the driver is dangerous driving;
When the steering wheel detection result includes a steering wheel and the hand detection result includes a human hand, the determining module is used for, when determining the driving behavior category of the driver according to the target detection result:

performing convolution processing on the intermediate feature map at least once to generate a two-channel classification feature map corresponding to the intermediate feature map; wherein each channel feature map in the two-channel classification feature map corresponds to one hand category, the hand category being that the driver's hands are separated from the steering wheel or that the driver holds the steering wheel;

extracting, based on the center point position information indicated by the detection frame information corresponding to the human hand in the human hand detection result, the two feature values at the feature position matching the center point position information from the classification feature map; selecting the maximum feature value from the two feature values, and determining the category of the channel feature map corresponding to the maximum feature value in the classification feature map as the category corresponding to the center point position information;
and determining the driving behavior category of the driver according to the category corresponding to the center point position information.
12. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine readable instructions executable by the processor, the processor and the memory in communication over the bus when the electronic device is running, the machine readable instructions when executed by the processor performing the steps of the driver behavior detection method according to any one of claims 1 to 10.
13. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of the driver behavior detection method according to any one of claims 1 to 10.
CN202010790208.3A 2020-08-07 2020-08-07 Driver behavior detection method and device, electronic equipment and storage medium Active CN111931639B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202010790208.3A CN111931639B (en) 2020-08-07 2020-08-07 Driver behavior detection method and device, electronic equipment and storage medium
KR1020227003906A KR20220032074A (en) 2020-08-07 2020-12-10 Driver behavior detection method, apparatus, electronic device, storage medium and program
JP2022523602A JP2023500218A (en) 2020-08-07 2020-12-10 DRIVER ACTION DETECTION METHOD, APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM AND PROGRAM
PCT/CN2020/135501 WO2022027894A1 (en) 2020-08-07 2020-12-10 Driver behavior detection method and apparatus, electronic device, storage medium and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010790208.3A CN111931639B (en) 2020-08-07 2020-08-07 Driver behavior detection method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111931639A CN111931639A (en) 2020-11-13
CN111931639B true CN111931639B (en) 2024-06-11

Family

ID=73307528

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010790208.3A Active CN111931639B (en) 2020-08-07 2020-08-07 Driver behavior detection method and device, electronic equipment and storage medium

Country Status (4)

Country Link
JP (1) JP2023500218A (en)
KR (1) KR20220032074A (en)
CN (1) CN111931639B (en)
WO (1) WO2022027894A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931639B (en) * 2020-08-07 2024-06-11 上海商汤临港智能科技有限公司 Driver behavior detection method and device, electronic equipment and storage medium
CN112528910B (en) * 2020-12-18 2023-04-07 上海高德威智能交通***有限公司 Hand-off steering wheel detection method and device, electronic equipment and storage medium
CN113486759B (en) * 2021-06-30 2023-04-28 上海商汤临港智能科技有限公司 Dangerous action recognition method and device, electronic equipment and storage medium
CN113780108A (en) * 2021-08-24 2021-12-10 中联重科建筑起重机械有限责任公司 Method, processor and device for identifying driver behavior of tower crane and tower crane
CN115171082B (en) * 2022-06-29 2024-01-19 北京百度网讯科技有限公司 Driving behavior detection method and device, electronic equipment and readable storage medium
CN115471826B (en) * 2022-08-23 2024-03-26 中国航空油料集团有限公司 Method and device for judging safe driving behavior of aviation fueller and safe operation and maintenance system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102263937A (en) * 2011-07-26 2011-11-30 华南理工大学 Driver's driving behavior monitoring device and monitoring method based on video detection
CN104276080A (en) * 2014-10-16 2015-01-14 北京航空航天大学 Bus driver hand-off-steering-wheel detection warning system and warning method
CN107766865A (en) * 2017-11-06 2018-03-06 北京旷视科技有限公司 Pond method, object detecting method, device, system and computer-readable medium
CN111439170A (en) * 2020-03-30 2020-07-24 上海商汤临港智能科技有限公司 Child state detection method and device, electronic equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5018926B2 (en) * 2010-04-19 2012-09-05 株式会社デンソー Driving assistance device and program
CN102289660B (en) * 2011-07-26 2013-07-03 华南理工大学 Method for detecting illegal driving behavior based on hand gesture tracking
CN108229307B (en) * 2017-11-22 2022-01-04 北京市商汤科技开发有限公司 Method, device and equipment for object detection
CN109086662B (en) * 2018-06-19 2021-06-15 浙江大华技术股份有限公司 Abnormal behavior detection method and device
CN109034111A (en) * 2018-08-17 2018-12-18 北京航空航天大学 A kind of driver's hand based on deep learning is from steering wheel detection method and system
CN111931639B (en) * 2020-08-07 2024-06-11 上海商汤临港智能科技有限公司 Driver behavior detection method and device, electronic equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102263937A (en) * 2011-07-26 2011-11-30 华南理工大学 Driver's driving behavior monitoring device and monitoring method based on video detection
CN104276080A (en) * 2014-10-16 2015-01-14 北京航空航天大学 Bus driver hand-off-steering-wheel detection warning system and warning method
CN107766865A (en) * 2017-11-06 2018-03-06 北京旷视科技有限公司 Pond method, object detecting method, device, system and computer-readable medium
CN111439170A (en) * 2020-03-30 2020-07-24 上海商汤临港智能科技有限公司 Child state detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2022027894A1 (en) 2022-02-10
JP2023500218A (en) 2023-01-05
CN111931639A (en) 2020-11-13
KR20220032074A (en) 2022-03-15

Similar Documents

Publication Publication Date Title
CN111931639B (en) Driver behavior detection method and device, electronic equipment and storage medium
CN108701229B (en) Driving behavior analysis method and driving behavior analysis device
CN109800633B (en) Non-motor vehicle traffic violation judgment method and device and electronic equipment
CN111931640B (en) Abnormal sitting posture identification method and device, electronic equipment and storage medium
CN110395260B (en) Vehicle, safe driving method and device
US8547214B2 (en) System for preventing handheld device use while operating a vehicle
JP6737906B2 (en) Control device, system and method for determining the perceptual load of a visual and dynamic driving scene
JP7288097B2 (en) Seat belt wearing detection method, device, electronic device, storage medium and program
JP2020042785A (en) Method, apparatus, device and storage medium for identifying passenger state in unmanned vehicle
CN112926544A (en) Driving state determination method, device, equipment and storage medium
CN110232310A (en) A kind of method for detecting fatigue driving neural network based and relevant device
CN114373189A (en) Behavior detection method and apparatus, terminal device and storage medium
CN106710027A (en) Configuration method and device of on-board equipment
CN110807394A (en) Emotion recognition method, test driving experience evaluation method, device, equipment and medium
CN110826433B (en) Emotion analysis data processing method, device and equipment for test driving user and storage medium
CN113051958A (en) Driver state detection method, system, device and medium based on deep learning
CN113706741B (en) Data recording method and system for automobile with driving assisting equipment
CN113283286B (en) Driver abnormal behavior detection method and device
CN115115531A (en) Image denoising method and device, vehicle and storage medium
CN111275008A (en) Method and device for detecting abnormality of target vehicle, storage medium, and electronic device
CN111611804A (en) Danger identification method and device, electronic equipment and storage medium
CN111797784B (en) Driving behavior monitoring method and device, electronic equipment and storage medium
CN117198065B (en) Intelligent speed limiter for automobile
CN112329657B (en) Method and related device for sensing upper body movement of driver
CN116625702A (en) Vehicle collision detection method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant