CN111222477B - Vision-based method and device for detecting departure of hands from steering wheel - Google Patents


Info

Publication number
CN111222477B
CN111222477B (application CN202010026699.4A)
Authority
CN
China
Prior art keywords
steering wheel
picture
network
driver
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010026699.4A
Other languages
Chinese (zh)
Other versions
CN111222477A (en)
Inventor
戚治舟
王汉超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Ruiwei Information Technology Co ltd
Original Assignee
Xiamen Ruiwei Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Ruiwei Information Technology Co ltd filed Critical Xiamen Ruiwei Information Technology Co ltd
Priority to CN202010026699.4A priority Critical patent/CN111222477B/en
Publication of CN111222477A publication Critical patent/CN111222477A/en
Application granted granted Critical
Publication of CN111222477B publication Critical patent/CN111222477B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/59Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a vision-based method for detecting whether the driver's hands have left the steering wheel, comprising the following steps: collecting sample data, annotating it, and performing network training and optimisation with the annotated data to obtain a model; converting the model into a model under ncnn; acquiring an infrared picture of the driver, preprocessing it, feeding it into the model, and parsing the model result to obtain the steering-wheel position; enlarging the steering-wheel region, selecting and cropping the set roi region, then preprocessing the cropped picture and feeding it into the model to judge whether the driver's hands have left the steering wheel; if neither of the driver's hands is on the steering wheel, an alarm is given; otherwise, no alarm is given. The invention also provides a corresponding device. The scheme effectively improves the detection rate of the model, reduces the network input size, and increases model speed.

Description

Vision-based method and device for detecting departure of hands from steering wheel
Technical Field
The invention relates to the field of computer technology, and in particular to a vision-based method and device for detecting whether the hands have left the steering wheel.
Background
When driving a vehicle, many factors interfere with safe driving, and drivers do not always observe traffic regulations and safe-operation rules; behaviours such as answering phone calls or smoking while driving endanger the safety of passengers. By judging when the driver's hands leave the steering wheel, the driver can be warned and such irregular behaviour corrected. At present there are three main approaches to detecting that a driver's hands have left the steering wheel:
(1) Based on the steering-wheel torque signal: the driver torque state is estimated from several electric power-steering signals, used to determine whether the driver is gripping the steering wheel, and compared against a high grip-torque threshold. This approach is fast, but its drawbacks are obvious: robustness is poor, the range of application is narrow, and only a preset empirical threshold can be used.
(2) Based on steering-wheel sensors measuring hand pressure or temperature: this relies on hardware sensors installed around the steering wheel, sensing through temperature or pressure whether both hands grip the wheel. The approach is fast, but its cost is higher, it is easily disturbed by external factors, and it is prone to false alarms.
(3) Based on machine vision: with the development of deep learning, computer vision has advanced rapidly through convolutional neural networks. The greatest strength of deep learning is that the features required by the target task are learned by the convolutional network, and in many fields its recognition rate can exceed that of the human eye. However, networks with excellent accuracy require considerable computing power, and the limited computing power of hardware devices makes many artificial-intelligence projects difficult to deploy.
Disclosure of Invention
The technical problem the invention aims to solve is to provide a vision-based method and device for detecting whether the hands have left the steering wheel, which effectively improve the detection rate of the model, reduce the network input size, and increase model speed.
In a first aspect, the present invention provides a method comprising:
step 1, collecting sample data, marking the sample data, and performing network training and optimization by using the marked sample data to obtain a model;
step 2, converting the model into a model under ncnn;
step 3, obtaining an infrared picture of the driver, preprocessing it, feeding it into the model, and parsing the model result to obtain the steering-wheel position; enlarging the steering-wheel region, selecting and cropping the set roi region, then preprocessing the cropped picture and feeding it into the model to judge whether the driver's hands have left the steering wheel; if neither of the driver's hands is on the steering wheel, an alarm is given; otherwise, no alarm is given.
Further, step 1 is more specifically: collecting infrared pictures of the driver through the in-vehicle monitoring camera, including pictures from a camera directly above the driver, pictures from a camera at the top of the door, and steering-wheel pictures crawled from the internet;
annotating the steering wheel in the collected infrared pictures to obtain its position coordinates; enlarging the steering-wheel region, selecting the set roi region, and cropping the picture; then annotating the cropped pictures, marking separately the coordinates of hands holding the steering wheel, the coordinates of hands not holding the steering wheel, and the category information;
the method comprises the steps of performing network training by using a caffe frame, converting marked infrared pictures into lmdb training data under caffe, selecting a MobileNet v2-yolov3 network by a steering wheel detection network and a two-hand leaving steering wheel identification network, setting learning rate and the number of training data each time by adopting an SGD (generalized name space) optimization learning method, performing data enhancement operation on pictures with different input sizes, performing training for set times by the steering wheel detection network and the two-hand leaving steering wheel identification network, stabilizing and converging network loss values, and performing pruning optimization on the steering wheel detection network to finally obtain a trained model.
Further, step 3 is more specifically:
step 31, after the vehicle is started, obtaining an infrared picture of the driver and preprocessing it; if no steering-wheel position exists yet, feeding the picture into the steering-wheel detection network, parsing the model result to obtain the steering-wheel position, and entering step 32; if a position already exists, going directly to step 32;
step 32, enlarging the steering-wheel region, selecting and cropping the set roi region, preprocessing the cropped picture, and feeding it into the hands-off-wheel recognition network to judge whether the driver's hands have left the steering wheel; if the recognition network continuously identifies, for the set time, that neither of the driver's hands is on the steering wheel, the steering-wheel detection network is called again: if the steering wheel is detected, an alarm is given; if it is not detected (the wheel is occluded), no alarm is needed; if within the set time the recognition network again continuously identifies that one or both of the driver's hands are on the steering wheel, no alarm is given.
In a second aspect, the present invention provides an apparatus comprising:
the training optimisation module, used to collect sample data, annotate it, and perform network training and optimisation with the annotated data to obtain a model;
the conversion module, which converts the model into a model under ncnn;
the detection module, used to obtain an infrared picture of the driver, preprocess it, feed it into the model, and parse the model result to obtain the steering-wheel position; to enlarge the steering-wheel region, select and crop the set roi region, then preprocess the cropped picture and feed it into the model to judge whether the driver's hands have left the steering wheel; if neither of the driver's hands is on the steering wheel, an alarm is given; otherwise, no alarm is given.
Further, the training optimisation module is more specifically for: collecting infrared pictures of the driver through the in-vehicle monitoring camera, including pictures from a camera directly above the driver, pictures from a camera at the top of the door, and steering-wheel pictures crawled from the internet;
annotating the steering wheel in the collected infrared pictures to obtain its position coordinates; enlarging the steering-wheel region, selecting the set roi region, and cropping the picture; then annotating the cropped pictures, marking separately the coordinates of hands holding the steering wheel, the coordinates of hands not holding the steering wheel, and the category information;
the method comprises the steps of performing network training by using a caffe frame, converting marked infrared pictures into lmdb training data under caffe, selecting a MobileNet v2-yolov3 network by a steering wheel detection network and a two-hand leaving steering wheel identification network, setting learning rate and the number of training data each time by adopting an SGD (generalized name space) optimization learning method, performing data enhancement operation on pictures with different input sizes, performing training for set times by the steering wheel detection network and the two-hand leaving steering wheel identification network, stabilizing and converging network loss values, and performing pruning optimization on the steering wheel detection network to finally obtain a trained model.
Further, the detection module is more specifically:
the position unit, which, after the vehicle is started, obtains an infrared picture of the driver and preprocesses it; if no steering-wheel position exists yet, it feeds the picture into the steering-wheel detection network, parses the model result to obtain the steering-wheel position, and enters the alarm unit; if a position already exists, it enters the alarm unit directly;
the alarm unit, which enlarges the steering-wheel region, selects and crops the set roi region, preprocesses the cropped picture, and feeds it into the hands-off-wheel recognition network to judge whether the driver's hands have left the steering wheel; if the recognition network continuously identifies, for the set time, that neither of the driver's hands is on the steering wheel, the steering-wheel detection network is called again: if the steering wheel is detected, an alarm is given; if it is not detected (the wheel is occluded), no alarm is needed; if within the set time the recognition network again continuously identifies that one or both of the driver's hands are on the steering wheel, no alarm is given.
One or more of the technical solutions provided in the embodiments of the invention have at least the following technical effects or advantages:
(1) The in-vehicle monitoring camera is used directly: whatever the camera angle, the detection roi region can be cropped out by the steering-wheel detection algorithm, avoiding awkward constraints on camera installation. The range of application is wide and the cost is low.
(2) The combined judgment of the two network models, the steering-wheel detection model and the hands-off-wheel recognition model, better solves the problems of false alarms and external interference. When a person or object occludes the camera, or the camera is poorly placed, no false alarm is raised. Using the steering-wheel detection model to crop out the roi region of interest before detecting whether the hands have left the wheel effectively improves the detection rate of the model, reduces the network input size, and increases model speed.
(3) The scheme combines the lightweight MobileNet-V2 backbone with the post-processing of the yolov3 network, which has very good detection performance. Although two detection networks must cooperate, speed is hardly affected: the steering wheel needs to be detected only a few times, so that network's time cost is negligible, while the hands-off-wheel recognition network has a small input and runs fast, reaching 40-60 ms on ARM. On this fast premise, the method detects better and is more robust than other algorithms. In this scheme 328904 pictures were collected in total, with 25592 pictures in the test set; steering-wheel detection accuracy reaches more than 99%, and the accuracy of hands-off-wheel recognition reaches more than 95%.
The foregoing is only an overview of the technical solution of the invention. In order that the technical means of the invention may be understood more clearly and implemented according to the description, and to make the above and other objects, features and advantages of the invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
The invention will be further described with reference to examples of embodiments with reference to the accompanying drawings.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic illustration of a sample annotation according to the present invention;
FIG. 3 is a flowchart and steps of an on-line model deployment of the present invention.
Detailed Description
The overall idea of the technical solution in the embodiments of the present application is as follows:
(1) Steering-wheel detection is used to locate the steering wheel, and the roi region is found from the wheel position. The innovation is that the steering wheel is located accurately, little hardware computing power is required, and detection is very fast.
(2) The obtained steering-wheel region is enlarged, the roi region of interest is selected, and the roi region is fed into the convolutional network, which identifies whether the driver's hands have left the steering wheel; the network simultaneously outputs the positions of the hands and judges the state of each hand (gripping the wheel or not).
(3) To overcome false alarms caused by a person or object occluding the steering wheel while recognising whether the driver is holding it, multiple frames over a continuous period are used to judge whether the driver holds the wheel, the steering-wheel detector is used to judge whether the wheel is occluded, and the recognition results of the multiple frames are combined, thoroughly eliminating false alarms.
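The multi-frame judgment in point (3) can be sketched as a sliding-window vote over per-frame results. This is an illustrative sketch only: the patent does not specify a window size or threshold, so the values below are assumptions.

```python
from collections import deque

def make_hands_off_vote(window=15, threshold=0.8):
    """Sliding-window majority vote over per-frame hands-off results.
    `window` and `threshold` are illustrative values, not taken from
    the patent, which only says multiple frames over a continuous
    period are combined."""
    history = deque(maxlen=window)

    def update(hands_off_this_frame):
        history.append(bool(hands_off_this_frame))
        if len(history) < window:
            return False          # not enough evidence yet
        # fire only when most recent frames agree that hands are off
        return sum(history) / window >= threshold

    return update
```

A single noisy frame (a hand briefly hidden by the wheel rim) then cannot trigger an alarm on its own, which is the point of the multi-frame design.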
The scheme has two parts. The first part is model training: data collection, data format conversion, network selection, model training, and network optimisation. The second part is online service deployment: model conversion, plus writing the preprocessing, network-parsing, and other code under the mobile-side framework.
1. Detailed steps and flow of deep learning model training (as shown in FIG. 1)
(1) Data collection: infrared pictures of the driver are collected through the in-vehicle monitoring camera, including pictures from a camera directly above the driver, pictures from a camera at the top of the door, and steering-wheel pictures crawled from the internet, covering various scenes (day, night, strong light, dim light, backlight, and other conditions).
(2) Annotation of sample data (as in FIG. 2): the steering wheel is annotated in the collected infrared picture data to obtain its position coordinates; since the steering wheel of a given vehicle does not move, it only needs to be annotated once. After the steering-wheel region is enlarged and the roi region of interest is selected, the cropped picture is annotated, marking separately the hand coordinates and category information for hands holding the steering wheel and hands not holding it.
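The "enlarge the steering-wheel region, then crop the roi" step used in both annotation and inference can be sketched as follows. The expansion factor is an assumption; the patent only says the region is enlarged by a set amount.

```python
def expand_roi(box, img_w, img_h, scale=1.4):
    """Expand a steering-wheel box (x, y, w, h) about its centre and
    clamp it to the image bounds. The 1.4x factor is an assumed value;
    returns crop corners (x0, y0, x1, y1)."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2        # box centre
    nw, nh = w * scale, h * scale        # enlarged size
    x0 = max(0, int(cx - nw / 2))
    y0 = max(0, int(cy - nh / 2))
    x1 = min(img_w, int(cx + nw / 2))
    y1 = min(img_h, int(cy + nh / 2))
    return x0, y0, x1, y1                # crop as img[y0:y1, x0:x1]
```

Cropping to this enlarged region keeps the hands just outside the wheel rim inside the network's field of view while still discarding most of the cabin, which is what lets the recognition network use a small input size.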
(3) Training and optimisation of the network: from the data collection and annotation above we finally obtain the training data. We use the caffe framework for network training, so the pictures and annotation data must be converted into lmdb training data under caffe. Next comes the network design. Because the algorithm must run on embedded devices (ARM), the lightweight MobileNet-V2 is chosen as the backbone. The MobileNet series was designed by Google specifically for mobile devices and greatly reduces the computation of the network, so choosing it as the backbone improves performance. For the post-processing we chose between ssd-style and yolov3-style post-processing; experiments showed that yolov3 post-processing detects small targets better, so the MobileNet-V2-yolov3 network was finally selected for both steering-wheel detection and hands-off-wheel recognition. Once the network and the lmdb training data are ready, training begins. The SGD optimisation method is used, the learning rate is set to 0.001, and the batch_size of the network training is 128 each time; data-enhancement operations such as random scaling are applied to pictures of different input sizes to improve the robustness of the network. After 150,000 iterations the network loss value is stable and converged; the steering-wheel detection network is then pruned, finally yielding the trained model. After training, testing on a data set of 30,000 pictures shows that the accuracy of the steering-wheel detection network exceeds 0.99 and that of the hands-off-wheel network exceeds 0.95.
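For orientation, the stated training settings (SGD, learning rate 0.001, 150,000 iterations) would map onto a Caffe solver file roughly as below. This is a hypothetical sketch, not the patent's actual file: the net file name, step schedule, momentum, and weight decay are assumptions, and the batch_size of 128 lives in the data layer of the net prototxt rather than in the solver.

```prototxt
# Hypothetical solver.prototxt consistent with the settings in the text.
net: "mobilenetv2_yolov3_train.prototxt"   # assumed file name
type: "SGD"
base_lr: 0.001
lr_policy: "multistep"                     # assumed schedule
stepvalue: 80000
stepvalue: 120000
gamma: 0.1
momentum: 0.9                              # assumed
weight_decay: 0.0005                       # assumed
max_iter: 150000
snapshot: 10000
snapshot_prefix: "models/wheel_det"
```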
Because the detection area of the hands-off-wheel network is the roi region around the steering wheel, its network input is small and its time cost low, reaching 40-60 ms on ARM, which meets the requirements well. The steering-wheel detection network, however, must detect over the full image; its input is large and its time cost reaches 400 ms. It is therefore pruned: pruning mainly reduces the outputs of some network layers and removes redundant features extracted by the network. From the target detection size it was inferred that one up-sampling layer contributes little in this network, so it was deleted, cutting the network's time cost from 400 ms to 100 ms while accuracy still exceeds 0.99.
2. Steps and flow of online model deployment (as in fig. 3):
(1) Model conversion: since the frame used by us is caffe during training, the frame is really deployed on the mobile terminal and needs to be transplanted to the mobile terminal frame. The domestic mobile terminal framework is better provided with a forward computation framework ncnn of the neural network with vacation and a deep neural network reasoning engine mnn of the Ali. Since ncnn uses a relatively large number, the model of caffe is converted into a model under ncnn.
(2) Writing the preprocessing, network-parsing, and other code: an infrared picture of the driver is obtained through the camera, preprocessed, and fed into the steering-wheel detection network; the model result is parsed to obtain the steering-wheel position (the model result is a relative value, and the real coordinates of the steering wheel on the picture are restored according to the picture size). Because the steering-wheel coordinates of a given vehicle do not change, the detection network only needs to be called once. The steering-wheel region is then enlarged, the roi region of interest selected and cropped, and the cropped picture preprocessed and fed into the hands-off-wheel network. If the network continuously identifies that neither of the driver's hands is on the steering wheel, the steering-wheel detection network is called again; if the steering wheel is detected, an alarm is issued, prompting the driver to put both hands on the wheel and drive safely.
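The coordinate restoration mentioned above (relative model output to real picture coordinates) can be sketched as below. The (cx, cy, w, h) output layout is an assumption based on the yolov3-style post-processing named earlier; ncnn itself simply returns whatever the model's final layer emits.

```python
def to_pixels(det, img_w, img_h):
    """Convert a detection with normalised centre/size (cx, cy, w, h),
    each in [0, 1], into an absolute pixel box (x, y, w, h).
    The normalised layout is an assumed convention, not ncnn API."""
    cx, cy, w, h = det
    bw, bh = w * img_w, h * img_h            # scale size to pixels
    x = int(cx * img_w - bw / 2)             # centre -> top-left corner
    y = int(cy * img_h - bh / 2)
    return x, y, int(bw), int(bh)
```

The restored box is what gets cached for the vehicle, so the full-image detection network never needs to run again in normal operation.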
Example 1
The embodiment provides a method comprising:
step 1, collecting infrared pictures of the driver through the in-vehicle monitoring camera, including pictures from a camera directly above the driver, pictures from a camera at the top of the door, and steering-wheel pictures crawled from the internet;
annotating the steering wheel in the collected infrared pictures to obtain its position coordinates; enlarging the steering-wheel region, selecting the set roi region, and cropping the picture; then annotating the cropped pictures, marking separately the coordinates of hands holding the steering wheel, the coordinates of hands not holding the steering wheel, and the category information;
the method comprises the steps of performing network training by using a caffe frame, converting marked infrared pictures into lmdb training data under caffe, selecting a MobileNet v2-yolov3 network by a steering wheel detection network and a two-hand leaving steering wheel identification network, setting learning rate and the number of training data each time by adopting an SGD (generalized name space) optimization learning method, performing data enhancement operation on pictures with different input sizes, performing training for set times by the steering wheel detection network and the two-hand leaving steering wheel identification network, stabilizing and converging network loss values, and performing pruning optimization on the steering wheel detection network to finally obtain a trained model;
step 2, converting the model into a model under ncnn;
step 3, obtaining an infrared picture of the driver, preprocessing it, feeding it into the model, and parsing the model result to obtain the steering-wheel position; enlarging the steering-wheel region, selecting and cropping the set roi region, then preprocessing the cropped picture and feeding it into the model to judge whether the driver's hands have left the steering wheel; if neither of the driver's hands is on the steering wheel, an alarm is given; otherwise, no alarm is given.
Step 3 is more specifically:
step 31, after the vehicle is started, obtaining an infrared picture of the driver and preprocessing it; if no steering-wheel position exists yet, feeding the picture into the steering-wheel detection network, parsing the model result to obtain the steering-wheel position, and entering step 32; if a position already exists, going directly to step 32;
step 32, enlarging the steering-wheel region, selecting and cropping the set roi region, preprocessing the cropped picture, and feeding it into the hands-off-wheel recognition network to judge whether the driver's hands have left the steering wheel; if the recognition network continuously identifies, for the set time, that neither of the driver's hands is on the steering wheel, the steering-wheel detection network is called again: if the steering wheel is detected, an alarm is given; if it is not detected (the wheel is occluded), no alarm is needed; if within the set time the recognition network again continuously identifies that one or both of the driver's hands are on the steering wheel, no alarm is given.
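The two-stage flow of steps 31 and 32 can be sketched as a simple loop. This is an illustrative sketch only: `grab_ir_frame`, `detect_wheel`, `hands_on_wheel`, and `roi_from_wheel` are hypothetical placeholders standing in for the camera driver, the two ncnn networks, and the roi crop, and the frame count standing in for "the set time" is an assumed value.

```python
HANDS_OFF_LIMIT = 30  # frames of continuous "hands off" before re-checking (assumed)

def monitor(grab_ir_frame, detect_wheel, hands_on_wheel, roi_from_wheel):
    """Yield "alarm" events from an infrared frame stream; stops on None."""
    wheel_box = None          # cached: the wheel of a given car does not move
    off_count = 0
    while True:
        frame = grab_ir_frame()
        if frame is None:
            break
        if wheel_box is None:                   # step 31: locate the wheel once
            wheel_box = detect_wheel(frame)
            if wheel_box is None:
                continue
        roi = roi_from_wheel(frame, wheel_box)  # step 32: crop the set roi
        if hands_on_wheel(roi):
            off_count = 0                       # a hand returned: no alarm
        else:
            off_count += 1
            if off_count >= HANDS_OFF_LIMIT:
                # re-run wheel detection: alarm only if the wheel is visible,
                # i.e. "hands off" was not just an occluded camera
                if detect_wheel(frame) is not None:
                    yield "alarm"
                off_count = 0
```

The re-detection before alarming is what distinguishes genuine hands-off driving from a blocked or badly placed camera.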
Based on the same inventive concept, the present application also provides a device corresponding to the method of the first embodiment; see the second embodiment for details.
Example two
In this embodiment, there is provided an apparatus including:
the training optimisation module, used to collect infrared pictures of the driver through the in-vehicle monitoring camera, including pictures from a camera directly above the driver, pictures from a camera at the top of the door, and steering-wheel pictures crawled from the internet;
to annotate the steering wheel in the collected infrared pictures and obtain its position coordinates; to enlarge the steering-wheel region, select the set roi region, and crop the picture; then to annotate the cropped pictures, marking separately the coordinates of hands holding the steering wheel, the coordinates of hands not holding the steering wheel, and the category information;
the method comprises the steps of performing network training by using a caffe frame, converting marked infrared pictures into lmdb training data under caffe, selecting a MobileNet v2-yolov3 network by a steering wheel detection network and a two-hand leaving steering wheel identification network, setting learning rate and the number of training data each time by adopting an SGD (generalized name space) optimization learning method, performing data enhancement operation on pictures with different input sizes, performing training for set times by the steering wheel detection network and the two-hand leaving steering wheel identification network, stabilizing and converging network loss values, and performing pruning optimization on the steering wheel detection network to finally obtain a trained model;
the conversion module, which converts the model into a model under ncnn;
the detection module, used to obtain an infrared picture of the driver, preprocess it, feed it into the model, and parse the model result to obtain the steering-wheel position; to enlarge the steering-wheel region, select and crop the set roi region, then preprocess the cropped picture and feed it into the model to judge whether the driver's hands have left the steering wheel; if neither of the driver's hands is on the steering wheel, an alarm is given; otherwise, no alarm is given.
The detection module is more specifically:
the position unit, which, after the vehicle is started, obtains an infrared picture of the driver and preprocesses it; if no steering-wheel position exists yet, it feeds the picture into the steering-wheel detection network, parses the model result to obtain the steering-wheel position, and enters the alarm unit; if a position already exists, it enters the alarm unit directly;
the alarm unit, which enlarges the steering-wheel region, selects and crops the set roi region, preprocesses the cropped picture, and feeds it into the hands-off-wheel recognition network to judge whether the driver's hands have left the steering wheel; if the recognition network continuously identifies, for the set time, that neither of the driver's hands is on the steering wheel, the steering-wheel detection network is called again: if the steering wheel is detected, an alarm is given; if it is not detected (the wheel is occluded), no alarm is needed; if within the set time the recognition network again continuously identifies that one or both of the driver's hands are on the steering wheel, no alarm is given.
Since the device described in the second embodiment of the present invention implements the method described in the first embodiment, a person skilled in the art can understand its specific structure and variations from the description of that method, so a detailed description is omitted here. All devices used in the method of the first embodiment fall within the intended scope of protection of the present invention.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that the embodiments described are illustrative only and are not intended to limit the scope of the invention; equivalent modifications and variations made in light of the spirit of the invention are covered by the claims of the present invention.

Claims (2)

1. A vision-based method for detecting that both hands have left the steering wheel, characterized by comprising the following steps:
step 1, collecting sample data, labeling the sample data, and performing network training and optimization with the labeled sample data to obtain a model;
step 2, converting the model into an ncnn model;
step 3, acquiring an infrared picture of the driver, preprocessing the picture and inputting it into the model, parsing the model output to obtain the steering wheel position, enlarging the steering wheel region to select a set ROI region, cropping out the ROI region, preprocessing the cropped picture and inputting it into the model to judge whether both of the driver's hands have left the steering wheel; if the driver has no hand on the steering wheel, raising an alarm; otherwise, raising no alarm;
wherein step 1 is further specifically: collecting infrared pictures of drivers through an in-vehicle monitoring camera, including pictures from a camera directly above the driver, pictures from a camera above the vehicle door, and steering wheel pictures crawled from the Internet;
labeling the steering wheel in the collected infrared pictures to obtain the position coordinates of the steering wheel; enlarging the steering wheel region to select a set ROI region and cropping the picture; then labeling the cropped pictures, marking respectively the hand coordinates of the driver when holding the steering wheel, the hand coordinates when not holding the steering wheel, and the category information;
performing network training with the Caffe framework: converting the labeled infrared pictures into LMDB training data under Caffe; adopting a MobileNetV2-YOLOv3 architecture for both the steering wheel detection network and the hands-off-wheel recognition network; using the SGD (stochastic gradient descent) optimization method, setting the learning rate and the batch size; applying data enhancement to pictures of different input sizes; training both networks for a set number of iterations until the network loss value stabilizes and converges; and pruning and optimizing the steering wheel detection network to finally obtain the trained model;
wherein step 3 is further specifically:
step 31, after the vehicle is started, acquiring an infrared picture of the driver and preprocessing it; if the steering wheel position is not yet available, inputting the picture into the steering wheel detection network, parsing the model output to obtain the steering wheel position, and entering step 32; if the steering wheel position is already available, entering step 32 directly;
step 32, enlarging the steering wheel region to select a set ROI region, cropping out the ROI region, preprocessing the cropped picture and inputting it into the hands-off-wheel recognition network to judge whether both of the driver's hands have left the steering wheel; if the hands-off-wheel recognition network continuously recognizes within a set time that the driver has no hand on the steering wheel, invoking the steering wheel detection network again: if the steering wheel is detected, raising an alarm; if the steering wheel is not detected, raising no alarm; if within the set time the hands-off-wheel recognition network again continuously recognizes that one or both of the driver's hands are on the steering wheel, raising no alarm.
2. A vision-based device for detecting that both hands have left the steering wheel, characterized by comprising:
a training optimization module for collecting sample data, labeling the sample data, and performing network training and optimization with the labeled sample data to obtain a model;
a conversion module for converting the model into an ncnn model;
a detection module for acquiring an infrared picture of the driver, preprocessing the picture and inputting it into the model, parsing the model output to obtain the steering wheel position, enlarging the steering wheel region to select a set ROI region, cropping out the ROI region, preprocessing the cropped picture and inputting it into the model to judge whether both of the driver's hands have left the steering wheel; if the driver has no hand on the steering wheel, raising an alarm; otherwise, raising no alarm;
wherein the training optimization module is further specifically configured to: collect infrared pictures of drivers through an in-vehicle monitoring camera, including pictures from a camera directly above the driver, pictures from a camera above the vehicle door, and steering wheel pictures crawled from the Internet;
label the steering wheel in the collected infrared pictures to obtain the position coordinates of the steering wheel; enlarge the steering wheel region to select a set ROI region and crop the picture; then label the cropped pictures, marking respectively the hand coordinates of the driver when holding the steering wheel, the hand coordinates when not holding the steering wheel, and the category information;
perform network training with the Caffe framework: convert the labeled infrared pictures into LMDB training data under Caffe; adopt a MobileNetV2-YOLOv3 architecture for both the steering wheel detection network and the hands-off-wheel recognition network; use the SGD (stochastic gradient descent) optimization method, setting the learning rate and the batch size; apply data enhancement to pictures of different input sizes; train both networks for a set number of iterations until the network loss value stabilizes and converges; and prune and optimize the steering wheel detection network to finally obtain the trained model;
wherein the detection module further specifically comprises:
a position unit which, after the vehicle is started, acquires an infrared picture of the driver and preprocesses it; if the steering wheel position is not yet available, the picture is input into the steering wheel detection network and the model output is parsed to obtain the steering wheel position, after which control passes to the alarm unit; if the steering wheel position is already available, control passes to the alarm unit directly;
an alarm unit which enlarges the steering wheel region to select a set ROI region, crops out the ROI region, preprocesses the cropped picture and inputs it into the hands-off-wheel recognition network to judge whether both of the driver's hands have left the steering wheel; if the hands-off-wheel recognition network continuously recognizes within a set time that the driver has no hand on the steering wheel, the steering wheel detection network is invoked again: if the steering wheel is detected, an alarm is raised; if the steering wheel is not detected, no alarm is needed; if within the set time the hands-off-wheel recognition network again continuously recognizes that one or both of the driver's hands are on the steering wheel, no alarm is raised.
CN202010026699.4A 2020-01-10 2020-01-10 Vision-based method and device for detecting departure of hands from steering wheel Active CN111222477B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010026699.4A CN111222477B (en) 2020-01-10 2020-01-10 Vision-based method and device for detecting departure of hands from steering wheel


Publications (2)

Publication Number Publication Date
CN111222477A CN111222477A (en) 2020-06-02
CN111222477B true CN111222477B (en) 2023-05-30

Family

ID=70828361

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010026699.4A Active CN111222477B (en) 2020-01-10 2020-01-10 Vision-based method and device for detecting departure of hands from steering wheel

Country Status (1)

Country Link
CN (1) CN111222477B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347891B * 2020-10-30 2022-02-22 Nanjing Youjia Technology Co., Ltd. Method for detecting drinking water state in cabin based on vision
CN112580627A * 2020-12-16 2021-03-30 Institute of Software, Chinese Academy of Sciences YOLOv3 target detection method based on domestic intelligent chip K210 and electronic device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013013487A1 * 2011-07-26 2013-01-31 South China University of Technology Device and method for monitoring driving behaviors of driver based on video detection
CN107679539A * 2017-09-18 2018-02-09 Zhejiang University A single convolutional neural network method integrating local receptive-field information and global information
CN110084803A * 2019-04-29 2019-08-02 Nanjing Xingcheng Intelligent Technology Co., Ltd. Fundus image quality evaluation method based on the human visual system
CN110135398A * 2019-05-28 2019-08-16 Xiamen Ruiwei Information Technology Co., Ltd. Computer-vision-based method for detecting both hands off the steering wheel
CN110222596A * 2019-05-20 2019-09-10 Zhejiang Leapmotor Technology Co., Ltd. A vision-based anti-cheating method for driving behavior analysis
CN110633701A * 2019-10-23 2019-12-31 Dream Innovation Technology (Shenzhen) Co., Ltd. Driver call detection method and system based on computer vision technology




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant