WO2022247527A1 - Method for determining a driver's head movement, storage medium, and electronic device - Google Patents

Method for determining a driver's head movement, storage medium, and electronic device

Info

Publication number
WO2022247527A1
WO2022247527A1 (PCT/CN2022/087552)
Authority
WO
WIPO (PCT)
Prior art keywords
driver
head
head posture
network
texture map
Prior art date
Application number
PCT/CN2022/087552
Other languages
English (en)
French (fr)
Inventor
叶剑
张铁监
汪洋
Original Assignee
多伦科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 多伦科技股份有限公司 filed Critical 多伦科技股份有限公司
Publication of WO2022247527A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the present application relates to the technical field of data processing and, in particular, to a method for determining a driver's head movement, a storage medium, and an electronic device.
  • in recent years, driver behavior monitoring and early-warning technology has become one of the research focuses in the field of intelligent transportation. It is not only standard equipment for assisted safe driving but also a research focus in the field of motor vehicle driving tests. Every year, a large number of students complete professional driving-skills training and ultimately pass the examination to obtain a driver's license.
  • the motor vehicle driver skills test at the present stage combines computer judgment with the examiner's manual judgment. At present, automatic computer scoring has been implemented for only some of the test items; other items still require manual evaluation. For example, while a driving-test student gets into the car, the examiner must manually judge whether the student has checked the left and right rearview mirrors. Because manual judgment is involved, unfair and unjust behavior can arise in the motor vehicle driving-test industry.
  • Embodiments of the present application provide a method for determining a driver's head movement, a storage medium, and an electronic device, so as to at least solve the problem in the related art that the head posture of a motor vehicle driver cannot be accurately recognized.
  • a method for determining the driver's head movement is provided, including: preprocessing the image data containing the driver's head posture to obtain continuous frame images; extracting the movement features of the driver's head posture from the continuous frame images; determining the yaw angle of the driver's head posture according to the movement features; and determining the driver's head movement according to the magnitude of the yaw angle.
  • before preprocessing the image data containing the driver's head posture, the method further includes: collecting, in real time through a camera, the image data containing the driver's head posture; and uploading the image data to a queue to be processed.
  • after the continuous frame images are obtained, the method further includes: normalizing the continuous frame images, and using a target detection algorithm to extract a region of interest from the head region in the continuous frame images.
  • extracting the movement features of the driver's head posture from the continuous frame images includes: extracting the movement features of the driver's head posture from the region of interest.
  • determining the yaw angle of the driver's head posture according to the movement features includes: scaling the image containing the region of interest to a preset size; inputting the image of the preset size into a head posture extraction network, where the head posture extraction network includes a backbone network and an auxiliary network; extracting the movement features of the driver's head posture through the backbone network to obtain a face key point feature vector; and calculating the yaw angle of the driver's head posture through the auxiliary network according to the face key point feature vector.
  • the training method of the head posture extraction network includes: establishing a two-dimensional UV position map and its corresponding UV texture maps according to the face key point feature vector, where the UV texture maps include a face UV texture map and a mask UV texture map; multiplying the face UV texture map by the mask UV texture map to obtain a target UV texture map; remapping the target UV texture map to obtain a face image with a mask; and training the head posture extraction network using the face image with a mask.
  • the training method of the head posture extraction network further includes: calculating the loss value of the head posture extraction network according to the features output by the backbone network and the auxiliary network, combined with the labeled ground-truth values of the sample data; and optimizing the training parameters of the head posture extraction network according to the loss value.
  • determining the driver's head movement according to the yaw angle includes: when the yaw angle is greater than or equal to a preset threshold, determining that the driver's head performs a turning movement, where the turning movement includes at least one of the following: turning left, turning right, turning down, and turning up.
  • a computer-readable storage medium is also provided, in which a computer program is stored, where the computer program is configured to execute, when run, the steps in any one of the above method embodiments.
  • an electronic device is also provided, including a memory and a processor, where a computer program is stored in the memory and the processor is configured to run the computer program to perform the steps in any one of the above method embodiments.
  • the image data containing the driver's head posture is preprocessed to obtain continuous frame images; the movement features of the driver's head posture are extracted from the continuous frame images; the yaw angle of the driver's head posture is determined from the movement features through a neural network model; and the driver's head movement is determined according to the magnitude of the yaw angle.
  • the driver's head movement can thus be machine-recognized by a computer, and combined with the deep learning and training of the neural network model, the recognition process can be made more accurate.
  • FIG. 1 is a flow chart of an optional method for determining a driver's head movement according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of an optional training method for moving features of a driver's head image according to an embodiment of the present application.
  • FIG. 1 is a flowchart of an optional method for determining a driver's head movement according to an embodiment of the present application. As shown in Fig. 1, the method includes:
  • Step S102: preprocessing the image data containing the driver's head posture to obtain continuous frame images;
  • Step S104: extracting the movement features of the driver's head posture from the continuous frame images;
  • Step S106: determining the yaw angle of the driver's head posture according to the movement features;
  • Step S108: determining the driver's head movement according to the magnitude of the yaw angle.
  • the driver's head movement can be machine-recognized by a computer, and combined with the deep learning and training of the neural network model, the recognition process can be made more accurate, overcoming the shortcomings of manual judgment in the driving-test process.
  • before preprocessing the image data containing the driver's head posture, the method further includes: collecting, in real time through the camera, the image data containing the driver's head posture; and uploading the image data to the queue to be processed.
  • the above camera can be a video RGB camera installed in the vehicle, which mainly collects head movement images of the driver in the driver's seat in real time and transmits the collected image data to the queue.
  • after the continuous frame images are obtained, the method further includes: normalizing the continuous frame images, and using a target detection algorithm to extract the region of interest from the head region in the continuous frame images.
  • extracting the movement features of the driver's head posture from the continuous frame images includes: extracting the movement features of the driver's head posture from the region of interest.
  • determining the yaw angle of the driver's head posture according to the movement features includes: scaling the image containing the region of interest to a preset size; and inputting the image of the preset size into the head posture extraction network, where
  • the head posture extraction network includes a backbone network and an auxiliary network; the movement features of the driver's head posture are extracted through the backbone network to obtain the face key point feature vector, and the yaw angle of the head posture is calculated through the auxiliary network according to that vector.
  • the preset size involved in the embodiments of the present application can be any size set according to practical requirements; a fixed size of 112*122 is used as an example in this embodiment, and the preset size can also be set to 114*144 or any other size, which is not limited in the embodiments of the present application.
  • the head posture extraction network is specifically: two backbone networks are set up, namely backbone network 1 (backbone1) and backbone network 2 (backbone2).
  • the backbone network 1 uses the first 6 layers of resnet18 to extract image features;
  • the auxiliary network Auxiliary uses multi-layer Conv+BN+Act operator fusion to calculate the yaw angle of the head posture.
  • the backbone network backbone2, connected to the backbone network backbone1, adopts the last two layers of resnet18 and is used to extract the face key point feature vector.
  • the feature vector of the key points of the face is a one-dimensional 1*136 vector.
  • using the 18-layer convolutional neural network structure resnet18 keeps the amount of computation relatively small, making it convenient to quickly calculate the yaw angle of the driver's head posture.
  • the face key point feature vector may also be a two-dimensional or multi-dimensional vector, which is not limited in this embodiment of the present application.
  • from the one-dimensional 1*136 face key point feature vector, two-dimensional 2*68 coordinates are obtained.
  • the intrinsic parameters of the video RGB camera are known, and (U, V, W) represents the position of a three-dimensional point in the world coordinate system.
  • R and t represent the rotation matrix and translation vector of the world coordinate system relative to the camera coordinate system, respectively, and the coordinates (X, Y, Z) of the point in the camera coordinate system are calculated by formula (1);
  • f_x and f_y are the focal lengths in the x and y directions, respectively; c_x and c_y are the optical centers; s is a scaling factor. The direct linear transformation method is used to solve for the yaw angle of the driver's head posture.
  • the training method of the head posture extraction network includes: establishing a two-dimensional UV position map and its corresponding UV texture maps according to the face key point feature vector, where the UV texture maps include a face UV texture map and a mask UV texture map; multiplying the face UV texture map by the mask UV texture map to obtain a target UV texture map; remapping the target UV texture map to obtain a face image with a mask; and training the head posture extraction network using the face image with a mask.
  • the training method of the head posture extraction network further includes: calculating the loss value of the head posture extraction network according to the features output by the backbone network and the auxiliary network, combined with the labeled ground-truth values of the sample data; and optimizing the training parameters of the head posture extraction network according to the loss value.
  • the head pose extraction network may include a backbone network and an auxiliary network.
  • the head pose extraction network is a convolutional neural network structure trained using sample data.
  • the sample data includes sample images and yaw angles corresponding to head poses in the sample images.
  • the texture map of the mask can be used for training, so that the head pose extraction network can recognize the head movement of the driver wearing a mask.
  • the 68 face key point information can be input into the 3D face reconstruction network PRNet to obtain the UV position map and the corresponding UV texture map; the face UV texture map is multiplied by the mask UV texture map to obtain a new UV texture map; the new UV texture map is remapped with the face UV texture map to obtain a face image with a mask, and the resulting masked face image is added to the training of the head posture extraction network.
  • Fig. 2 is a schematic diagram of the training principle for the movement features of the driver's head image according to an embodiment of the present application.
  • during training, in the image preprocessing stage, rotation, affine, color-channel, and color-space transformations are added while keeping the face-region coordinates and the key-point coordinates in the image consistently transformed.
  • to address data imbalance, samples deviating strongly from the ground truth are multiplied by a penalty weight (weight) according to the sample distribution;
  • finally, the loss value is computed (compute loss) from the features of the backbone network backbone2 and the auxiliary network Auxiliary combined with the labeled ground-truth key points of the samples.
  • determining the driver's head movement according to the yaw angle includes: when the yaw angle is greater than or equal to a preset threshold, determining that the driver's head turns, where the turning movement includes at least one of the following: turning left, turning right, turning down, turning up.
  • turning left applies to judging whether the driver has checked the left rearview mirror, and turning right to whether the driver has checked the right rearview mirror; the yaw angle threshold in these two cases is generally set at 20-30 degrees, according to the actual environment of the driving test. Judging whether the driver has turned the head left or right is specifically: within the specified time, if the actual value of the yaw angle is greater than or equal to the set yaw angle threshold, the driver is determined to have turned the head; otherwise, the driver is determined not to have turned the head.
  • in an example, the preset yaw angle threshold is 25 degrees, where a positive angle indicates turning right and a negative angle indicates turning left; the yaw angle of the driver's head posture is acquired in real time, and if the yaw angle is -25 degrees and the test vehicle starts moving within 5 s, the driver is determined to have checked the left rearview mirror before driving.
  • the method of the embodiments of the present application collects visual image data of the driver during the driving test, annotates 68 key points on the face data, performs deep learning on the visual image data of the driver's head, extracts a 1*136 feature vector of the driver's head posture, calculates the yaw angle from that feature vector, and then judges the driver's head movement through continuous frame images.
  • through the network training process, the data is augmented, and the driver's head movements can be accurately recognized during driving, so the method can be effectively applied to the various items of the driving test and used as a basis for judging.
  • masked-face data is added for training in the method of the embodiments of the present application, which further improves detection and recognition accuracy and is easy to implement.
  • the storage medium may include: a flash disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, and the like.
  • if the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they can be stored in the above computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product, and the computer software product is stored in a storage medium.
  • several instructions are included to cause one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the disclosed client can be implemented in other ways.
  • the device embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other division methods in actual implementation. For example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of units or modules may be electrical or in other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Psychiatry (AREA)
  • Multimedia (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

A method for determining a driver's head movement, a storage medium, and an electronic device. The method includes: preprocessing image data containing the driver's head posture to obtain continuous frame images (S102); extracting movement features of the driver's head posture from the continuous frame images (S104); determining the yaw angle of the driver's head posture according to the movement features (S106); and determining the driver's head movement according to the magnitude of the yaw angle (S108).

Description

Method for Determining a Driver's Head Movement, Storage Medium, and Electronic Device

Technical Field

The present application relates to the technical field of data processing and, in particular, to a method for determining a driver's head movement, a storage medium, and an electronic device.

Background Art

In recent years, driver behavior monitoring and early-warning technology has become one of the research focuses in the field of intelligent transportation. It is not only standard equipment for assisted safe driving but also a research focus in the field of motor vehicle driving tests. Every year, a large number of students complete professional driving-skills training and ultimately pass the examination to obtain a driver's license.

The motor vehicle driver skills test at the present stage combines computer judgment with the examiner's manual judgment. At present, automatic computer scoring has been implemented for only some of the test items; other items still require manual evaluation. For example, while a driving-test student gets into the car for the test, the examiner must manually judge whether the student has checked the left and right rearview mirrors. Because manual judgment is involved, unfair and unjust behavior can arise in the motor vehicle driving-test industry.

There is as yet no effective solution to the problem in the related art that the head posture of a motor vehicle driver cannot be accurately recognized.
Summary of the Invention

The embodiments of the present application provide a method for determining a driver's head movement, a storage medium, and an electronic device, so as to at least solve the problem in the related art that the head posture of a motor vehicle driver cannot be accurately recognized.

In an embodiment of the present application, a method for determining a driver's head movement is provided, including: preprocessing image data containing the driver's head posture to obtain continuous frame images; extracting movement features of the driver's head posture from the continuous frame images; determining the yaw angle of the driver's head posture according to the movement features; and determining the driver's head movement according to the magnitude of the yaw angle.

In an embodiment, before preprocessing the image data containing the driver's head posture, the method further includes: collecting, in real time through a camera, the image data containing the driver's head posture; and uploading the image data to a queue to be processed.

In an embodiment, after preprocessing the image data containing the driver's head posture to obtain continuous frame images, the method further includes: normalizing the continuous frame images, and extracting a region of interest from the head region in the continuous frame images using a target detection algorithm.

In an embodiment, extracting the movement features of the driver's head posture from the continuous frame images includes: extracting the movement features of the driver's head posture from the region of interest.

In an embodiment, determining the yaw angle of the driver's head posture according to the movement features includes: scaling the image containing the region of interest to a preset size; inputting the image of the preset size into a head posture extraction network, where the head posture extraction network includes a backbone network and an auxiliary network; extracting the movement features of the driver's head posture through the backbone network to obtain a face key point feature vector; and calculating the yaw angle of the driver's head posture through the auxiliary network according to the face key point feature vector.

In an embodiment, the training method of the head posture extraction network includes: establishing a two-dimensional UV position map and its corresponding UV texture maps according to the face key point feature vector, where the UV texture maps include a face UV texture map and a mask UV texture map; multiplying the face UV texture map by the mask UV texture map to obtain a target UV texture map; remapping the target UV texture map to obtain a face image with a mask; and training the head posture extraction network using the face image with a mask.

In an embodiment, the training method of the head posture extraction network further includes: calculating the loss value of the head posture extraction network according to the features output by the backbone network and the auxiliary network, combined with the labeled ground-truth values of the sample data; and optimizing the training parameters of the head posture extraction network according to the loss value.

In an embodiment, determining the driver's head movement according to the yaw angle includes: when the yaw angle is greater than or equal to a preset threshold, determining that the driver's head performs a turning movement, where the turning movement includes at least one of the following: turning left, turning right, turning down, and turning up.

In an embodiment of the present application, a computer-readable storage medium is also provided. A computer program is stored in the storage medium, and the computer program is configured to execute, when run, the steps in any one of the above method embodiments.

In an embodiment of the present application, an electronic device is also provided, including a memory and a processor. A computer program is stored in the memory, and the processor is configured to run the computer program to execute the steps in any one of the above method embodiments.

With the method for determining a driver's head movement provided by the embodiments of the present application, image data containing the driver's head posture is preprocessed to obtain continuous frame images; movement features of the driver's head posture are extracted from the continuous frame images; the yaw angle of the driver's head posture is determined from the movement features through a neural network model; and the driver's head movement is determined according to the magnitude of the yaw angle. This solves the problem in the related art that the head posture of a motor vehicle driver cannot be accurately recognized. With the method provided by the present application, the driver's head movement can be machine-recognized by a computer, and combined with the deep learning and training of the neural network model, the recognition process can be made more accurate.
Brief Description of the Drawings

The drawings described here are used to provide a further understanding of the present application and form a part of the present application. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of it. In the drawings:

Fig. 1 is a flowchart of an optional method for determining a driver's head movement according to an embodiment of the present application;

Fig. 2 is a schematic diagram of the training principle for the movement features of a driver's head image according to an embodiment of the present application.
Detailed Description

The present application will be described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, where no conflict arises, the embodiments in the present application and the features in the embodiments can be combined with one another.

It should be noted that the terms "first", "second", etc. in the specification and claims of the present application and in the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.

An embodiment of the present application provides a method for determining a driver's head movement. Fig. 1 is a flowchart of an optional method for determining a driver's head movement according to an embodiment of the present application. As shown in Fig. 1, the method includes:

Step S102: preprocessing image data containing the driver's head posture to obtain continuous frame images;

Step S104: extracting movement features of the driver's head posture from the continuous frame images;

Step S106: determining the yaw angle of the driver's head posture according to the movement features;

Step S108: determining the driver's head movement according to the magnitude of the yaw angle.

The above method solves the problem in the related art that the head posture of a motor vehicle driver cannot be accurately recognized. The driver's head movement can be machine-recognized by a computer, and combined with the deep learning and training of the neural network model, the recognition process can be made more accurate, overcoming the shortcomings of manual judgment in the driving-test process.
In an embodiment, before preprocessing the image data containing the driver's head posture, the method further includes: collecting, in real time through a camera, the image data containing the driver's head posture; and uploading the image data to a queue to be processed.

It should be noted that the above camera can be a video RGB camera installed in the vehicle, which mainly collects head movement images of the driver in the driver's seat in real time and transmits the collected image data to the queue.
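As an illustration only, a minimal Python sketch of this acquisition step, assuming OpenCV (cv2) and the standard-library queue module; the camera index and queue capacity are placeholder assumptions, not values from the application:

```python
import queue

import cv2  # OpenCV, assumed available

frame_queue = queue.Queue(maxsize=256)  # the queue of frames awaiting processing

def capture_frames(camera_index: int = 0) -> None:
    """Continuously grab frames from the in-vehicle RGB camera and enqueue them."""
    cap = cv2.VideoCapture(camera_index)
    try:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            if not frame_queue.full():
                frame_queue.put(frame)  # upload the frame to the pending queue
    finally:
        cap.release()
```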
In an embodiment, after preprocessing the image data containing the driver's head posture to obtain continuous frame images, the method further includes: normalizing the continuous frame images, and using a target detection algorithm to extract a region of interest from the head region in the continuous frame images.
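Purely as a sketch of this preprocessing step; the application does not name the target detection algorithm, so OpenCV's bundled Haar cascade is assumed here as a stand-in face detector, and normalization is assumed to mean scaling pixel values to [0, 1]:

```python
from typing import Optional

import cv2
import numpy as np

# Stand-in detector: the frontal-face Haar cascade shipped with OpenCV.
_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def extract_head_roi(frame: np.ndarray) -> Optional[np.ndarray]:
    """Normalize a frame and return the head region of interest, if detected."""
    normalized = frame.astype(np.float32) / 255.0  # assumed normalization to [0, 1]
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = _detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]  # take the first detection as the head region
    return normalized[y:y + h, x:x + w]
```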
In an embodiment, extracting the movement features of the driver's head posture from the continuous frame images includes: extracting the movement features of the driver's head posture from the region of interest.

In an embodiment, determining the yaw angle of the driver's head posture according to the movement features includes: scaling the image containing the region of interest to a preset size; inputting the image of the preset size into a head posture extraction network, where the head posture extraction network includes a backbone network and an auxiliary network; extracting the movement features of the driver's head posture through the backbone network to obtain a face key point feature vector; and calculating the yaw angle of the driver's head posture through the auxiliary network according to the face key point feature vector.

It should be noted that the preset size involved in the embodiments of the present application can be any size set according to practical requirements. For example, the embodiments of the present application use a fixed size of 112*122 as an example for explanation; the preset size can also be set to 114*144 or any other size, which is not limited in the embodiments of the present application.

The above process of determining the yaw angle can be implemented through the steps described in the following example. The region of interest is scaled to the fixed size of 112*122, and the fixed-size image is fed into the head posture extraction network. The head posture extraction network is specifically: two backbone networks are set up, namely backbone network 1 (backbone1) and backbone network 2 (backbone2). Backbone network 1 uses the first 6 layers of resnet18 to extract image features; the auxiliary network Auxiliary uses multi-layer Conv+BN+Act operator fusion to calculate the yaw angle of the head posture. Backbone network backbone2, connected to backbone network backbone1, adopts the last two layers of resnet18 and is used to extract the face key point feature vector; in this example the face key point feature vector is a one-dimensional 1*136 vector. Using the 18-layer convolutional neural network structure resnet18 keeps the amount of computation relatively small, making it convenient to quickly calculate the yaw angle of the driver's head posture. The face key point feature vector can also be a two-dimensional or multi-dimensional vector, which is not limited in the embodiments of the present application.
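As a non-authoritative sketch, one way this two-backbone layout could be realized in PyTorch; how the application counts resnet18's "layers", the exact Conv+BN+Act stack of the auxiliary network, and the channel sizes below are all assumptions:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class HeadPoseNet(nn.Module):
    """Sketch of the layout described above: backbone1 takes the early resnet18
    stages, backbone2 the late stages, and an auxiliary head of Conv+BN+Act
    blocks regresses the yaw angle."""

    def __init__(self) -> None:
        super().__init__()
        base = resnet18(weights=None)
        layers = list(base.children())  # conv1, bn1, relu, maxpool, layer1..layer4, avgpool, fc
        self.backbone1 = nn.Sequential(*layers[:6])   # assumed "first 6 layers": through layer2
        self.backbone2 = nn.Sequential(*layers[6:8])  # assumed "last two layers": layer3, layer4
        self.landmarks = nn.Sequential(               # 1*136 face key point vector
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, 136)
        )
        self.auxiliary = nn.Sequential(               # multi-layer Conv+BN+Act, then yaw
            nn.Conv2d(128, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, 32, 3, stride=2, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1)
        )

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        feat = self.backbone1(x)                       # shared image features
        keypoints = self.landmarks(self.backbone2(feat))
        yaw = self.auxiliary(feat)
        return keypoints, yaw
```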
From the one-dimensional 1*136 face key point feature vector obtained above, two-dimensional 2*68 coordinates are derived. The intrinsic parameters of the video RGB camera are known; (U, V, W) represents the position of a three-dimensional point in the world coordinate system, and R and t represent the rotation matrix and translation vector of the world coordinate system relative to the camera coordinate system, respectively. The coordinates (X, Y, Z) of the point in the camera coordinate system are calculated by formula (1):

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = R \begin{bmatrix} U \\ V \\ W \end{bmatrix} + t \qquad (1)$$

The equivalent form of formula (1) is as follows:

$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} R & t \\ \mathbf{0}^{T} & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix} \qquad (2)$$

From the above equations, the rotation matrix R and the translation vector t are obtained by solving the transformation relation matrix between the coordinates of 14 points of the target in the three-dimensional world coordinate system and the corresponding set of points projected into the two-dimensional image coordinate system. Assuming there is no radial distortion, the coordinates (x, y) of any point p in the image coordinate system are calculated by formula (3):

$$s \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \qquad (3)$$

where f_x and f_y are the focal lengths in the x and y directions, respectively; c_x and c_y are the optical centers; and s is a scaling factor. The direct linear transformation method is used to solve for the yaw angle of the driver's head posture.
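A hedged sketch of this solve step, assuming OpenCV's solvePnP stands in for the direct linear transformation solver and that the yaw is read from a Z-Y-X Euler decomposition of R; the application spells out neither choice:

```python
import cv2
import numpy as np

def estimate_yaw(model_points: np.ndarray, image_points: np.ndarray,
                 fx: float, fy: float, cx: float, cy: float) -> float:
    """Solve the pose from 3D-2D correspondences and return the yaw in degrees.

    model_points: (N, 3) points (U, V, W) in the world coordinate system
    image_points: (N, 2) corresponding key points (x, y) in the image
    """
    camera_matrix = np.array([[fx, 0.0, cx],
                              [0.0, fy, cy],
                              [0.0, 0.0, 1.0]], dtype=np.float64)
    dist_coeffs = np.zeros(4)  # no radial distortion, per the assumption above
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(model_points, dtype=np.float64),
        np.asarray(image_points, dtype=np.float64),
        camera_matrix, dist_coeffs)
    if not ok:
        raise RuntimeError("pose estimation failed")
    rotation, _ = cv2.Rodrigues(rvec)  # rotation matrix R
    # Rotation about the vertical (y) axis in a Z-Y-X Euler decomposition.
    yaw = np.degrees(np.arctan2(-rotation[2, 0],
                                np.hypot(rotation[2, 1], rotation[2, 2])))
    return float(yaw)
```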
In an embodiment, the training method of the head posture extraction network includes: establishing a two-dimensional UV position map and its corresponding UV texture maps according to the face key point feature vector, where the UV texture maps include a face UV texture map and a mask UV texture map; multiplying the face UV texture map by the mask UV texture map to obtain a target UV texture map; remapping the target UV texture map to obtain a face image with a mask; and training the head posture extraction network using the face image with a mask.

In an embodiment, the training method of the head posture extraction network further includes: calculating the loss value of the head posture extraction network according to the features output by the backbone network and the auxiliary network, combined with the labeled ground-truth values of the sample data; and optimizing the training parameters of the head posture extraction network according to the loss value. The head posture extraction network can include a backbone network and an auxiliary network; it is a convolutional neural network structure trained using sample data, where the sample data includes sample images and the yaw angles corresponding to the head postures in the sample images.

Further, in combination with the above example, when training the head posture extraction network, the mask texture map can be incorporated into training so that the network can recognize the head movements of a driver wearing a mask. For example, the 68 face key point information can be input into the 3D face reconstruction network PRNet to obtain the UV position map and the corresponding UV texture map; the face UV texture map is multiplied by the mask UV texture map to obtain a new UV texture map; the new UV texture map is remapped with the face UV texture map to obtain a face image with a mask, and the resulting masked face image is added to the training of the head posture extraction network.
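A simplified sketch of this augmentation, assuming the UV texture maps and an image-to-UV lookup table have already been produced (for example, by PRNet); the lookup-table form of the remapping is an assumption made so that cv2.remap applies directly:

```python
import cv2
import numpy as np

def masked_face_image(face_uv_texture: np.ndarray,
                      mask_uv_texture: np.ndarray,
                      image_to_uv: np.ndarray) -> np.ndarray:
    """Blend a mask texture onto a face UV texture, then remap to image space.

    face_uv_texture: (H, W, 3) face texture in UV space, values in [0, 255]
    mask_uv_texture: (H, W, 3) mask texture in UV space, scaled to [0, 1]
    image_to_uv:     (H, W, 2) assumed precomputed lookup giving, for each
                     output image pixel, its source coordinate in UV space
    """
    # Multiply the face UV texture by the mask UV texture (the target texture).
    target_uv = (face_uv_texture.astype(np.float32)
                 * mask_uv_texture.astype(np.float32))
    # Remap the target UV texture back into image space.
    map_x = image_to_uv[..., 0].astype(np.float32)
    map_y = image_to_uv[..., 1].astype(np.float32)
    masked = cv2.remap(target_uv, map_x, map_y, interpolation=cv2.INTER_LINEAR)
    return np.clip(masked, 0, 255).astype(np.uint8)
```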
Fig. 2 is a schematic diagram of the training principle for the movement features of a driver's head image according to an embodiment of the present application. As shown in Fig. 2, during training, in the image preprocessing stage, rotation, affine, color-channel, and color-space transformations are added while keeping the face-region coordinates and the key-point coordinates in the image consistently transformed. To address data imbalance, after the head posture extraction network is computed, samples deviating strongly from the ground truth are multiplied by a penalty weight (weight) according to the sample distribution; finally, the loss value is computed (compute loss) from the features of the backbone network backbone2 and the auxiliary network Auxiliary combined with the labeled ground-truth key points of the samples.
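An illustrative PyTorch sketch of such a penalty-weighted loss; the L2 form, the deviation threshold, and the weight value below are assumptions, since the text specifies only that strongly deviating samples are multiplied by a penalty weight:

```python
import torch

def weighted_keypoint_loss(pred_keypoints: torch.Tensor,
                           true_keypoints: torch.Tensor,
                           penalty_weight: float = 2.0,
                           deviation_threshold: float = 0.1) -> torch.Tensor:
    """Per-sample L2 loss; samples whose deviation from the labeled ground
    truth exceeds a threshold are multiplied by a penalty weight."""
    per_sample = ((pred_keypoints - true_keypoints) ** 2).mean(dim=1)
    weights = torch.where(per_sample > deviation_threshold,
                          torch.full_like(per_sample, penalty_weight),
                          torch.ones_like(per_sample))
    return (weights * per_sample).mean()
```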
In an embodiment, determining the driver's head movement according to the yaw angle includes: when the yaw angle is greater than or equal to a preset threshold, determining that the driver's head performs a turning movement, where the turning movement includes at least one of the following: turning left, turning right, turning down, turning up.

Turning left applies to judging whether the driver has checked the left rearview mirror, and turning right to whether the driver has checked the right rearview mirror. The yaw angle threshold in these two cases is generally set at 20-30 degrees, according to the actual environment of the driving test. Judging whether the driver has turned the head left or right is specifically: within the specified time, if the actual value of the yaw angle is greater than or equal to the set yaw angle threshold, the driver is determined to have turned the head; otherwise, the driver is determined not to have turned the head.

In an example, before the driving-test vehicle starts moving, judging whether the driver has checked the left rearview mirror is specifically: the yaw angle threshold is preset to 25 degrees, where a positive angle indicates turning right and a negative angle indicates turning left; the yaw angle information of the driver's head posture is acquired in real time, and if the yaw angle is -25 degrees and the test vehicle starts moving within 5 s, the driver is determined to have performed the pre-driving check of the left rearview mirror.
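A minimal sketch of this decision rule, using the 25-degree threshold, the sign convention (positive right, negative left), and the 5-second window from the example above:

```python
YAW_THRESHOLD_DEG = 25.0  # positive = turning right, negative = turning left

def checked_left_mirror(yaw_angles_deg: list[float],
                        vehicle_start_delay_s: float) -> bool:
    """Return True if the driver is deemed to have checked the left rearview
    mirror before driving: the yaw angle reaches -25 degrees (or beyond) and
    the test vehicle starts moving within 5 seconds."""
    turned_left = any(yaw <= -YAW_THRESHOLD_DEG for yaw in yaw_angles_deg)
    return turned_left and vehicle_start_delay_s <= 5.0
```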
The method of the embodiments of the present application collects visual image data of the driver during the driving test, annotates 68 key points on the face data, performs deep learning on the visual image data of the driver's head, extracts a 1*136 feature vector of the driver's head posture, calculates the yaw angle from that feature vector, and then judges the driver's head movement through continuous frame images.

Through the network training process, the method of the embodiments of the present application augments the data and can accurately recognize the driver's head movements during driving, so it can be effectively applied to the various items of the driving test and used as a basis for judging. In addition, masked-face data is added for training in the method of the embodiments of the present application, which further improves detection and recognition accuracy and is easy to implement.

Optionally, in this embodiment, those of ordinary skill in the art can understand that all or some of the steps in the various methods of the above embodiments can be completed by instructing hardware related to a terminal device through a program, and the program can be stored in a computer-readable storage medium. The storage medium can include: a flash disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, and the like.

The serial numbers of the above embodiments of the present application are for description only and do not represent the superiority or inferiority of the embodiments.

If the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they can be stored in the above computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.

In the above embodiments of the present application, the description of each embodiment has its own emphasis. For parts not detailed in one embodiment, reference can be made to the related descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed client can be implemented in other ways. The device embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other division methods in actual implementation. For example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of units or modules may be electrical or in other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the various embodiments of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

The above are only preferred implementations of the present application. It should be pointed out that, for those of ordinary skill in the art, several improvements and refinements can be made without departing from the principles of the present application, and these improvements and refinements should also be regarded as falling within the scope of protection of the present application.

Claims (10)

  1. A method for determining a driver's head movement, characterized by comprising:
    preprocessing image data containing the driver's head posture to obtain continuous frame images;
    extracting movement features of the driver's head posture from the continuous frame images;
    determining the yaw angle of the driver's head posture according to the movement features;
    determining the driver's head movement according to the magnitude of the yaw angle.
  2. The method according to claim 1, characterized in that, before preprocessing the image data containing the driver's head posture, the method further comprises:
    collecting, in real time through a camera, the image data containing the driver's head posture;
    uploading the image data to a queue to be processed.
  3. The method according to claim 1, characterized in that, after preprocessing the image data containing the driver's head posture to obtain continuous frame images, the method further comprises:
    normalizing the continuous frame images, and extracting a region of interest from the head region in the continuous frame images using a target detection algorithm.
  4. The method according to claim 3, characterized in that extracting the movement features of the driver's head posture from the continuous frame images comprises:
    extracting the movement features of the driver's head posture from the region of interest.
  5. The method according to claim 4, characterized in that determining the yaw angle of the driver's head posture according to the movement features comprises:
    scaling the image containing the region of interest to a preset size;
    inputting the image of the preset size into a head posture extraction network, wherein the head posture extraction network comprises a backbone network and an auxiliary network;
    extracting the movement features of the driver's head posture through the backbone network to obtain a face key point feature vector;
    calculating the yaw angle of the driver's head posture through the auxiliary network according to the face key point feature vector.
  6. The method according to claim 5, characterized in that the training method of the head posture extraction network comprises:
    establishing a two-dimensional UV position map and its corresponding UV texture maps according to the face key point feature vector, wherein the UV texture maps comprise a face UV texture map and a mask UV texture map;
    multiplying the face UV texture map by the mask UV texture map to obtain a target UV texture map;
    remapping the target UV texture map to obtain a face image with a mask;
    training the head posture extraction network using the face image with a mask.
  7. The method according to claim 6, characterized in that the training method of the head posture extraction network further comprises:
    calculating the loss value of the head posture extraction network according to the features output by the backbone network and the auxiliary network, combined with the labeled ground-truth values of the sample data;
    optimizing the training parameters of the head posture extraction network according to the loss value.
  8. The method according to any one of claims 1 to 7, characterized in that determining the driver's head movement according to the yaw angle comprises:
    when the yaw angle is greater than or equal to a preset threshold, determining that the driver's head performs a turning movement, wherein the turning movement comprises at least one of the following: turning left, turning right, turning down, turning up.
  9. A computer-readable storage medium, characterized in that a computer program is stored in the storage medium, wherein the computer program is configured to execute, when run, the method according to any one of claims 1 to 8.
  10. An electronic device, comprising a memory and a processor, characterized in that a computer program is stored in the memory, and the processor is configured to run the computer program to execute the method according to any one of claims 1 to 8.
PCT/CN2022/087552 2021-05-28 2022-04-19 Method for determining a driver's head movement, storage medium, and electronic device WO2022247527A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110591032.3A CN113239861B (zh) 2021-05-28 2021-05-28 Method for determining a driver's head movement, storage medium, and electronic device
CN202110591032.3 2021-05-28

Publications (1)

Publication Number Publication Date
WO2022247527A1 true WO2022247527A1 (zh) 2022-12-01

Family

ID=77135546

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/087552 WO2022247527A1 (zh) 2021-05-28 2022-04-19 Method for determining a driver's head movement, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN113239861B (zh)
WO (1) WO2022247527A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239861B (zh) * 2021-05-28 2024-05-28 多伦科技股份有限公司 驾驶员头部动作的确定方法、存储介质、电子装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160237A (zh) * 2019-12-27 2020-05-15 智车优行科技(北京)有限公司 Head pose estimation method and device, electronic device, and storage medium
US20200218883A1 (en) * 2017-12-25 2020-07-09 Beijing Sensetime Technology Development Co., Ltd. Face pose analysis method, electronic device, and storage medium
CN111539333A (zh) * 2020-04-24 2020-08-14 湖北亿咖通科技有限公司 Driver gaze region recognition and distraction detection method
CN112329566A (zh) * 2020-10-26 2021-02-05 易显智能科技有限责任公司 Visual perception system for accurately perceiving the head movements of a motor vehicle driver
CN113239861A (zh) * 2021-05-28 2021-08-10 多伦科技股份有限公司 Method for determining a driver's head movement, storage medium, and electronic device


Also Published As

Publication number Publication date
CN113239861A (zh) 2021-08-10
CN113239861B (zh) 2024-05-28

Similar Documents

Publication Publication Date Title
US10684681B2 (en) Neural network image processing apparatus
CN108073914B Method for annotating animal facial key points
CN105574518B Method and device for face liveness detection
JP6664163B2 Image identification method, image identification device, and program
Krull et al. Learning analysis-by-synthesis for 6D pose estimation in RGB-D images
CN103530599B Method and system for distinguishing a real face from a picture face
CN107038422B Fatigue state recognition method based on deep learning with spatial geometric constraints
JP7015152B2 Processing apparatus, method, and program for key point data
CN108256421A Real-time recognition method, system, and device for dynamic gesture sequences
US20120308124A1 (en) Method and System For Localizing Parts of an Object in an Image For Computer Vision Applications
CN111144207B Human body detection and tracking method based on multimodal information perception
WO2005111936A1 (ja) Parameter estimation method, parameter estimation device, and collation method
CN108875586B Functional limb rehabilitation training detection method based on multi-feature fusion of depth images and skeletal data
WO2020237942A1 (zh) Method and device for detecting the 3D position of a pedestrian, and vehicle-mounted terminal
CN108470178B Depth map saliency detection method combined with a depth confidence evaluation factor
CN106934380A Indoor pedestrian detection and tracking method based on HOG and MeanShift algorithms
CN109359577A Machine-learning-based system for counting people against complex backgrounds
CN110135277B Human behavior recognition method based on a convolutional neural network
WO2022247527A1 (zh) Method for determining a driver's head movement, storage medium, and electronic device
CN114333046A Dance movement scoring method, apparatus, device, and storage medium
CN103544478A Omnidirectional face detection method and system
CN113963237B Model training and mask-wearing state detection method, electronic device, and storage medium
CN107122726A Multi-pose pedestrian detection method
CN111310720A Pedestrian re-identification method and system based on graph metric learning
CN115049842B Aircraft skin image damage detection and 2D-3D positioning method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22810260

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22810260

Country of ref document: EP

Kind code of ref document: A1