CN118238823A - Head posture acquisition method, acquisition device, acquisition system and vehicle - Google Patents


Publication number
CN118238823A
Authority
CN
China
Prior art keywords
image, target, camera, head, two-dimensional code
Legal status
Pending
Application number
CN202410185266.1A
Other languages
Chinese (zh)
Inventor
杨曦
贾澜鹏
罗林
Current Assignee
Great Wall Motor Co Ltd
Original Assignee
Great Wall Motor Co Ltd
Application filed by Great Wall Motor Co Ltd filed Critical Great Wall Motor Co Ltd
Priority to CN202410185266.1A priority Critical patent/CN118238823A/en
Publication of CN118238823A publication Critical patent/CN118238823A/en
Pending legal-status Critical Current


Abstract

The application provides a head pose acquisition method, an acquisition device, an acquisition system, and a vehicle. The method comprises the following steps: acquiring an initial head pose of a user collected by an inertial measurement unit at an initial moment and a first image of a target shooting object collected by a first camera at the initial moment, wherein the inertial measurement unit and the first camera are worn on the head of the user; acquiring a second image of the target shooting object collected by the first camera at a target moment; generating a head pose variation based on the image corners of the first image and the image corners of the second image; and generating a target head pose of the user at the target moment based on the initial head pose and the head pose variation. The method can accurately acquire the head pose of the user.

Description

Head posture acquisition method, acquisition device, acquisition system and vehicle
Technical Field
The present application relates to the field of vehicles, and more particularly, to a head pose acquisition method, an acquisition device, an acquisition system, and a vehicle.
Background
With the continuous development of automobile technology, vehicle cabins are becoming increasingly intelligent and user-friendly. The vehicle cabin enhances human-vehicle interaction through various technologies; future vehicles will no longer be mere means of transport, so intelligence and personalization are important for attracting users.
At present, the head pose of a user is often used as input to realize multiple functions in the vehicle cabin. Since the implementation of personalized functions depends on the head pose, and demand for such functions in the cabin is increasing, how to accurately acquire the user's head pose has become a problem to be solved.
Disclosure of Invention
The application provides a head gesture acquisition method, an acquisition device, an acquisition system and a vehicle.
In a first aspect, there is provided a method of acquiring a head pose, the method comprising: acquiring an initial head gesture of a user acquired at an initial moment by an inertial measurement unit and a first image of a target shooting object acquired at the initial moment by a first camera; the inertial measurement unit and the first camera are worn on the head of the user; acquiring a second image of the target shooting object acquired by the first camera at a target moment; generating a head posture variation based on the image corner of the first image and the image corner of the second image; and generating a target head posture of the user at the target moment based on the initial head posture and the head posture change amount.
According to the technical scheme, the head pose variation is determined from the first image of the target shooting object acquired at the initial moment and the second image of the target shooting object acquired at the target moment, and the target head pose of the user at the target moment is generated based on this variation together with the initial head pose collected by the inertial measurement unit at the initial moment. Because the head pose variation is determined by calculation and the inertial measurement unit is used only once, at the initial moment, the influence of the unit's instability on the acquired head pose is avoided as much as possible, so that the head pose of the user can be acquired accurately.
In addition, because the inertial measurement unit needs a long wait before each reading, determining the head pose variation by calculation is faster than reading the head pose from the inertial measurement unit at different moments and differencing those readings. The acquisition speed is therefore improved, and the head pose of the user can be acquired both accurately and rapidly.
With reference to the first aspect, in some possible implementations, the target shooting object includes a two-dimensional code array, and the method further includes: converting the first image and the second image into a target coordinate system of a target two-dimensional code in the two-dimensional code array to obtain a first two-dimensional code image and a second two-dimensional code image; and generating the head posture change amount based on the image corner of the first two-dimensional code image and the image corner of the second two-dimensional code image.
According to the technical scheme, when the target shooting object includes a two-dimensional code array, the first two-dimensional code image and the second two-dimensional code image are obtained by converting the coordinate systems of the images, and the head pose variation is then generated based on the image corners of the first two-dimensional code image and the image corners of the second two-dimensional code image. The coordinate-system conversion improves the accuracy of the head pose variation, and thus the accuracy of the acquired head pose.
With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, the method further includes: acquiring an external reference matrix of each two-dimensional code in the first image, an external reference matrix of each two-dimensional code in the second image and an external reference matrix of the target two-dimensional code; based on the external parameter matrix of each two-dimensional code in the first image and the external parameter matrix of the target two-dimensional code, converting the first image to the target coordinate system to obtain the first two-dimensional code image; and converting the second image to the target coordinate system based on the external parameter matrix of each two-dimensional code in the second image and the external parameter matrix of the target two-dimensional code to obtain the second two-dimensional code image.
According to the technical scheme, the extrinsic matrix of each two-dimensional code in the first image and in the second image is acquired, and together with the extrinsic matrix of the target two-dimensional code, the first image and the second image are converted to the target coordinate system to obtain the first two-dimensional code image and the second two-dimensional code image. The coordinate-system conversion helps to improve the accuracy of the head pose variation, and thus the accuracy of the acquired head pose.
With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, the method further includes: acquiring a first face image of the user acquired by a second camera at the initial moment; converting the initial head gesture into a coordinate system of the second camera to obtain a reference head gesture corresponding to the first face image; acquiring a second face image of the user acquired by the second camera at the target moment; converting the target head gesture into a coordinate system of the second camera to obtain a reference head gesture corresponding to the second face image; and generating a mapping relation between the face image and the head gesture based on the first face image, the reference head gesture corresponding to the first face image, the second face image and the reference head gesture corresponding to the second face image.
According to the technical scheme, the initial head pose is converted into the coordinate system of the second camera to obtain the reference head pose corresponding to the first face image, and the target head pose is likewise converted to obtain the reference head pose corresponding to the second face image; a mapping relation between face images and head poses is then generated. Training a model with this mapping relation gives the trained model a better recognition effect.
With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, the method further includes: acquiring a first conversion matrix of the target shooting object and the first camera, a second conversion matrix of the target shooting object and the third camera, a third conversion matrix of a calibration plate and the third camera and a fourth conversion matrix of the calibration plate and the second camera; determining a first target conversion matrix of the first camera and the second camera based on the first conversion matrix, the second conversion matrix, the third conversion matrix and the fourth conversion matrix; and converting the initial head gesture to a coordinate system of the second camera based on the first target conversion matrix to obtain a reference head gesture corresponding to the first face image.
According to the technical scheme, the first target conversion matrix is determined through the first conversion matrix, the second conversion matrix, the third conversion matrix and the fourth conversion matrix, and then the initial head gesture is converted to the coordinate system of the second camera based on the first target conversion matrix, so that the reference head gesture corresponding to the first face image is obtained.
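As an illustrative sketch of this chaining (not the patent's implementation): assuming each conversion matrix is a 4×4 homogeneous transform, and assuming the directions object→first camera, object→third camera, plate→third camera, and plate→second camera for the first through fourth conversion matrices respectively, the first target conversion matrix can be composed as below. All matrix values are invented for the demonstration.

```python
import numpy as np

def first_target_matrix(m1, m2, m3, m4):
    """Chain the first camera into the second camera's frame.
    Assumed directions: m1 object->cam1, m2 object->cam3,
    m3 plate->cam3, m4 plate->cam2 (4x4 homogeneous matrices)."""
    return m4 @ np.linalg.inv(m3) @ m2 @ np.linalg.inv(m1)

def trans(x, y, z):
    """Pure-translation homogeneous transform (helper for the example)."""
    m = np.eye(4)
    m[:3, 3] = [x, y, z]
    return m

# Invented ground truth: world->camera transforms and a plate pose,
# with the target object's frame taken as the world frame.
c1, c2, c3 = trans(1, 0, 0), trans(0, 2, 0), trans(0, 0, 3)
tp = trans(0.5, 0.5, 0.0)          # plate -> world
m1, m2 = c1, c3                    # object -> cam1, object -> cam3
m3, m4 = c3 @ tp, c2 @ tp          # plate -> cam3, plate -> cam2
t12 = first_target_matrix(m1, m2, m3, m4)
```

Under these direction assumptions the plate and object terms cancel, leaving the cam1→cam2 transform; if the patent's matrices point the other way, each factor is simply inverted.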
With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, the method further includes: acquiring a first conversion matrix of the target shooting object and the first camera and a fifth conversion matrix of the target shooting object and the second camera; determining a second target conversion matrix of the first camera and the second camera based on the first conversion matrix and the fifth conversion matrix; and converting the initial head gesture to a coordinate system of the second camera based on the second target conversion matrix to obtain a reference head gesture corresponding to the first face image.
According to the technical scheme, the second target conversion matrix is determined based on the first conversion matrix and the fifth conversion matrix, and the initial head gesture is converted into the coordinate system of the second camera through the second target conversion matrix, so that the reference head gesture corresponding to the first face image is obtained.
With reference to the first aspect and the foregoing implementation manner, in some possible implementation manners, the second camera is an infrared camera.
According to the technical scheme, the face image shot by the infrared camera is clearer, so the accuracy of the acquired head pose data is higher. In addition, a neural network model trained with the acquired head poses as training data achieves a more accurate recognition effect.
In a second aspect, there is provided a head pose acquisition device comprising: the acquisition module is used for acquiring the initial head gesture of the user acquired by the inertial measurement unit at the initial moment and a first image of a target shooting object acquired by the first camera at the initial moment; the inertial measurement unit and the first camera are worn on the head of the user; the acquisition module is used for acquiring a second image of the target shooting object acquired by the first camera at the target moment; the generation module is used for generating a head posture variation based on the image corner of the first image and the image corner of the second image; and the determining module is used for generating a target head gesture of the user at the target moment based on the initial head gesture and the head gesture change amount.
In a third aspect, a head pose acquisition system is provided, including an inertial measurement unit, a first camera, and a controller; the inertial measurement unit is used for acquiring the initial head posture of the user at the initial moment; the first camera is used for acquiring a first image of a target shooting object at the initial moment and acquiring a second image of the target shooting object at the target moment; the controller is used for generating a head posture change amount based on the image corner of the first image and the image corner of the second image, and generating a target head posture of the user at the target moment based on the initial head posture and the head posture change amount.
In a fourth aspect, a vehicle is provided that includes a memory for storing executable program code and a processor; the processor is configured to invoke and run the executable program code from the memory, so that the vehicle performs the acquisition method of the first aspect or any one of the possible implementation manners of the first aspect.
In a fifth aspect, a computer readable storage medium is provided, the computer readable storage medium storing computer program code which, when run on a computer, causes the computer to perform the acquisition method of the first aspect or any one of the possible implementations of the first aspect.
In a sixth aspect, there is provided a computer program product comprising: computer program code which, when run on a computer, causes the computer to perform the acquisition method of the first aspect or any one of the possible implementations of the first aspect.
Drawings
FIG. 1 is a schematic flow chart of a method for acquiring head pose according to an embodiment of the present application;
FIG. 2 is a schematic illustration of a first image and a second image provided by an embodiment of the present application;
FIG. 3 is a schematic view of a scene for generating head pose variation provided by an embodiment of the present application;
fig. 4 is a schematic view of a scene of a method for acquiring a head gesture according to an embodiment of the present application;
FIG. 5 is a schematic view of another head pose acquisition method according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of another head pose acquisition method provided by an embodiment of the present application;
FIG. 7 is a schematic flow chart of a method for acquiring a head pose according to an embodiment of the present application;
FIG. 8 is a schematic frame diagram of a head pose acquisition system according to an embodiment of the present application;
FIG. 9 is a schematic view of a head pose acquisition system according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a head posture acquisition device according to an embodiment of the present application;
Fig. 11 is a schematic structural view of a vehicle according to an embodiment of the present application.
Detailed Description
The technical scheme of the application will be clearly and thoroughly described below with reference to the accompanying drawings. In the description of the embodiments of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. The text "and/or" merely describes an association relation between the associated objects and indicates that three relations may exist; for example, A and/or B may indicate three cases: A exists alone, A and B exist together, or B exists alone. Furthermore, in the description of the embodiments of the present application, "plural" means two or more.
The terms "first", "second", and the like are used below for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features.
In practical applications, the head pose of the user is often used as input to realize multiple functions in the vehicle cabin. For example, collected head poses are used as a training set for a neural network that implements personalized functions in the cabin. At present, the head pose of a user is acquired through an inertial measurement unit, but the unit needs a long wait when reading data, so the acquisition speed of the head pose is difficult to guarantee during data collection; in addition, repeated readings from the inertial measurement unit carry large errors, so the accuracy of the head pose is difficult to guarantee. In view of this, the present application proposes a head pose acquisition method, an acquisition device, a vehicle, and a storage medium, by which the speed and accuracy of acquiring the head pose can be improved.
Fig. 1 is a schematic flow chart of a method for acquiring a head pose according to an embodiment of the present application.
By way of example, the method shown in FIG. 1 may be performed by a controller or chip of a vehicle; wherein the controller may be the controller 820 shown in fig. 8.
Illustratively, as shown in FIG. 1, the method 100 includes the following processes:
s110, acquiring an initial head posture of a user acquired by the inertial measurement unit at an initial moment and a first image of a target shooting object acquired by the first camera at the initial moment.
Optionally, an inertial measurement unit (Inertial Measurement Unit, IMU) is used to measure the initial head pose of the user, the initial head pose including a pitch angle, a yaw angle, and a roll angle. The inertial measurement unit comprises three single-axis accelerometers and three single-axis gyroscopes; the accelerometers detect acceleration signals of the three independent axes in the corresponding coordinate system, the gyroscopes detect angular-velocity signals in that coordinate system, and the initial head pose of the user is determined by measuring the angular velocity and acceleration of the user's head in three-dimensional space. It will be appreciated that, to improve the accuracy of the data collected by the inertial measurement unit, more sensors may be provided for each axis; the number of sensors in the inertial measurement unit may be determined according to the actual situation and is not specifically limited herein.
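As a rough sketch of how gyroscope readings become an orientation (not the patent's algorithm): small-angle Euler integration of the three angular-velocity channels accumulates pitch, yaw, and roll over time. The sample rate and angular-velocity values below are invented; a real IMU driver would also fuse the accelerometer signals.

```python
import numpy as np

def integrate_gyro(initial_angles, gyro_samples, dt):
    """Accumulate angular-velocity samples (rad/s, one triple per tick)
    into [pitch, yaw, roll] angles by small-angle Euler integration."""
    angles = np.asarray(initial_angles, dtype=float)
    for omega in gyro_samples:
        angles = angles + np.asarray(omega, dtype=float) * dt
    return angles

# Head starts level; the gyro reports a steady 0.2 rad/s yaw for 1 s
# (100 samples at 10 ms), so the yaw estimate ends near 0.2 rad.
pose = integrate_gyro([0.0, 0.0, 0.0], [[0.0, 0.2, 0.0]] * 100, dt=0.01)
```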
It should be noted that, the inertial measurement unit and the first camera are worn on the head of the user, and the inertial measurement unit and the first camera are rigidly connected together. The inertial measurement unit and the first camera are in the same coordinate system by adjusting the internal parameters of the inertial measurement unit and the internal parameters of the first camera.
Further, since the inertial measurement unit is in the same coordinate system as the first camera, the inertial measurement unit acquires an initial head pose of the user in the first camera coordinate system at an initial time. The first camera acquires a first image of a target shooting object in a first camera coordinate system at an initial moment.
Alternatively, the target photographing object may include, but is not limited to, a two-dimensional code image, a calibration plate, a two-dimensional code array, and the like, and is not particularly limited herein.
S120, acquiring a second image of the target shooting object acquired by the first camera at the target moment.
It can be appreciated that since the inertial measurement unit is in the same coordinate system as the first camera, the first camera captures a second image of the target photographic subject in the first camera coordinate system at the target instant.
Alternatively, the target moment may be any moment at which the target head pose of the user needs to be acquired. For example, if the initial time is the first time, the target time may be the second time, the third time, the fourth time, and so on in order, which is not particularly limited herein.
And S130, generating a head posture change amount based on the image corner of the first image and the image corner of the second image.
In one example, the target photographic subject includes a two-dimensional code array, and the head pose variation is generated based on the image corner of the first image and the image corner of the second image. Specifically, the first image and the second image are converted into a target coordinate system of a target two-dimensional code in the two-dimensional code array, and a first two-dimensional code image and a second two-dimensional code image are obtained.
It can be understood that the first image and the second image comprise at least one two-dimensional code, and the head posture change amount is determined by converting the coordinate systems of the two-dimensional codes in the first image and the second image into the same coordinate system. The method comprises the steps of obtaining an external reference matrix of each two-dimensional code in a first image, an external reference matrix of each two-dimensional code in a second image and an external reference matrix of a target two-dimensional code, converting the first image into a target coordinate system based on the external reference matrix of each two-dimensional code in the first image and the external reference matrix of the target two-dimensional code, obtaining a first two-dimensional code image, and converting the second image into the target coordinate system based on the external reference matrix of each two-dimensional code in the second image and the external reference matrix of the target two-dimensional code, so as to obtain a second two-dimensional code image.
Fig. 2 is a schematic diagram of a first image and a second image provided by an embodiment of the present application.
As shown in fig. 2, the first image includes the two-dimensional codes numbered 1 to 6, and the second image includes the two-dimensional codes numbered 5 to 10. The extrinsic matrix of the two-dimensional code numbered i is Rex i, i=1 … 10, and the extrinsic matrix of the target two-dimensional code is Rex 5. The extrinsic matrix of each two-dimensional code is multiplied with the extrinsic matrix of the target two-dimensional code, that is, the coordinate system of each two-dimensional code in the first image is converted to the target coordinate system and the coordinate system of each two-dimensional code in the second image is converted to the target coordinate system, so that the first two-dimensional code image and the second two-dimensional code image are obtained.
In this manner, the extrinsic matrix of each two-dimensional code in the first image and in the second image is acquired, and together with the extrinsic matrix of the target two-dimensional code, the first image and the second image are converted into the target coordinate system to obtain the first and second two-dimensional code images. The conversion places both images in the same coordinate system, so that the head pose variation is determined from different images in one coordinate system; this improves the accuracy of the head pose variation and hence the accuracy of the acquired head pose.
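A minimal sketch of this conversion step, assuming each extrinsic matrix Rex i is a 4×4 homogeneous transform from that code's frame to the camera frame, so that left-multiplying by the inverse of the target code's extrinsic re-expresses every code in the target coordinate system. The pose values are invented for illustration.

```python
import numpy as np

def to_target_frame(extrinsics, target_id):
    """Convert every two-dimensional code pose into the coordinate
    system of the target code. extrinsics: {id: 4x4 code->camera}."""
    t_inv = np.linalg.inv(extrinsics[target_id])
    return {i: t_inv @ m for i, m in extrinsics.items()}

def translation_pose(x, y, z):
    """Pure-translation pose (helper for the invented example)."""
    m = np.eye(4)
    m[:3, 3] = [x, y, z]
    return m

# Codes 5 and 6 as seen by the first camera (invented poses); code 5
# is the target, so it becomes the origin of the target frame.
ext = {5: translation_pose(0.0, 0.0, 1.0), 6: translation_pose(0.1, 0.0, 1.0)}
in_target = to_target_frame(ext, target_id=5)
```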
Further, after the first two-dimensional code image and the second two-dimensional code image are obtained, based on the image corner of the first two-dimensional code image and the image corner of the second two-dimensional code image, the head posture change amount is generated. Specifically, because the first two-dimensional code image and the second two-dimensional code image are in the target coordinate system, according to the corner points in the first two-dimensional code image and the second two-dimensional code image, namely, the corresponding relation of the characteristic pixel points between the first two-dimensional code image and the second two-dimensional code image is determined, so that the position relation of the first two-dimensional code image and the second two-dimensional code image is determined, and further the head posture change quantity is generated.
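The patent does not name the algorithm that turns corresponding corner points into a pose change; one standard choice is the Kabsch (SVD) method, sketched below. The corner coordinates and the 30-degree test rotation are invented for the demonstration.

```python
import numpy as np

def rotation_from_corners(pts_a, pts_b):
    """Best-fit rotation R with pts_b ~ pts_a rotated by R (Kabsch).
    pts_a, pts_b: (N, 3) corresponding corner points in one frame."""
    a = pts_a - pts_a.mean(axis=0)
    b = pts_b - pts_b.mean(axis=0)
    u, _, vt = np.linalg.svd(a.T @ b)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against reflections
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T

# Invented check: four square corners rotated 30 degrees about z.
theta = np.radians(30.0)
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
corners1 = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], float)
corners2 = corners1 @ rz.T      # each corner rotated by rz
R = rotation_from_corners(corners1, corners2)   # recovers rz
```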
In this manner, when the target shooting object includes a two-dimensional code array, the first and second two-dimensional code images are obtained by converting the coordinate systems of the images, and the head pose variation is generated based on their image corners; the coordinate-system conversion improves the accuracy of the head pose variation, and thus the accuracy of the acquired head pose.
In another example, the target shooting object includes a two-dimensional code array; a conversion relation between each two-dimensional code in the array and the target coordinate system can be obtained, the first image and the second image are converted into the target coordinate system through these conversion relations to obtain the first and second two-dimensional code images, and the head pose variation is generated based on the image corners of the first two-dimensional code image and the image corners of the second two-dimensional code image.
Specifically, each two-dimensional code in the two-dimensional code array has a corresponding world coordinate system, and the conversion relation between each world coordinate system and the target coordinate system needs to be acquired. A third camera can be used to shoot the arranged two-dimensional code array while ensuring continuity between the shot images, that is, each image contains at least one two-dimensional code of the previous image. The two-dimensional codes in the array are identified so that each two-dimensional code in each shot image has a unique code, the two-dimensional codes in the shot images are detected with a two-dimensional code detection algorithm, and the extrinsic matrix Rex i of each two-dimensional code in the array is calculated, where i=1 … n. Taking the two-dimensional code with the smallest label as the target two-dimensional code, the conversion relation from each two-dimensional code to the target coordinate system is expressed as follows:
Rtrans i = (Rex t)^-1 * Rex i
wherein Rex t is the extrinsic matrix of the target two-dimensional code and Rtrans i represents the conversion relation from the two-dimensional code with label i to the target coordinate system. Each two-dimensional code in the first image and each two-dimensional code in the second image is obtained, the first and second two-dimensional code images are determined according to the conversion relation corresponding to each two-dimensional code, and the head pose variation is then generated according to the image corners of the first two-dimensional code image and the image corners of the second two-dimensional code image.
Fig. 3 is a schematic view of a scene for generating a head posture change amount according to an embodiment of the present application.
Illustratively, fig. 3 (a) is a schematic view of the scene for acquiring the head pose variation, fig. 3 (b) is a schematic view of the first image corners, and fig. 3 (c) is a schematic view of the second image corners.
As shown in (a) of fig. 3, the target shooting object includes a two-dimensional code array, a first camera collects a first image of the two-dimensional code array at an initial time, collects a second image of the two-dimensional code array at a target time, and converts the first image and the second image into a target coordinate system of a target two-dimensional code in the two-dimensional code array by obtaining a conversion relationship corresponding to each two-dimensional code in the two-dimensional code array, so as to obtain a first two-dimensional code image and a second two-dimensional code image.
Further, referring to (b) and (c) in fig. 3, a first shooting view angle of the first camera at the initial moment is determined according to the first image corners of the first two-dimensional code image acquired at the initial moment, and a second shooting view angle of the first camera at the target moment is determined according to the second image corners of the second two-dimensional code image acquired at the target moment. The change between the first shooting view angle and the second shooting view angle is the change between the camera pose at the initial moment and the camera pose at the target moment.
In addition, the first camera is worn on the head of the user, and the conversion relation from the user coordinate system to the first camera coordinate system is RT H2Cam; since the first camera is rigidly connected to the head of the user, RT H2Cam is a constant, and the position of the head relative to the first camera does not change while images are collected at the initial time and the target time. Therefore, the change between the camera pose at the initial time and the camera pose at the target time is the change in the head pose of the user.
Assume that the camera pose at the initial time t1 is RT1 and the camera pose at the target time t2 is RT2, where the camera pose represents the position and orientation of the first camera. Then, from the initial time t1 to the target time t2, the head pose variation is Δ(t1→t2) = RT2*RT H2Cam*(RT1*RT H2Cam)^-1 = RT2*RT1^-1.
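The cancellation of RT H2Cam in this formula can be checked numerically. The sketch below uses invented 4×4 poses; it is an illustration of the algebra, not the patent's code.

```python
import numpy as np

def pose_change(rt1, rt2):
    """Head pose variation between the initial and target moments:
    delta = RT2 * RT1^-1, since the rigid head-to-camera transform
    cancels out of RT2*RT_H2Cam*(RT1*RT_H2Cam)^-1."""
    return rt2 @ np.linalg.inv(rt1)

def rot_z(theta):
    """Homogeneous rotation about z (helper for the invented example)."""
    m = np.eye(4)
    m[:2, :2] = [[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta),  np.cos(theta)]]
    return m

# Invented poses: between t1 and t2 the head-mounted camera yaws 15 deg.
rt1, rt2 = rot_z(0.0), rot_z(np.radians(15.0))
rt_h2cam = rot_z(0.3)              # arbitrary rigid head-to-camera offset

delta = pose_change(rt1, rt2)
delta_full = (rt2 @ rt_h2cam) @ np.linalg.inv(rt1 @ rt_h2cam)
same = np.allclose(delta, delta_full)   # RT_H2Cam indeed cancels
```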
In yet another example, the target shooting object includes a calibration plate. The first camera shoots the calibration plate at the initial moment to obtain a first image, and shoots the calibration plate at the target moment to obtain a second image; the head pose change amount is then determined according to the image corners of the first image and the image corners of the second image.
S140, generating a target head posture of the user at a target moment based on the initial head posture and the head posture change amount.
It should be noted that the inertial measurement unit and the first camera are worn on the head of the user and are rigidly connected together. In addition, because the acquired head poses may be used for subsequent training of a neural network model, the inertial measurement unit may be fixed at the back of the user's head, so that the data acquired by the second camera contains no extraneous reference points that would affect the training effect of the subsequent neural network model.
The collected initial head pose of the user is located in a first camera coordinate system, and the target head pose of the user at the target moment in the first camera coordinate system is determined through the product of the initial head pose and the head pose change quantity.
A first face image of the user acquired by the second camera at the initial moment is obtained, and the initial head pose is converted into the coordinate system of the second camera to obtain the reference head pose corresponding to the first face image. A second face image of the user acquired by the second camera at the target moment is obtained, and the target head pose is converted into the coordinate system of the second camera to obtain the reference head pose corresponding to the second face image. A mapping relationship between face images and head poses is then generated based on the first face image, the reference head pose corresponding to the first face image, the second face image, and the reference head pose corresponding to the second face image.
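A minimal sketch of building the image-to-pose mapping. The container, field names, and toy data are hypothetical; only the conversion step (left-multiplying the camera-to-camera transform) follows the text above.

```python
import numpy as np

# Hypothetical sample store pairing each face image with its reference head pose.
samples = []

def add_sample(face_image, head_pose_cam1, cam1_to_cam2):
    """Convert a head pose from the first-camera frame into the second-camera
    frame and record it together with the face image from the second camera."""
    reference_pose = cam1_to_cam2 @ head_pose_cam1
    samples.append({"image": face_image, "pose": reference_pose})

# Toy data: 2x2 arrays stand in for real face images; identity transforms
# stand in for the real calibrated conversion matrix.
cam1_to_cam2 = np.eye(4)
add_sample(np.zeros((2, 2)), np.eye(4), cam1_to_cam2)   # initial moment
add_sample(np.ones((2, 2)), np.eye(4), cam1_to_cam2)    # target moment
```

Each entry in `samples` is one (face image, reference head pose) pair of the mapping relationship.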
Alternatively, the second camera may be an infrared camera, an RGB camera, a laser camera, or the like, which is not particularly limited herein.
Preferably, the second camera is an infrared camera. Because an infrared camera observes the shooting object by capturing infrared radiation, it can provide a clear image even in complete darkness and therefore offers stronger observation capability in dark environments. The infrared camera also has advantages such as high concealment and a temperature detection function, so the face images it captures are sharper, and the accuracy of the acquired head pose data is higher. In addition, the recognition effect of a neural network model trained with the acquired head poses as training data is more accurate.
Optionally, head pose prediction is one of the important directions in the computer vision field. Head pose prediction estimates the three-dimensional pose angles of the head, namely the pitch angle, yaw angle, and roll angle, from two-dimensional images, and determines the real intention of the user through these pose angles, so as to realize personalized functions in the cabin. Therefore, the acquired face images and the reference head poses corresponding to the face images can be used as a training sample set to train a head pose prediction model.
Specifically, the step of acquiring training sample data includes: obtaining the first face image of the user acquired by the second camera at the initial moment, and converting the initial head pose into the coordinate system of the second camera to obtain the first face image and the reference head pose corresponding to the first face image; and obtaining the second face image of the user acquired at the target moment, and converting the target head pose into the coordinate system of the second camera to obtain the second face image and the reference head pose corresponding to the second face image. It should be understood that the above example is only one manner of acquiring sample data, and the manner of acquiring a full sample data set based on it is not described herein.
After the training sample data is obtained in the manner disclosed above, model training is performed with the sample data. Specifically, sample image features of a sample image are extracted by the feature extraction network of the head pose prediction model, where the sample image is annotated with the sample head pose corresponding to the face image; a predicted head pose corresponding to the sample image is obtained by the prediction network of the head pose prediction model; a loss value is computed based on the sample head pose to which the sample image belongs and the predicted head pose; and the network parameters of the head pose prediction model are adjusted based on the loss value.
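The patent does not fix a concrete network architecture. The loop below only mimics the predict / loss / parameter-update cycle described above with a toy linear "model" on synthetic data; the shapes, learning rate, and data are all illustrative assumptions, not the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: 64 flattened "face images" of 16 features each, with
# synthetic (pitch, yaw, roll) labels generated by a hidden linear map.
images = rng.normal(size=(64, 16))
true_W = 0.1 * rng.normal(size=(16, 3))
poses = images @ true_W                  # sample head poses (labels)

W = np.zeros((16, 3))                    # "network parameters" of the toy model
lr = 0.1
for _ in range(300):
    pred = images @ W                            # prediction step
    loss = np.mean((pred - poses) ** 2)          # loss: sample pose vs predicted pose
    grad = 2 * images.T @ (pred - poses) / len(images)
    W -= lr * grad                               # adjust parameters from the loss
```

In a real system the linear map would be replaced by the feature extraction and prediction networks, and the update by a standard deep-learning optimizer.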
Further, the trained head pose prediction model may be used to predict the head pose of a target object. For example, in fatigue driving detection in a vehicle, the head pose of the driver is identified through the head pose prediction model, and the head movement and pose changes of the driver are analyzed through the head pose, so that when fatigue driving is determined, the driver is reminded to rest or corresponding measures, such as emergency braking or automatic driving, are taken.
In the above manner, the initial head pose is converted into the coordinate system of the second camera to obtain the reference head pose corresponding to the first face image, the target head pose is converted into the coordinate system of the second camera to obtain the reference head pose corresponding to the second face image, and the mapping relationship between face images and head poses is then generated. Model training is performed with this mapping relationship, so the trained model achieves a better recognition effect.
In one example, the initial head pose is converted into the coordinate system of the second camera to obtain the reference head pose corresponding to the first face image. Specifically, a first conversion matrix of the target shooting object and the first camera, a second conversion matrix of the target shooting object and the third camera, a third conversion matrix of the calibration plate and the third camera, and a fourth conversion matrix of the calibration plate and the second camera are obtained; the first, second, third, and fourth conversion matrices are multiplied to determine a first target conversion matrix of the first camera and the second camera. Based on the first target conversion matrix, the initial head pose is converted to the coordinate system of the second camera to obtain the reference head pose corresponding to the first face image.
Fig. 4 is a schematic view of a scene of a method for acquiring a head gesture according to an embodiment of the present application.
Illustratively, fig. 4 (a) is a schematic diagram of first target conversion matrix acquisition, and fig. 4 (b) is a schematic diagram of reference head pose acquisition.
For example, as shown in the first target conversion matrix acquisition schematic of fig. 4 (a), in the case that the target shooting object is located in front of the user, the first conversion matrix RT_Obj2Cam1 of the target shooting object and the first camera, the second conversion matrix RT_Obj2Cam3 of the target shooting object and the third camera, the third conversion matrix RT_Board2Cam3 of the calibration plate and the third camera, and the fourth conversion matrix RT_Board2Cam2 of the calibration plate and the second camera are acquired. The first conversion matrix, the second conversion matrix, the third conversion matrix, and the fourth conversion matrix are multiplied to determine the first target conversion matrix of the first camera and the second camera, where the specific expression is:

RT_Cam12Cam2 = RT_Board2Cam2 · (RT_Board2Cam3)^(-1) · RT_Obj2Cam3 · (RT_Obj2Cam1)^(-1)

where RT_Cam12Cam2 represents the first target conversion matrix. Based on the first target conversion matrix, the initial head pose is converted to the coordinate system of the second camera to obtain the reference head pose corresponding to the first face image.
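The chaining of the four conversion matrices can be sketched numerically. The helper `rt`, the example extrinsic values, and the exact placement of the inverses are illustrative assumptions; the patent text only states that the four conversion matrices are multiplied.

```python
import numpy as np

def rt(angle_deg, t):
    """4x4 rigid transform with a rotation about z (illustrative helper)."""
    a = np.radians(angle_deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]]
    T[:3, 3] = t
    return T

# Arbitrary example extrinsics for the four calibrated relations.
RT_obj_cam1 = rt(30, [0.0, 0.0, 1.0])     # target object -> first camera
RT_obj_cam3 = rt(-15, [0.2, 0.0, 1.0])    # target object -> third camera
RT_board_cam3 = rt(5, [0.0, 0.1, 1.0])    # calibration plate -> third camera
RT_board_cam2 = rt(40, [0.0, 0.0, 2.0])   # calibration plate -> second camera

# Chain: first camera -> target object -> third camera -> plate -> second camera.
RT_cam1_cam2 = (RT_board_cam2 @ np.linalg.inv(RT_board_cam3)
                @ RT_obj_cam3 @ np.linalg.inv(RT_obj_cam1))

# Converting an initial head pose (in the first-camera frame) to the second camera.
initial_head_pose = rt(0, [0.0, 0.1, 0.6])
reference_head_pose = RT_cam1_cam2 @ initial_head_pose
```

The third camera and calibration plate serve only as intermediate frames; they drop out of the final first-to-second-camera relation.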
As shown in the reference head pose acquisition schematic of fig. 4 (b), in the case that the target shooting object includes a two-dimensional code array, the conversion matrix of the first camera and the two-dimensional code array and the conversion matrix of the second camera and the two-dimensional code array may further be obtained, and the initial head pose is converted into the coordinate system of the second camera according to these two conversion matrices, so as to obtain the reference head pose corresponding to the first face image.
According to the mode, the first target conversion matrix is determined through the first conversion matrix, the second conversion matrix, the third conversion matrix and the fourth conversion matrix, and then the initial head gesture is converted to the coordinate system of the second camera based on the first target conversion matrix, so that the reference head gesture corresponding to the first face image is obtained.
In another example, the initial head pose is converted into a coordinate system of the second camera, and a reference head pose corresponding to the first face image is obtained. Specifically, a first conversion matrix of a target shooting object and a first camera and a fifth conversion matrix of the target shooting object and a second camera are obtained, a second target conversion matrix of the first camera and the second camera is determined based on the first conversion matrix and the fifth conversion matrix, and then the initial head gesture is converted to a coordinate system of the second camera based on the second target conversion matrix, so that a reference head gesture corresponding to a first face image is obtained.
Fig. 5 is a schematic view of a scene of another method for acquiring a head pose according to an embodiment of the present application.
For example, as shown in FIG. 5, in the case that the target shooting object is located behind the user, the first conversion matrix RT_Obj2Cam1 of the target shooting object and the first camera and the fifth conversion matrix RT_Obj2Cam2 of the target shooting object and the second camera are acquired. The first conversion matrix and the fifth conversion matrix are multiplied to determine the second target conversion matrix of the first camera and the second camera, where the specific expression is:

RT_Cam12Cam2 = RT_Obj2Cam2 · (RT_Obj2Cam1)^(-1)

where RT_Cam12Cam2 represents the second target conversion matrix. Based on the second target conversion matrix, the initial head pose is converted to the coordinate system of the second camera to obtain the reference head pose corresponding to the first face image.
According to the mode, the second target conversion matrix is determined based on the first conversion matrix and the fifth conversion matrix, and the initial head gesture is converted into the coordinate system of the second camera through the second target conversion matrix, so that the reference head gesture corresponding to the first face image is obtained.
According to the technical scheme of the present application, the head pose change amount is determined from the first image of the target shooting object acquired at the initial moment and the second image of the target shooting object acquired at the target moment, and the target head pose of the user at the target moment is generated based on this change amount and the initial head pose of the user acquired by the inertial measurement unit at the initial moment. Because the inertial measurement unit is only used to acquire the initial head pose at the initial moment, while the head pose change amount is determined by calculation, the influence of the instability of the inertial measurement unit on the acquired head poses is avoided as much as possible, so the head pose of the user can be acquired accurately.
In addition, the inertial measurement unit needs a long settling time to acquire a head pose. Compared with acquiring head poses with the inertial measurement unit at different moments and determining the head pose change amount from them, the scheme of the present application determines the head pose change amount by calculation, which increases the acquisition speed, so the head pose of the user can be acquired both accurately and rapidly.
Fig. 6 is a schematic flow chart of another method for acquiring a head pose according to an embodiment of the present application.
By way of example, the method shown in FIG. 6 may be performed by a controller or chip in a vehicle; wherein the controller may be controller 820 as described in fig. 8.
Illustratively, as shown in FIG. 6, the method 600 includes the following processes:
s601, acquiring the initial head posture of the user acquired by the inertial measurement unit at the initial moment.
The inertial measurement unit and the first camera are worn on the head of the user, and the inertial measurement unit and the first camera are fixedly connected together. The inertial measurement unit and the first camera are in the same coordinate system by adjusting the internal parameters of the inertial measurement unit and the internal parameters of the first camera. Thus, the initial head pose of the user acquired at the initial moment is in the first camera coordinate system.
S602, acquiring a first image and a second image.
The first camera may capture a first image of the target subject at an initial time and a second image of the target subject at the target time.
S603, obtaining an external reference matrix of each two-dimensional code in the first image, an external reference matrix of each two-dimensional code in the second image and an external reference matrix of the target two-dimensional code.
The target shooting object comprises a two-dimensional code array, and the external reference matrix of each two-dimensional code in the first image, the external reference matrix of each two-dimensional code in the second image and the external reference matrix of the target two-dimensional code are obtained through a two-dimensional code detection algorithm.
S604, converting the first image to a target coordinate system based on the external parameter matrix of each two-dimensional code in the first image and the external parameter matrix of the target two-dimensional code to obtain a first two-dimensional code image.
Optionally, the target two-dimensional code may be any two-dimensional code in the two-dimensional code array.
For example, after the external reference matrix of each two-dimensional code in the first image and the external reference matrix of the target two-dimensional code are obtained, the first image is converted to the target coordinate system based on these matrices to obtain the first two-dimensional code image.
S605, converting the second image to a target coordinate system based on the external reference matrix of each two-dimensional code in the second image and the external reference matrix of the target two-dimensional code to obtain a second two-dimensional code image.
Similarly, after the external reference matrix of each two-dimensional code in the second image and the external reference matrix of the target two-dimensional code are obtained, the second image is converted to the target coordinate system based on these matrices to obtain the second two-dimensional code image.
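The core of steps S603 to S605 can be sketched as re-expressing every code's extrinsic matrix in the target code's coordinate system. The per-code extrinsics below are fabricated 4x4 transforms standing in for real detector output, and the helper `rt` is illustrative.

```python
import numpy as np

def rt(angle_deg, t):
    """4x4 rigid transform with a rotation about z (toy extrinsic matrix)."""
    a = np.radians(angle_deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]]
    T[:3, 3] = t
    return T

# Hypothetical external reference matrices: each maps one code's coordinate
# system to the camera coordinate system.
extrinsics = {0: rt(0, [0.0, 0.0, 1.0]),
              1: rt(0, [0.1, 0.0, 1.0]),
              2: rt(0, [0.2, 0.0, 1.0])}
target_id = 0                      # any code in the array may be the target
T_target = extrinsics[target_id]

# Express every detected code in the target code's coordinate system.
codes_in_target = {i: np.linalg.inv(T_target) @ T for i, T in extrinsics.items()}
```

The target code itself maps to the identity, and all other codes become poses relative to it, which is what allows the first and second images to be compared in one common frame.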
S606, generating the head posture change amount based on the image corner of the first two-dimensional code image and the image corner of the second two-dimensional code image.
The camera pose at the initial moment is determined based on the image corner of the first two-dimensional code image, the camera pose at the target moment is determined through the image corner of the second two-dimensional code image, and then the head pose change amount is generated according to the camera pose at the initial moment and the camera pose at the target moment.
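The patent does not specify the corner-to-pose algorithm. As one hedged illustration (an assumption, not the patent's method), the relative rotation between the two captures can be estimated from matched corner positions with the Kabsch algorithm:

```python
import numpy as np

def kabsch_rotation(P, Q):
    """Best-fit rotation R with R @ p_i ~ q_i (Kabsch algorithm)."""
    Pc, Qc = P - P.mean(0), Q - Q.mean(0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against a reflection
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

# Corner points of the code in 3D (toy values) and the same points after the
# camera, i.e. the head, rotates 20 degrees about the z axis.
a = np.radians(20)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
corners_t1 = np.array([[0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1.0]])
corners_t2 = corners_t1 @ R_true.T

R_est = kabsch_rotation(corners_t1, corners_t2)
```

In practice a full camera-pose estimate from 2D corners would use a PnP solver against the code's known geometry; the sketch above only shows how a relative rotation can be recovered from matched points.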
Optionally, the method for acquiring the camera pose may refer to the method in the foregoing disclosed embodiment, which is not described herein.
S607, based on the initial head pose and the head pose variation amount, a target head pose of the user at the target time is generated.
For example, the initial head pose in the coordinate system of the first camera at the initial moment and the head pose change amount are determined, and the target head pose of the user at the target moment in the coordinate system of the first camera is determined through the product of the initial head pose and the head pose change amount.
S608, a first conversion matrix of the target shooting object and the first camera, a second conversion matrix of the target shooting object and the third camera, a third conversion matrix of the calibration plate and the third camera, and a fourth conversion matrix of the calibration plate and the second camera are obtained.
For example, in the case where the target photographing object is located in front of the user, a first conversion matrix of the target photographing object and the first camera, a second conversion matrix of the target photographing object and the third camera, a third conversion matrix of the calibration plate and the third camera, and a fourth conversion matrix of the calibration plate and the second camera are obtained.
S609, determining a first target conversion matrix of the first camera and the second camera based on the first conversion matrix, the second conversion matrix, the third conversion matrix, and the fourth conversion matrix.
For example, after determining the first, second, third and fourth conversion matrices, a first target conversion matrix of the first and second cameras is determined by multiplying the first, second, third and fourth conversion matrices.
S610, a first face image and a second face image are acquired.
The first face image of the user acquired by the second camera at the initial moment and the second face image of the user acquired by the second camera at the target moment are acquired.
S611, based on the first target transformation matrix, transforming the initial head gesture to a coordinate system of the second camera to obtain a reference head gesture corresponding to the first face image, and transforming the target head gesture to the coordinate system of the second camera to obtain the reference head gesture corresponding to the second face image.
For example, after the target head pose of the user at the target moment in the coordinate system of the first camera, the initial head pose of the user at the initial moment in the coordinate system of the first camera, the first target conversion matrix of the first camera and the second camera, and the first and second face images are determined, the initial head pose is converted to the coordinate system of the second camera based on the first target conversion matrix to obtain the reference head pose corresponding to the first face image, and the target head pose is converted to the coordinate system of the second camera to obtain the reference head pose corresponding to the second face image.
S612, generating a mapping relation between the face image and the head gesture based on the first face image, the reference head gesture corresponding to the first face image, the second face image and the reference head gesture corresponding to the second face image.
By way of example, model training can be performed by using the acquired face image and the reference head gesture corresponding to the face image, gesture recognition can be performed by using the trained model, accuracy of personalized function recognition can be improved, and driving safety can be further improved.
Fig. 7 is a schematic flow chart of another method for acquiring a head pose according to an embodiment of the present application.
By way of example, the method shown in FIG. 7 may be performed by a controller or chip in a vehicle; wherein the controller may be controller 820 as described in fig. 8.
Illustratively, as shown in FIG. 7, the method 700 includes the following processes:
s710, acquiring the initial head gesture of the user acquired by the inertial measurement unit at the initial moment.
The inertial measurement unit and the first camera are worn on the head of the user, and the inertial measurement unit and the first camera are fixedly connected together. The inertial measurement unit and the first camera are in the same coordinate system by adjusting the internal parameters of the inertial measurement unit and the internal parameters of the first camera. Thus, the initial head pose of the user acquired at the initial moment is in the first camera coordinate system.
S720, acquiring a first image and a second image.
The first camera may capture a first image of the target subject at an initial time and a second image of the target subject at the target time.
Alternatively, the target photographing object may include a two-dimensional code array, a calibration plate, a two-dimensional code image, a calibration plate array, and the like, which are not particularly limited herein.
S730, generating a head pose variation based on the image corner of the first image and the image corner of the second image.
If the target shooting object includes a calibration plate, the head posture change amount is generated by an image corner of a first image of the calibration plate acquired by the first camera at the initial moment and an image corner of a second image of the calibration plate acquired by the first camera at the target moment.
Specifically, the camera pose at the initial moment corresponding to the first image is determined through the image corner of the first image, and the camera pose at the target moment corresponding to the second image is determined through the image corner of the second image. And generating a head posture change amount according to the camera posture at the initial moment and the camera posture at the target moment. The specific manner may refer to the foregoing disclosed embodiments, and is not described herein.
S740, generating a target head pose of the user at the target time based on the initial head pose and the head pose variation.
For example, the initial head pose in the coordinate system of the first camera at the initial moment and the head pose change amount are determined, and the target head pose of the user at the target moment in the coordinate system of the first camera is determined through the product of the initial head pose and the head pose change amount.
S750, a first conversion matrix of the target shooting object and the first camera and a fifth conversion matrix of the target shooting object and the second camera are obtained.
For example, in a case where the target photographic subject is located behind the user, a first conversion matrix of the target photographic subject and the first camera, and a fifth conversion matrix of the target photographic subject and the second camera are acquired.
S760, determining a second target conversion matrix of the first camera and the second camera based on the first conversion matrix and the fifth conversion matrix.
For example, after determining the first conversion matrix of the target photographing object and the first camera and the fifth conversion matrix of the target photographing object and the second camera, the second target conversion matrix of the first camera and the second camera is determined by multiplying the first conversion matrix and the fifth conversion matrix.
S770, a first face image and a second face image are acquired.
The first face image of the user acquired by the second camera at the initial moment and the second face image of the user acquired by the second camera at the target moment are acquired. It will be appreciated that each face image corresponds to a head pose of the user.
S780, converting the initial head gesture to a coordinate system of a second camera based on the second target conversion matrix to obtain a reference head gesture corresponding to the first face image, and converting the target head gesture to the coordinate system of the second camera to obtain the reference head gesture corresponding to the second face image.
For example, after the target head pose of the user at the target moment in the coordinate system of the first camera, the initial head pose of the user at the initial moment in the coordinate system of the first camera, the second target conversion matrix of the first camera and the second camera, and the first and second face images are determined, the initial head pose is converted to the coordinate system of the second camera based on the second target conversion matrix to obtain the reference head pose corresponding to the first face image, and the target head pose is converted to the coordinate system of the second camera to obtain the reference head pose corresponding to the second face image.
S790, generating a mapping relation between the face image and the head gesture based on the first face image, the reference head gesture corresponding to the first face image, the second face image and the reference head gesture corresponding to the second face image.
It can be understood that, in the above manner, more face images and the reference head poses corresponding to them can be acquired; model training can then be performed with the collected face images and their corresponding reference head poses, and pose recognition can be performed with the trained model, which improves the accuracy of personalized function recognition and further improves driving safety.
Fig. 8 is a schematic frame diagram of a head posture acquisition system according to an embodiment of the present application.
The head pose acquisition system 800, illustratively, includes an inertial measurement unit 810, a controller 820, and a first camera 830. The inertial measurement unit 810 and the first camera 830 are worn on the head of the user. The inertial measurement unit 810 is configured to acquire an initial head pose of a user at an initial moment; the first camera 830 is configured to acquire a first image of a target photographic subject at an initial time and acquire a second image of the target photographic subject at the target time; the controller 820 is configured to generate a head pose change amount based on the image corner of the first image and the image corner of the second image, and generate a target head pose of the user at a target moment based on the initial head pose and the head pose change amount.
Fig. 9 is a schematic view of a scene of a head gesture acquisition system according to an embodiment of the present application.
For example, the target shooting object is located in front of the user, the target shooting object includes a two-dimensional code array, and the head gesture acquisition system may further include a second camera and a third camera. The second camera is used for collecting a first face image of the user at the initial moment and collecting a second face image of the user at the target moment. The third camera is used for converting the coordinate system of each two-dimensional code in the two-dimensional code array into a target coordinate system. In addition, the third camera is also used for converting the coordinate system of the first camera into the coordinate system of the second camera. The conversion manner of the coordinate system may refer to the manner in the foregoing disclosed embodiments, and will not be described herein.
Further, the controller converts the initial head pose into the coordinate system of the second camera to obtain the reference head pose corresponding to the first face image, and converts the target head pose into the coordinate system of the second camera to obtain the reference head pose corresponding to the second face image. The mapping relationship between face images and head poses is generated based on the first face image, the reference head pose corresponding to the first face image, the second face image, and the reference head pose corresponding to the second face image.
In another example, in a case where the target photographic subject is located behind the user, the head pose acquisition system may further include a second camera for acquiring a first face image of the user at an initial time and acquiring a second face image of the user at a target time. And converting the coordinate system of the first camera into the coordinate system of the second camera through the target shooting object. The conversion manner of the coordinate system may refer to the manner in the foregoing disclosed embodiments, and will not be described herein.
It should be understood that the above description is intended to aid those skilled in the art in understanding the embodiments of the present application, and is not intended to limit the embodiments of the present application to the specific values or particular scenarios illustrated. It will be apparent to those skilled in the art from the foregoing description that various equivalent modifications or variations can be made, and such modifications or variations are intended to be within the scope of the embodiments of the present application.
The method for acquiring the head posture provided by the embodiment of the application is described in detail above with reference to fig. 1 to 9; an embodiment of the acquisition device of the present application will be described in detail with reference to fig. 10 and 11. It should be understood that the apparatus in the embodiments of the present application may perform the methods of the foregoing embodiments of the present application, that is, specific working procedures of the following various products may refer to corresponding procedures in the foregoing method embodiments.
Fig. 10 is a schematic structural diagram of a head posture acquisition device according to an embodiment of the present application.
Illustratively, as shown in FIG. 10, the acquisition device 900 includes:
Acquisition module 910: configured to obtain the initial head pose of the user acquired by the inertial measurement unit at the initial moment and the first image of the target shooting object acquired by the first camera at the initial moment, where the inertial measurement unit and the first camera are worn on the head of the user;
Acquisition module 920: configured to acquire the second image of the target shooting object acquired by the first camera at the target moment;
Generating module 930: configured to generate the head pose change amount based on the image corners of the first image and the image corners of the second image;
Determination module 940: configured to generate the target head pose of the user at the target moment based on the initial head pose and the head pose change amount.
Optionally, as an embodiment, the target shooting object includes a two-dimensional code array, and the generating module 930 is specifically configured to:
converting the first image and the second image into a target coordinate system of a target two-dimensional code in the two-dimensional code array to obtain a first two-dimensional code image and a second two-dimensional code image; and generating the head posture variation based on the image corner of the first two-dimensional code image and the image corner of the second two-dimensional code image.
Optionally, as an embodiment, the generating module 930 is specifically configured to:
Acquiring an extrinsic matrix of each two-dimensional code in the first image, an extrinsic matrix of each two-dimensional code in the second image, and an extrinsic matrix of the target two-dimensional code; converting the first image to the target coordinate system based on the extrinsic matrix of each two-dimensional code in the first image and the extrinsic matrix of the target two-dimensional code to obtain the first two-dimensional code image; and converting the second image to the target coordinate system based on the extrinsic matrix of each two-dimensional code in the second image and the extrinsic matrix of the target two-dimensional code to obtain the second two-dimensional code image.
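As an illustrative aid only (not the patent's implementation), the conversion into the target two-dimensional code's coordinate system can be sketched with homogeneous rigid transforms. Here `T_cam_code` is assumed to be a 4x4 extrinsic matrix mapping a code's coordinates into the camera frame; the helper names are ours:

```python
# Sketch, under assumed conventions: given the extrinsic of any detected
# two-dimensional code and the extrinsic of the target code (both
# camera-frame), express the code's pose in the target code's frame.

def mat_inv(T):
    # Invert a rigid-body 4x4 transform: inv([R|t]) = [R^T | -R^T t].
    R = [[T[j][i] for j in range(3)] for i in range(3)]  # R transposed
    t = [-sum(R[i][j] * T[j][3] for j in range(3)) for i in range(3)]
    return [R[0] + [t[0]], R[1] + [t[1]], R[2] + [t[2]], [0, 0, 0, 1]]

def mat_mul(A, B):
    # 4x4 matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def to_target_frame(T_cam_code, T_cam_target):
    # Pose of a code relative to the target code:
    # T_target_code = inv(T_cam_target) @ T_cam_code
    return mat_mul(mat_inv(T_cam_target), T_cam_code)
```

Converting the target code's own extrinsic yields the identity, which is a convenient sanity check for the matrix convention assumed here.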
Optionally, as an embodiment, the collecting device 900 is further configured to:
Acquiring a first face image of the user collected by a second camera at the initial moment; converting the initial head posture into a coordinate system of the second camera to obtain a reference head posture corresponding to the first face image; acquiring a second face image of the user collected by the second camera at the target moment; converting the target head posture into the coordinate system of the second camera to obtain a reference head posture corresponding to the second face image; and generating a mapping relationship between face images and head postures based on the first face image, the reference head posture corresponding to the first face image, the second face image, and the reference head posture corresponding to the second face image.
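For illustration only, the resulting mapping between face images and reference head postures can be stored as labelled pairs (for example, as ground-truth data for a driver-monitoring model); every identifier and value below is hypothetical:

```python
# Hypothetical sketch: pair each face image with the reference head
# posture computed in the second camera's coordinate system.

def build_mapping(samples):
    # samples: iterable of (face_image_id, reference_head_posture) pairs.
    return {image_id: posture for image_id, posture in samples}

mapping = build_mapping([
    ("face_t0.png", (0.0, 5.0, 0.0)),   # (pitch, yaw, roll) at the initial moment
    ("face_t1.png", (2.0, 20.0, 0.0)),  # (pitch, yaw, roll) at the target moment
])
```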
Optionally, as an embodiment, the collecting device 900 is specifically further configured to:
Acquiring a first conversion matrix of the target shooting object and the first camera, a second conversion matrix of the target shooting object and a third camera, a third conversion matrix of a calibration plate and the third camera, and a fourth conversion matrix of the calibration plate and the second camera; determining a first target conversion matrix of the first camera and the second camera based on the first conversion matrix, the second conversion matrix, the third conversion matrix, and the fourth conversion matrix; and converting the initial head posture into the coordinate system of the second camera based on the first target conversion matrix to obtain the reference head posture corresponding to the first face image.
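One plausible reading of the four-matrix chain above — under the assumed convention that a conversion matrix maps the first-named frame into the second — is that the first target conversion matrix is the composition T4 · inv(T3) · T2 · inv(T1), linking the first camera to the second camera through the target shooting object and the calibration plate. The sketch below illustrates only this chaining; the notation is an assumption, not the patent's:

```python
# Illustrative chaining of four rigid 4x4 transforms; names assumed.
from functools import reduce

def mul(A, B):
    # 4x4 matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def inv(T):
    # Rigid-transform inverse: inv([R|t]) = [R^T | -R^T t].
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(Rt[i][j] * T[j][3] for j in range(3)) for i in range(3)]
    return [Rt[0] + [t[0]], Rt[1] + [t[1]], Rt[2] + [t[2]], [0, 0, 0, 1]]

def first_target_conversion(T1, T2, T3, T4):
    # Assumed chain: first camera -> target object -> third camera ->
    # calibration plate -> second camera.
    return reduce(mul, [T4, inv(T3), T2, inv(T1)])
```

Feeding the same transform into all four slots collapses the chain to the identity, a quick self-consistency check.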
Optionally, as an embodiment, the collecting device 900 is specifically further configured to:
Acquiring a first conversion matrix of the target shooting object and the first camera and a fifth conversion matrix of the target shooting object and the second camera; determining a second target conversion matrix of the first camera and the second camera based on the first conversion matrix and the fifth conversion matrix; and converting the initial head posture into the coordinate system of the second camera based on the second target conversion matrix to obtain the reference head posture corresponding to the first face image.
Optionally, as an embodiment, the second camera is an infrared camera.
It should be noted that the above-mentioned acquisition device 900 is embodied in the form of a functional unit. The term "module" herein may be implemented in software and/or hardware, and is not specifically limited thereto.
For example, a "module" may be a software program, a hardware circuit, or a combination of both that implements the functionality described above. The hardware circuitry may include application-specific integrated circuits (ASICs), electronic circuits, processors (e.g., shared, dedicated, or group processors) and memory for executing one or more software or firmware programs, combinational logic circuits, and/or other suitable components that support the described functions.
Thus, the elements of the examples described in the embodiments of the present application can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Fig. 11 is a schematic structural view of a vehicle according to an embodiment of the present application.
Illustratively, as shown in FIG. 11, the vehicle 10 includes a memory 11 and a processor 12, wherein executable program code 13 is stored in the memory 11, and the processor 12 is configured to call and execute the executable program code 13 to perform the head posture acquisition method.
The memory 11 may be used to store a related program of the head posture acquisition method provided in the embodiments of the present application, and the processor 12 may call the related program stored in the memory 11 to execute the head posture acquisition method according to the embodiments of the present application: for example, acquiring an initial head posture of a user collected by an inertial measurement unit at an initial moment and a first image of a target shooting object collected by a first camera at the initial moment, wherein the inertial measurement unit and the first camera are worn on the head of the user; acquiring a second image of the target shooting object collected by the first camera at a target moment; generating a head posture variation based on the image corners of the first image and the image corners of the second image; and generating a target head posture of the user at the target moment based on the initial head posture and the head posture variation.
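The update step the processor performs can be sketched as follows, using 3x3 rotation matrices for orientation only. This is a simplification under our own assumptions (pure rotation about one axis, poses expressed in a common frame); the names are illustrative, not the patent's:

```python
# Minimal sketch of the pose update: a head-posture change is derived
# from the marker pose at two moments and composed with the IMU's
# initial head posture. All values are illustrative.
import math

def rot_z(deg):
    # Rotation about the z-axis by `deg` degrees.
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def mat_mul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_inv3(R):
    # Inverse of a rotation matrix is its transpose.
    return [[R[j][i] for j in range(3)] for i in range(3)]

# Marker poses recovered from the first and second images (in the real
# method these come from matched image corners):
pose_t0 = rot_z(10.0)
pose_t1 = rot_z(25.0)

# Head-posture variation between the two moments.
delta = mat_mul3(pose_t1, mat_inv3(pose_t0))

# Compose with the IMU's initial head posture to obtain the target
# head posture at the target moment.
initial_head = rot_z(5.0)
target_head = mat_mul3(delta, initial_head)
```

With these illustrative numbers the variation is a 15-degree rotation, so the target head posture is a 20-degree rotation relative to the initial frame.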
In this embodiment, the functional modules of the apparatus may be divided according to the above method examples; for example, each function may correspond to one processing module, or two or more functions may be integrated into one processing module, and the integrated module may be implemented in hardware. It should be noted that the division of modules in this embodiment is schematic and is merely a logical function division; other division manners may be used in actual implementation.
In the case of dividing the functional modules by function, the apparatus may further include an acquisition module, a generating module, a determining module, and the like. It should be noted that, for all relevant content of the steps in the above method embodiments, reference may be made to the functional descriptions of the corresponding functional modules, which are not repeated herein.
It should be understood that the apparatus provided in this embodiment is used to perform the above-described method for acquiring a head posture, so that the same effects as those of the above-described implementation method can be achieved.
In the case of an integrated unit, the apparatus may include a processing module and a memory module. When the apparatus is applied to a vehicle, the processing module may be used to control and manage the actions of the vehicle, and the memory module may be used to support the vehicle in executing related program code, etc.
The processing module may be a processor or a controller that can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a digital signal processor (DSP) and a microprocessor. The memory module may be a memory.
In addition, the device provided by the embodiment of the application can be a chip, a component or a module, wherein the chip can comprise a processor and a memory which are connected; the memory is used for storing instructions, and when the processor calls and executes the instructions, the chip can be made to execute the head gesture acquisition method provided by the embodiment.
The present application also provides a computer-readable storage medium having stored therein computer program code which, when run on a computer, causes the computer to execute the above-mentioned related method steps to implement the head posture acquisition method provided by the above embodiments. The computer-readable storage medium may include any type of disk, including floppy disks, optical disks, digital versatile disks (DVD), compact disc read-only memory (CD-ROM), micro-drives, and magneto-optical disks; read-only memory (ROM); random access memory (RAM); erasable programmable read-only memory (EPROM); electrically erasable programmable read-only memory (EEPROM); dynamic random access memory (DRAM); video random access memory (VRAM); flash memory devices; magnetic or optical cards; nanosystems (including molecular memory ICs); or any type of medium or device suitable for storing instructions and/or data.
The present application also provides a computer program product, which when run on a computer, causes the computer to perform the above-mentioned related steps to implement a method for acquiring a head pose according to the above-mentioned embodiments.
The vehicle, the computer readable storage medium, the computer program product or the chip provided by the present application are used for executing the corresponding method provided above, and therefore, the advantages achieved by the present application may refer to the advantages in the corresponding method provided above, and will not be described herein.
It will be appreciated by those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional modules is illustrated, and in practical application, the above-described functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to perform all or part of the functions described above.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (10)

1. A method of acquiring a head posture, the method comprising:
acquiring an initial head posture of a user collected by an inertial measurement unit at an initial moment and a first image of a target shooting object collected by a first camera at the initial moment, wherein the inertial measurement unit and the first camera are worn on the head of the user;
acquiring a second image of the target shooting object collected by the first camera at a target moment;
generating a head posture variation based on image corners of the first image and image corners of the second image; and
generating a target head posture of the user at the target moment based on the initial head posture and the head posture variation.
2. The method according to claim 1, wherein the target shooting object comprises a two-dimensional code array, and the generating the head posture variation based on the image corners of the first image and the image corners of the second image comprises:
converting the first image and the second image into a target coordinate system of a target two-dimensional code in the two-dimensional code array to obtain a first two-dimensional code image and a second two-dimensional code image; and
generating the head posture variation based on the image corners of the first two-dimensional code image and the image corners of the second two-dimensional code image.
3. The method according to claim 2, wherein the converting the first image and the second image into the target coordinate system of the target two-dimensional code in the two-dimensional code array to obtain the first two-dimensional code image and the second two-dimensional code image comprises:
acquiring an extrinsic matrix of each two-dimensional code in the first image, an extrinsic matrix of each two-dimensional code in the second image, and an extrinsic matrix of the target two-dimensional code;
converting the first image to the target coordinate system based on the extrinsic matrix of each two-dimensional code in the first image and the extrinsic matrix of the target two-dimensional code to obtain the first two-dimensional code image; and
converting the second image to the target coordinate system based on the extrinsic matrix of each two-dimensional code in the second image and the extrinsic matrix of the target two-dimensional code to obtain the second two-dimensional code image.
4. The method according to claim 1, further comprising:
acquiring a first face image of the user collected by a second camera at the initial moment;
converting the initial head posture into a coordinate system of the second camera to obtain a reference head posture corresponding to the first face image;
acquiring a second face image of the user collected by the second camera at the target moment;
converting the target head posture into the coordinate system of the second camera to obtain a reference head posture corresponding to the second face image; and
generating a mapping relationship between face images and head postures based on the first face image, the reference head posture corresponding to the first face image, the second face image, and the reference head posture corresponding to the second face image.
5. The method according to claim 4, wherein the converting the initial head posture into the coordinate system of the second camera to obtain the reference head posture corresponding to the first face image comprises:
acquiring a first conversion matrix of the target shooting object and the first camera, a second conversion matrix of the target shooting object and a third camera, a third conversion matrix of a calibration plate and the third camera, and a fourth conversion matrix of the calibration plate and the second camera;
determining a first target conversion matrix of the first camera and the second camera based on the first conversion matrix, the second conversion matrix, the third conversion matrix, and the fourth conversion matrix; and
converting the initial head posture into the coordinate system of the second camera based on the first target conversion matrix to obtain the reference head posture corresponding to the first face image.
6. The method according to claim 4, wherein the converting the initial head posture into the coordinate system of the second camera to obtain the reference head posture corresponding to the first face image comprises:
acquiring a first conversion matrix of the target shooting object and the first camera and a fifth conversion matrix of the target shooting object and the second camera;
determining a second target conversion matrix of the first camera and the second camera based on the first conversion matrix and the fifth conversion matrix; and
converting the initial head posture into the coordinate system of the second camera based on the second target conversion matrix to obtain the reference head posture corresponding to the first face image.
7. The method of any one of claims 4 to 6, wherein the second camera is an infrared camera.
8. A head posture acquisition device, the acquisition device comprising:
an acquisition module, configured to acquire an initial head posture of a user collected by an inertial measurement unit at an initial moment and a first image of a target shooting object collected by a first camera at the initial moment, wherein the inertial measurement unit and the first camera are worn on the head of the user;
the acquisition module being further configured to acquire a second image of the target shooting object collected by the first camera at a target moment;
a generating module, configured to generate a head posture variation based on image corners of the first image and image corners of the second image; and
a determining module, configured to generate a target head posture of the user at the target moment based on the initial head posture and the head posture variation.
9. A head posture acquisition system, characterized by comprising an inertial measurement unit, a first camera, and a controller;
the inertial measurement unit is used for acquiring the initial head posture of the user at the initial moment;
The first camera is used for acquiring a first image of a target shooting object at the initial moment and acquiring a second image of the target shooting object at the target moment;
The controller is used for generating a head posture change amount based on the image corner of the first image and the image corner of the second image, and generating a target head posture of the user at the target moment based on the initial head posture and the head posture change amount.
10. A vehicle, characterized in that the vehicle comprises:
A memory for storing executable program code;
a processor, configured to call and run the executable program code from the memory to cause the vehicle to perform the head posture acquisition method according to any one of claims 1 to 7.
CN202410185266.1A 2024-02-19 2024-02-19 Head posture acquisition method, acquisition device, acquisition system and vehicle Pending CN118238823A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410185266.1A CN118238823A (en) 2024-02-19 2024-02-19 Head posture acquisition method, acquisition device, acquisition system and vehicle


Publications (1)

Publication Number Publication Date
CN118238823A true CN118238823A (en) 2024-06-25

Family

ID=91556335




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination