CN108629283B - Face tracking method, device, equipment and storage medium


Info

Publication number: CN108629283B (granted patent; application publication CN108629283A)
Application number: CN201810283645.9A
Authority: CN (China)
Original language: Chinese (zh)
Inventor: 范晓 (Fan Xiao)
Assignee (original and current): Beijing Xiaomi Mobile Software Co., Ltd.
Legal status: Active

Classifications

    • G06V 40/165 Human faces: detection; localisation; normalisation using facial parts and geometric relationships
    • G06V 40/171 Human faces: local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships
    • G06V 40/172 Human faces: classification, e.g. identification
    • G06F 18/22 Pattern recognition: matching criteria, e.g. proximity measures
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 2207/10016 Video; image sequence
    • G06T 2207/20081 Training; learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30201 Subject of image: face


Abstract

The present disclosure provides a face tracking method, apparatus, device, and storage medium. The method comprises: in a current image frame, if two or more tracked faces meet a preset occlusion condition, acquiring target face pose information of a non-occluded tracked face; comparing the target face pose information with sample face pose information prestored in a tracker, and determining the tracker corresponding to the target face pose information; and assigning the non-occluded tracked face to the determined tracker for tracking. By applying the embodiments of the present disclosure, face tracking accuracy can be improved.

Description

Face tracking method, device, equipment and storage medium
Technical Field
The present application relates to the field of target tracking technologies, and in particular, to a method, an apparatus, a device, and a storage medium for face tracking.
Background
Face tracking is an important module in video processing, security, computer intelligence, and related industries, providing other technical modules with key information such as target localization and target prediction. During face tracking, two tracked faces may occlude each other, and such mutual occlusion can cause tracking errors that greatly reduce tracking accuracy. How to achieve correct tracking in this situation is therefore very important.
At present, an occlusion-resistant face tracking method can be implemented with a Kalman filter and the mean-shift algorithm, which alleviates the problem that a tracked face is easily disturbed by surrounding objects of similar chrominance. However, since different faces are themselves similar objects, tracking errors may still occur when the object occluding a face is another face.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides a face tracking method, apparatus, device, and storage medium.
According to a first aspect of the embodiments of the present disclosure, there is provided a face tracking method, the method including:
in a current image frame, if two or more tracked faces meet a preset occlusion condition, acquiring target face pose information of a non-occluded tracked face;
comparing the target face pose information with sample face pose information prestored in a tracker, and determining the tracker corresponding to the target face pose information;
and assigning the non-occluded tracked face to the determined tracker for tracking.
In an optional implementation manner, the preset occlusion condition includes:
at least two tracked faces overlap; or,
the distance between at least two tracked faces is smaller than or equal to a preset distance threshold.
In an optional implementation manner, the tracked faces meeting the preset occlusion condition include an occluded tracked face and a non-occluded tracked face, each tracked face being provided with a corresponding tracker, and the method further comprises:
if the number of occluded tracked faces is one, assigning the occluded tracked face to the unassigned tracker for tracking.
In an optional implementation manner, the comparing the target face pose information with sample face pose information pre-stored in a tracker to determine the tracker corresponding to the target face pose information includes:
determining the similarity between the target face pose information and sample face pose information prestored in each tracker;
and determining the tracker corresponding to the sample face pose information with the highest similarity to the target face pose information as the tracker corresponding to the target face pose information.
In an optional implementation manner, the face pose information includes an up-down flip angle, a left-right flip angle, and an in-plane rotation angle of the tracked face; the similarity is determined based on the Manhattan distance between the target face pose information and the sample face pose information prestored in each tracker.
In an optional implementation manner, the face pose information includes angle information of a tracked face and predefined face key point information for representing a face pose, where the angle information includes an up-down flip angle, a left-right flip angle, and an in-plane rotation angle;
the determining of the similarity between the target face pose information and the sample face pose information prestored in each tracker includes:
determining a Manhattan distance between a target angle vector and a sample angle vector, wherein the target angle vector is formed from the angle information in the target face pose information, and the sample angle vector is formed from the angle information in the sample face pose information in a tracker;
determining a cosine distance between a target key point vector and a sample key point vector, wherein the target key point vector is formed from the predefined face key point information in the target face pose information, and the sample key point vector is formed from the predefined face key point information in the sample face pose information in a tracker;
and performing a weighted summation of the Manhattan distance and the cosine distance to obtain a pose distance, wherein the pose distance indicates the similarity between the target face pose information and the sample face pose information, and a smaller pose distance indicates a greater similarity.
In an optional implementation manner, the predefined face key point information includes two or more of the following:
the ratio of the horizontal distance between the centers of the two eyes to the distance between the two eyes;
the ratio of the distance from the tip of the nose to the left eye to the distance from the tip of the nose to the right eye;
the ratio of the distance from the tip of the nose to the center of the two eyes to the distance from the tip of the nose to the center of the two corners of the mouth.
According to a second aspect of the embodiments of the present disclosure, there is provided a face tracking apparatus, the apparatus comprising:
an information acquisition module configured to, in a current image frame, if two or more tracked faces meet a preset occlusion condition, acquire target face pose information of a non-occluded tracked face;
a tracker determination module configured to compare the target face pose information with sample face pose information prestored in a tracker and determine a tracker corresponding to the target face pose information;
a face tracking module configured to assign the non-occluded tracked face to the determined tracker for tracking.
In an optional implementation manner, the preset occlusion condition includes:
at least two tracked faces overlap; or,
the distance between at least two tracked faces is smaller than or equal to a preset distance threshold.
In an optional implementation manner, the tracked faces meeting the preset occlusion condition include an occluded tracked face and a non-occluded tracked face, each tracked face being provided with a corresponding tracker, and the face tracking module is further configured to:
if the number of occluded tracked faces is one, assign the occluded tracked face to the unassigned tracker for tracking.
In an alternative implementation, the tracker determination module includes:
the similarity determination submodule is configured to determine similarity between the target face pose information and sample face pose information prestored in each tracker;
a tracker determination sub-module configured to determine a tracker corresponding to sample face pose information that is most similar to the target face pose information as the tracker corresponding to the target face pose information.
In an optional implementation manner, the face pose information includes an up-down flip angle, a left-right flip angle, and an in-plane rotation angle of the tracked face; the similarity is determined based on the Manhattan distance between the target face pose information and the sample face pose information prestored in each tracker.
In an optional implementation manner, the face pose information includes angle information of a tracked face and predefined face key point information for representing a face pose, where the angle information includes an up-down flip angle, a left-right flip angle, and an in-plane rotation angle;
the similarity determination submodule is specifically configured to:
determining a Manhattan distance between a target angle vector and a sample angle vector, wherein the target angle vector is formed from the angle information in the target face pose information, and the sample angle vector is formed from the angle information in the sample face pose information in a tracker;
determining a cosine distance between a target key point vector and a sample key point vector, wherein the target key point vector is formed from the predefined face key point information in the target face pose information, and the sample key point vector is formed from the predefined face key point information in the sample face pose information in a tracker;
and performing a weighted summation of the Manhattan distance and the cosine distance to obtain a pose distance, wherein the pose distance indicates the similarity between the target face pose information and the sample face pose information, and a smaller pose distance indicates a greater similarity.
In an optional implementation manner, the predefined face key point information includes two or more of the following:
the ratio of the horizontal distance between the centers of the two eyes to the distance between the two eyes;
the ratio of the distance from the tip of the nose to the left eye to the distance from the tip of the nose to the right eye;
the ratio of the distance from the tip of the nose to the center of the two eyes to the distance from the tip of the nose to the center of the two corners of the mouth.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
in a current image frame, if two or more tracked faces meet a preset occlusion condition, acquiring target face pose information of a non-occluded tracked face;
comparing the target face pose information with sample face pose information prestored in a tracker, and determining the tracker corresponding to the target face pose information;
and assigning the non-occluded tracked face to the determined tracker for tracking.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of any one of the methods described above.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
by applying the embodiments of the present disclosure, in a current image frame, if two or more tracked faces meet a preset occlusion condition, target face pose information of a non-occluded tracked face is obtained and compared with sample face pose information prestored in the trackers to determine the tracker corresponding to the target face pose information; the non-occluded tracked face is thereby assigned to the determined, correct tracker for tracking.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic view of an application scenario of a face tracking method according to an exemplary embodiment of the present disclosure.
FIG. 2 is a flow chart illustrating a face tracking method according to an exemplary embodiment of the present disclosure.
FIG. 3 is a flow chart illustrating another face tracking method according to an exemplary embodiment of the present disclosure.
FIG. 4 is a block diagram illustrating a face tracking device according to an exemplary embodiment of the present disclosure.
FIG. 5 is a block diagram illustrating another face tracking device according to an exemplary embodiment of the present disclosure.
FIG. 6 is a block diagram illustrating an apparatus for face tracking according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "when" or "upon", depending on the context.
During face tracking, two tracked faces may occlude each other, and such mutual occlusion can cause tracking errors that greatly reduce tracking accuracy. How to achieve correct tracking in this situation is therefore very important.
The face tracking method provided by the embodiments of the present disclosure can be applied to a scene in which at least two tracked faces overlap, where one or more tracked faces may occlude one or more other tracked faces. The following takes two overlapping tracked faces as an example. As shown in fig. 1, fig. 1 is a schematic view of an application scenario of a face tracking method according to an exemplary embodiment of the present disclosure. The two tracked faces overlap during motion, i.e., one tracked face is occluded by the other. As shown in fig. 1, tracked face A and tracked face B are initially separated by a certain distance; their motion trajectories then overlap or approximately overlap, so that tracked face A is occluded by tracked face B. Tracked face A corresponds to one tracker, and tracked face B corresponds to another. Although a tracking algorithm can solve the problem that a tracked face is easily disturbed by surrounding objects of similar chrominance, it cannot track correctly when two very similar objects (faces) occlude each other. After the two tracked faces separate, mistracking may occur, i.e., the tracker that initially tracked face A starts to track face B, which reduces face tracking accuracy.
In view of this, the present disclosure provides a face tracking method: in a current image frame, if two or more tracked faces meet a preset occlusion condition, target face pose information of a non-occluded tracked face is obtained and compared with sample face pose information prestored in the trackers, and the tracker corresponding to the target face pose information is determined, so that the non-occluded tracked face is assigned to the determined, correct tracker for tracking. The embodiments of the present disclosure are described below with reference to the accompanying drawings.
As shown in fig. 2, fig. 2 is a flowchart illustrating a face tracking method according to an exemplary embodiment of the present disclosure, including the following steps:
in step 201, in a current image frame, if two or more tracked faces meet a preset occlusion condition, target face pose information of a non-occluded tracked face is obtained;
in step 202, comparing the target face pose information with sample face pose information prestored in a tracker, and determining the tracker corresponding to the target face pose information;
in step 203, the non-occluded tracked face is assigned to the determined tracker for tracking.
The method of the embodiments of the present disclosure may be performed by an electronic device having computing capabilities, for example, the electronic device may be a smartphone, a tablet, a PDA (Personal Digital Assistant), a computer, a video surveillance server, or the like.
The disclosed embodiments track a face by using a tracker. A tracker T may include two parts, namely the filter F required for face tracking and the pose information P of the face, i.e., T = (F, P). Using the pose information of the faces helps eliminate tracking errors when faces occlude each other.
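For concreteness, the tracker T = (F, P) can be sketched as a simple record, shown below in Python; the field names, the numpy type, and the (P2D, P3D) layout of the pose are illustrative assumptions, not structures defined by the patent.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Tracker:
        filter: np.ndarray   # F: the correlation filter used for face tracking
        sample_pose: tuple   # P: prestored sample face pose information, e.g. (P2D, P3D)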
In an alternative implementation, the filter may be a correlation filter, so that a new tracking algorithm is obtained by improving on the correlation-filter-based target tracking algorithm MOSSE (Minimum Output Sum of Squared Error). The MOSSE algorithm is illustrated below:
in the initial frame, namely the first frame, sample data (F) is generated from the face region of the image, and a response map (G) is generated from the center position of the face; a correlation filter (H) is generated by combining the sample data and the response map, specifically: H1 = G1 / F1,
where the subscript 1 indicates the first frame. In this step, the sample data may be generated according to the result of manual recognition; it represents the image of the region where the face is located. The response map may be determined by combining the sample data with a Gaussian function, and each value in the response map indicates the likelihood that the corresponding position is a face.
In the second frame image, sample data of the current frame is generated from the sample data in the initial frame, and the correlation filter of the first frame is convolved with the sample data of the second frame to obtain the response map of the second frame. When generating the sample data of the second frame from the sample data in the first frame, a position range may be determined by enlarging the position of the sample data in the first frame, and the image within that position range in the second-frame image may be used as the sample data. The position corresponding to the maximum response value in the response map is the position of the face center in the second frame image. The correlation filter is then updated from the sample data and the response map of the second frame: H2 = G2 / F2, where the subscript 2 denotes the second frame. The updated correlation filter may be combined with the filter of the first frame to form the updated filter of the second frame, which is used to determine the response map in the third-frame image.
In the i-th frame, sample data of the i-th frame is determined from the recognition result of the (i-1)-th frame and convolved with the updated filter of the (i-1)-th frame to obtain the response map of the i-th frame; the position corresponding to the maximum response value in the response map is the position of the face center in the i-th frame image, and the correlation filter is updated from the sample data and the response map of the i-th frame: Hi = Gi / Fi,
where i is the frame index. The updated correlation filter may be used to determine the response map in the (i+1)-th frame image. When the sample data of the i-th frame is determined from the recognition result of the (i-1)-th frame, a position range may be determined by enlarging the position of that recognition result, and the image within this position range in the i-th frame image may be used as the sample data of the i-th frame.
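The per-frame filter arithmetic described above can be sketched as follows. This is a minimal illustrative sketch assuming grayscale numpy patches and the simplified H = G / F update given in the text (the full MOSSE formulation differs slightly); the function names, Gaussian sigma, epsilon, and learning rate are assumptions.

    import numpy as np

    def gaussian_response(shape, center, sigma=2.0):
        # Response map G: a 2-D Gaussian peaked at the face center position.
        h, w = shape
        ys, xs = np.mgrid[0:h, 0:w]
        cy, cx = center
        return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

    def init_filter(patch, center):
        # First frame: H1 = G1 / F1, computed in the Fourier domain.
        F = np.fft.fft2(patch)
        G = np.fft.fft2(gaussian_response(patch.shape, center))
        return G / (F + 1e-5)                    # small epsilon avoids division by zero

    def track_step(H, patch, lr=0.125):
        # Frame i: convolve the filter with the sample data to get the response
        # map, take the peak as the face center, then compute Hi = Gi / Fi and
        # blend it with the previous filter.
        F = np.fft.fft2(patch)
        response = np.real(np.fft.ifft2(H * F))
        peak = np.unravel_index(np.argmax(response), response.shape)
        H_new = np.fft.fft2(gaussian_response(patch.shape, peak)) / (F + 1e-5)
        return (1 - lr) * H + lr * H_new, peak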
The scenario to which the embodiments of the present disclosure apply may contain two or more tracked faces in the image (assumed to include at least tracked face A and tracked face B), each tracked face corresponding to its own tracker. In the current image frame, if two or more tracked faces meet a preset occlusion condition, target face pose information of the non-occluded tracked face is obtained.
The preset occlusion condition is a preset condition that triggers the acquisition of face pose information; tracked faces meeting the preset occlusion condition occlude one another.
In one embodiment, the preset occlusion condition may be that at least two tracked faces overlap. The overlap may be complete or partial. From the camera's viewing angle, a tracked face close to the camera can occlude a tracked face farther away. The tracked faces may face, or nearly face, the camera. Whether tracked faces overlap can be determined based on whether a tracked face still produces a response map, or whether the response map contains a maximum value meeting the requirement.
For example, a first response map may be determined from the sample data of the current frame and the filter in the tracker corresponding to tracked face A, and a second response map from the sample data of the current frame and the filter in the tracker corresponding to tracked face B. When the first or second response map does not exist, or contains no maximum value meeting the requirement, tracked face A and tracked face B overlap. It should be noted that the current-frame sample data used to determine the first response map usually differs from that used to determine the second response map, because the sample data of the current frame is determined from the recognition result of the previous frame, and the recognition results of the two tracked faces in the previous frame generally differ.
The non-occluded tracked face may be the face that still has a response map, or the face whose response map contains a qualifying maximum response value.
In this embodiment, when at least two tracked faces overlap, the target face pose information of the non-occluded tracked face is obtained, so that the non-occluded tracked face can be assigned to the correct tracker based on the face pose information, improving tracking accuracy.
In another embodiment, the preset occlusion condition may be that the distance between at least two tracked faces is smaller than or equal to a preset distance threshold, where the preset distance threshold may characterize overlap or near-overlap of the tracked objects.
Therefore, if the preset distance threshold is set to characterize overlap of the tracked objects, overlap of the tracked faces can be detected. If it is set to characterize near-overlap, the target face pose information of the non-occluded tracked face is obtained not only when the tracked objects overlap, but also when two tracked objects are about to overlap or have just separated, so that the face pose assists face tracking in all these cases.
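As an illustration, the two alternative occlusion conditions can be checked as sketched below, assuming axis-aligned face boxes given as (x, y, w, h); the helper names and the threshold value are chosen for illustration only and are not specified by the patent.

    def boxes_overlap(a, b):
        # Condition 1: the two tracked face boxes overlap (fully or partially).
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

    def center_distance(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ((ax + aw / 2 - (bx + bw / 2)) ** 2
                + (ay + ah / 2 - (by + bh / 2)) ** 2) ** 0.5

    def meets_occlusion_condition(face_a, face_b, dist_threshold=50.0):
        # Condition 2: distance no greater than a preset distance threshold.
        return boxes_overlap(face_a, face_b) or center_distance(face_a, face_b) <= dist_threshold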
The tracked faces meeting the preset occlusion condition can comprise the occluded tracked faces and the non-occluded tracked faces. For ease of description, the non-occluded tracked faces may be referred to as a first type of tracked face, and the occluded tracked faces may be referred to as a second type of tracked face.
Due to the application scenario of the embodiment of the present disclosure, it may be that one or more tracked faces occlude one or more other tracked faces. Therefore, the number of the non-occluded tracked faces is at least 1, and when the number of the non-occluded tracked faces is multiple, the non-occluded tracked faces can be distinguished through the face pose information. The number of the occluded tracked faces can be 1 or more. When the preset distance threshold is set as the distance threshold representing that the tracked object is close to overlap, the number of the tracked faces occluded in a certain period of time may also be 0.
Each tracked face is provided with a corresponding tracker. In one embodiment, if the number of occluded tracked faces is one, the occluded tracked face is assigned to the unassigned tracker for tracking. Since the non-occluded tracked face has already been matched to its tracker and only one occluded tracked face remains, the remaining tracker can be directly determined as the tracker of the occluded tracked face.
Taking two tracked faces as an example: of the two trackers, the tracker of the non-occluded tracked face is determined based on the face pose information, and the other tracked face is assigned to the other tracker for tracking.
In this way, when the number of occluded tracked faces is one, the tracker to which the occluded tracked face belongs can be determined quickly, and the occluded tracked face is assigned to the unassigned tracker for tracking.
It can be understood that when the number of occluded tracked faces is greater than one, as each occluded tracked face gradually becomes non-occluded in subsequent image frames, its tracker can likewise be determined based on its face pose information.
Regarding the face pose information: it is information used to represent the face pose, and the sample face pose information has a correspondence with the tracker. The target face pose information can therefore be compared with the sample face pose information prestored in the trackers to determine the matching sample face pose information, and thereby the tracker corresponding to the target face pose information.
The sample face pose information prestored in a tracker may be the face pose information of the tracked face that this tracker tracks. Therefore, by comparing the target face pose information with the sample face pose information prestored in the trackers, the tracker to which the tracked face corresponding to the target face pose information belongs can be determined, and the face is tracked by the correct tracker.
In one example, sample face pose information in the tracker may be obtained when initially tracking the corresponding tracked face.
In another example, when the tracked faces are about to overlap, the face pose information of each tracked face may be obtained as sample face pose information and stored in the corresponding tracker, so that this acquisition step is executed only when needed, saving storage and computing resources of the device. Specifically, in the current image frame, if the distance between two tracked faces is smaller than a specific distance threshold, the face pose information of each tracked face is obtained and stored in the corresponding tracker as sample face pose information, according to the correspondence between tracked faces and trackers.
For example, suppose the tracked faces meeting the preset occlusion condition are tracked face A and tracked face B, with tracked face A corresponding to tracker A and tracked face B corresponding to tracker B. After the face pose information of tracked face A is acquired, it is stored in tracker A as sample face pose information; after the face pose information of tracked face B is acquired, it is stored in tracker B as sample face pose information.
In an optional implementation manner, the similarity between the target face pose information and sample face pose information prestored in each tracker is determined; and determining the tracker corresponding to the sample face pose information with the highest similarity to the target face pose information as the tracker corresponding to the target face pose information.
Therefore, the embodiment can rapidly determine the sample face pose information similar to the target face pose information through the similarity of the face pose information, and further obtain the tracker to which the tracked face belongs.
When a face is occluded, the two face trackers may track the same face position. At this time, feature point localization is performed on the tracked object to obtain its pose information Pt, and Pt is compared with the pose information stored in the two trackers; by exploiting the continuity of face pose information over time, the tracked face is assigned to the correct tracker.
In one example, the face pose information may include the up-down flip angle, the left-right flip angle, and the in-plane rotation angle of the tracked face; the similarity is determined based on the Manhattan distance between the target face pose information and the sample face pose information prestored in each tracker.
The face pose information may include the up-down flip angle (pitch), the left-right flip angle (yaw), and the in-plane rotation angle (roll) of the tracked face. Correspondingly, the target face pose information may include these three angles for the non-occluded tracked face, and the sample face pose information may include them for the tracked face that the tracker is intended to track.
In one example, the pose information of the face may be estimated from the positions of the facial feature points using the POSIT (Pose from Orthography and Scaling with ITerations) algorithm, yielding an angle vector (pitch, yaw, roll). The target angle vector is obtained from the three angles in the target face pose information, and a sample angle vector from the three angles in each set of sample face pose information; different sets of sample face pose information yield different sample angle vectors. Therefore, the Manhattan distance between the target angle vector and each sample angle vector can be calculated as the pose distance, which indicates the similarity between the target face pose information and the sample face pose information. The smaller the pose distance, the greater the similarity, and the tracker corresponding to the sample face pose information with the smallest pose distance can be determined as the tracker of the non-occluded tracked face.
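A minimal sketch of this angle-only comparison follows, assuming each pose is a (pitch, yaw, roll) triple and trackers maps a tracker id to its prestored sample angles; all names are illustrative.

    def manhattan_distance(target_angles, sample_angles):
        # Pose distance: sum of absolute angle differences.
        return sum(abs(t - s) for t, s in zip(target_angles, sample_angles))

    def most_similar_tracker(target_angles, trackers):
        # The smaller the pose distance, the greater the similarity.
        return min(trackers, key=lambda tid: manhattan_distance(target_angles, trackers[tid]))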
In another optional implementation manner, the face pose information includes angle information of a tracked face and predefined face key point information for representing a face pose, where the angle information includes an up-down flip angle, a left-right flip angle, and an in-plane rotation angle;
the determining of the similarity between the target face pose information and the sample face pose information prestored in each tracker includes:
determining a Manhattan distance between a target angle vector and a sample angle vector, wherein the target angle vector is formed from the angle information in the target face pose information, and the sample angle vector is formed from the angle information in the sample face pose information in a tracker;
determining a cosine distance between a target key point vector and a sample key point vector, wherein the target key point vector is formed from the predefined face key point information in the target face pose information, and the sample key point vector is formed from the predefined face key point information in the sample face pose information in a tracker;
and performing a weighted summation of the Manhattan distance and the cosine distance to obtain a pose distance, wherein the pose distance indicates the similarity between the target face pose information and the sample face pose information, and a smaller pose distance indicates a greater similarity.
The face pose information comprises angle information of the tracked face and predefined face key point information used to represent the face pose. Accordingly, the target face pose information may include the angle information of the non-occluded tracked face (e.g., up-down flip, left-right flip, and in-plane rotation) and the predefined face key point information characterizing its pose; the sample face pose information may include the same items for the tracked face that the tracker is intended to track.
To extract the pose information of a face, a feature point localization algorithm, such as the ESR (Explicit Shape Regression) algorithm, may be used to obtain the positions of the facial feature points in the image, and the pose information is extracted based on these feature points. The predefined face key point information is used to represent the face pose and may be information formed from key points that is strongly affected by the face pose. In one implementation, the predefined face key point information includes two or more of:
the ratio of the horizontal distance between the centers of the two eyes to the distance between the two eyes;
the ratio of the distance from the tip of the nose to the left eye to the distance from the tip of the nose to the right eye;
the ratio of the distance from the tip of the nose to the center of the two eyes to the distance from the tip of the nose to the center of the two corners of the mouth.
It can be understood that the predefined face key point information can also be other information representing the face pose, especially the ratio information obtained from the key point positions, which can obviously reflect the change of the face pose, thereby improving the accuracy of the tracker. Other predefined face key point information is not listed here.
Specifically, the pose information P includes the following:
P = (P2D, P3D)
Key point information of the face: the key point vector P2D comprises (r1, r2, r3), where
r1: the ratio of the horizontal distance between the centers of the two eyes to the interocular distance;
r2: the ratio of the distance from the tip of the nose to the left eye to the distance from the tip of the nose to the right eye;
r3: the ratio of the distance from the tip of the nose to the center of the two eyes to the distance from the tip of the nose to the center of the two corners of the mouth.
Angle information of the face: the angle vector P3D comprises (pitch, yaw, roll), where the face pose (pitch, yaw, roll) is estimated from the positions of the facial feature points using the POSIT algorithm:
pitch: the up-down flip angle;
yaw: the left-right flip angle;
roll: the in-plane rotation angle.
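The key point vector P2D = (r1, r2, r3) can be computed from five landmarks as sketched below, assuming (x, y) landmark coordinates; the landmark parameter names are illustrative.

    import math

    def euclid(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def keypoint_vector(left_eye, right_eye, nose_tip, left_mouth, right_mouth):
        eyes_center = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)
        mouth_center = ((left_mouth[0] + right_mouth[0]) / 2, (left_mouth[1] + right_mouth[1]) / 2)
        r1 = abs(left_eye[0] - right_eye[0]) / euclid(left_eye, right_eye)  # horizontal vs. full interocular distance
        r2 = euclid(nose_tip, left_eye) / euclid(nose_tip, right_eye)
        r3 = euclid(nose_tip, eyes_center) / euclid(nose_tip, mouth_center)
        return (r1, r2, r3)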
Correspondingly, the pose distance is calculated between the target face pose information Pt and each set of prestored sample face pose information (assuming two trackers, and hence two sets of sample face pose information Px and Py), with the formula:
Dp = w * D2D + (1 - w) * D3D
where D2D is the cosine distance between the P2D vectors of the target face pose information and the sample face pose information, D3D is the Manhattan distance between the corresponding P3D vectors, and w is the weight. In this way, the pose distance between the target face pose information and each set of sample face pose information is obtained, and the face is assigned to the tracker with the smaller pose distance.
In this embodiment, the face pose is represented by predefined face key point information that is strongly affected by the face pose, so that the calculated pose distance reflects the similarity of the pose information and the correct tracker is obtained.
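Under the definitions above, the pose distance Dp can be sketched as follows; w = 0.5 is an illustrative weight, and the argument layout follows the P = (P2D, P3D) structure above.

    import math

    def cosine_distance(u, v):
        # Cosine distance between the two key point vectors.
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return 1.0 - dot / norm

    def pose_distance(target, sample, w=0.5):
        # Dp = w * D2D + (1 - w) * D3D; the face is assigned to the tracker
        # with the smaller pose distance.
        d2d = cosine_distance(target[0], sample[0])                   # D2D over P2D
        d3d = sum(abs(a - b) for a, b in zip(target[1], sample[1]))   # D3D over P3D
        return w * d2d + (1 - w) * d3d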
After determining the tracker corresponding to the target face pose information, the non-occluded tracked face may be assigned to the determined tracker for tracking.
After the non-occluded tracked face is assigned to the determined tracker, the filter of that tracker may be updated so that the face can be tracked in the next image frame; the filter in the tracker corresponding to the occluded tracked face may be left un-updated. The tracker further comprises a filter, and the method further comprises:
determining a position-area image of the non-occluded tracked face in the current image frame according to the position of the non-occluded tracked face in the frame preceding the current image frame;
convolving the position-area image with the filter in the tracker corresponding to the non-occluded tracked face in the previous frame to obtain the response map of the non-occluded tracked face in the current image frame;
determining the position corresponding to the maximum response value in the response map as the position of the non-occluded tracked face in the current image frame;
and determining the filter in the tracker corresponding to the non-occluded tracked face in the current image frame according to the response map and the position-area image.
The filter in the current image frame is used to determine the response map of the non-occluded tracked face in the next image frame. It should be noted that the filter in the tracker corresponding to the non-occluded tracked face in the previous frame refers to the filter as updated in the previous frame.
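Combined with the track_step sketch above, the per-frame update of the non-occluded face's tracker could look as follows; crop_enlarged is an assumed helper that crops an enlarged window around the previous position, and all names are illustrative.

    def update_unoccluded(tracker, frame, prev_position):
        # Position-area image: an enlarged window around the previous position
        # (crop_enlarged is a hypothetical helper, not defined in the patent).
        patch = crop_enlarged(frame, prev_position)
        tracker.filter, new_position = track_step(tracker.filter, patch)
        return new_position

    # The occluded face's tracker is deliberately not updated in this frame.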
Further, after the two tracked faces overlap and then separate, while their distance remains smaller than the preset distance threshold, the tracker to which each tracked face belongs continues to be determined. For example, when the two tracked faces have separated after overlapping and their distance is smaller than the preset distance threshold: target face pose information of the non-occluded tracked face is obtained; the target face pose information is compared with the sample face pose information prestored in the trackers to determine the corresponding tracker; and the non-occluded tracked face is assigned to the determined tracker for tracking.
According to the face tracking method provided by the embodiments of the present disclosure, when two or more tracked faces meet the preset occlusion condition, the face tracking algorithm is assisted by face pose information, avoiding the tracking errors caused by faces occluding each other and improving tracking accuracy.
The technical features in the above embodiments can be combined arbitrarily, provided there is no conflict or contradiction between them. For reasons of space, such combinations are not described one by one; nevertheless, any combination of these technical features also falls within the scope disclosed in this specification.
The following is an example of one of the combinations.
As shown in fig. 3, fig. 3 is a flowchart illustrating another face tracking method according to an exemplary embodiment of the present disclosure, the method including:
in step 301, in a current image frame, if two or more tracked faces meet a preset occlusion condition, target face pose information of a non-occluded tracked face is obtained;
in step 302, the similarity between the target face pose information and the pre-stored sample face pose information in each tracker is determined, and the tracker corresponding to the sample face pose information with the highest similarity to the target face pose information is determined as the tracker corresponding to the target face pose information.
In step 303, the non-occluded tracked face is assigned to the determined tracker for tracking.
In step 304, if the number of occluded tracked faces is one, the occluded tracked face is assigned to the unassigned tracker for tracking.
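Putting steps 301-304 together, the reassignment logic with two trackers can be sketched as below, reusing pose_distance and the Tracker record from the earlier sketches; this is an illustrative composition, not code from the patent.

    def reassign_on_occlusion(trackers, unoccluded_pose, occluded_count):
        # Steps 302-303: assign the non-occluded face to the tracker whose
        # prestored sample pose is most similar (smallest pose distance).
        best = min(trackers, key=lambda t: pose_distance(unoccluded_pose, t.sample_pose))
        # Step 304: if exactly one face is occluded, it gets the remaining tracker.
        remaining = None
        if occluded_count == 1:
            remaining = next(t for t in trackers if t is not best)
        return best, remaining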
The related art in fig. 3 is the same as that in fig. 2, and is not repeated herein.
It can be seen from the above embodiments that when two or more tracked faces meet the preset occlusion condition, the face tracking algorithm is assisted by face pose information, so that the tracker to which the non-occluded tracked face belongs can be determined; when the number of occluded tracked faces is one, the tracker to which the occluded tracked face belongs can also be determined quickly and the occluded tracked face assigned to the unassigned tracker for tracking, improving tracking accuracy.
Corresponding to the embodiment of the face tracking method, the disclosure also provides embodiments of a face tracking device, equipment applied by the device and a storage medium.
As shown in fig. 4, fig. 4 is a block diagram of a face tracking apparatus according to an exemplary embodiment of the present disclosure, the apparatus comprising:
an information acquisition module 410 configured to, in a current image frame, if two or more tracked faces meet a preset occlusion condition, acquire target face pose information of a non-occluded tracked face;
a tracker determination module 420 configured to compare the target face pose information with sample face pose information pre-stored in a tracker, and determine a tracker corresponding to the target face pose information;
a face tracking module 430 configured to assign the non-occluded tracked face to the determined tracker for tracking.
In an optional implementation manner, the preset occlusion condition includes:
at least two tracked faces overlap; or,
the distance between at least two tracked faces is smaller than or equal to a preset distance threshold.
In an optional implementation manner, the tracked faces meeting the preset occlusion condition include an occluded tracked face and a non-occluded tracked face, each tracked face being provided with a corresponding tracker, the face tracking module 430 being further configured to:
if the number of occluded tracked faces is one, assign the occluded tracked face to the unassigned tracker for tracking.
As shown in fig. 5, fig. 5 is a block diagram of another face tracking apparatus according to an exemplary embodiment of the present disclosure, which is based on the foregoing embodiment shown in fig. 4, and the tracker determination module 420 includes:
a similarity determination submodule 421 configured to determine similarity between the target face pose information and sample face pose information prestored in each tracker;
a tracker determination sub-module 422 configured to determine a tracker corresponding to sample face pose information with the highest similarity to the target face pose information as the tracker corresponding to the target face pose information.
In an optional implementation manner, the face pose information includes an up-down flip angle, a left-right flip angle, and an in-plane rotation angle of the tracked face; the similarity is determined based on the Manhattan distance between the target face pose information and the sample face pose information prestored in each tracker.
In an optional implementation manner, the face pose information includes angle information of a tracked face and predefined face key point information for representing a face pose, where the angle information includes an up-down flip angle, a left-right flip angle, and an in-plane rotation angle;
the similarity determination submodule 421 is specifically configured to:
determining a Manhattan distance between a target angle vector and a sample angle vector, wherein the target angle vector is formed from the angle information in the target face pose information, and the sample angle vector is formed from the angle information in the sample face pose information in a tracker;
determining a cosine distance between a target key point vector and a sample key point vector, wherein the target key point vector is formed from the predefined face key point information in the target face pose information, and the sample key point vector is formed from the predefined face key point information in the sample face pose information in a tracker;
and performing a weighted summation of the Manhattan distance and the cosine distance to obtain a pose distance, wherein the pose distance indicates the similarity between the target face pose information and the sample face pose information, and a smaller pose distance indicates a greater similarity.
In an optional implementation manner, the predefined face key point information includes two or more of the following:
the ratio of the horizontal distance between the centers of the two eyes to the distance between the two eyes;
the ratio of the distance from the tip of the nose to the left eye to the distance from the tip of the nose to the right eye;
the ratio of the distance from the tip of the nose to the center of the two eyes to the distance from the tip of the nose to the center of the two corners of the mouth.
Correspondingly, the present disclosure also provides an electronic device, which includes a processor; a memory for storing processor-executable instructions; wherein the processor is configured to:
in a current image frame, if two or more tracked faces meet a preset occlusion condition, acquiring target face pose information of a non-occluded tracked face;
comparing the target face pose information with sample face pose information prestored in a tracker, and determining the tracker corresponding to the target face pose information;
and assigning the non-occluded tracked face to the determined tracker for tracking.
Accordingly, the present disclosure also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of any of the methods described above.
The present disclosure may take the form of a computer program product embodied on one or more storage media including, but not limited to, disk storage, CD-ROM, optical storage, and the like, having program code embodied therein. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of the storage medium of the computer include, but are not limited to: phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technologies, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium, may be used to store information that may be accessed by a computing device.
The specific details of the implementation process of the functions and actions of each module in the device are referred to the implementation process of the corresponding step in the method, and are not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, wherein the modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the disclosed solution. One of ordinary skill in the art can understand and implement it without inventive effort.
Fig. 6 is a block diagram illustrating an apparatus for face tracking according to an exemplary embodiment.
For example, the apparatus 600 may be provided as a computer device. Referring to fig. 6, the apparatus 600 includes a processing component 622, which further includes one or more processors, and memory resources, represented by memory 632, for storing instructions, such as applications, executable by the processing component 622. The applications stored in memory 632 may include one or more modules, each corresponding to a set of instructions. The processing component 622 is configured to execute the instructions to perform the face tracking method described above.
The apparatus 600 may also include a power component 626 configured to perform power management of the apparatus 600, a wired or wireless network interface 650 configured to connect the apparatus 600 to a network, and an input/output (I/O) interface 658. The apparatus 600 may operate based on an operating system stored in the memory 632.
The instructions in the memory 632, when executed by the processing component 622, enable the apparatus 600 to perform a face tracking method comprising:
in the current image frame, if two or more tracked faces meet a preset occlusion condition, acquiring target face pose information of the non-occluded tracked faces;
comparing the target face pose information with sample face pose information prestored in a tracker, and determining the tracker corresponding to the target face pose information;
and assigning the non-occluded tracked face to the determined tracker for tracking.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of what is disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations that follow its general principles, including such departures from the present disclosure as come within known or customary practice in the art. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
The above description is only exemplary of the present disclosure and should not be taken as limiting the disclosure, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (12)

1. A method for face tracking, the method comprising:
in the current image frame, if two or more tracked faces meet a preset occlusion condition, acquiring target face pose information of the non-occluded tracked faces;
comparing the target face pose information with sample face pose information prestored in a tracker, and determining the tracker corresponding to the target face pose information;
assigning the non-occluded tracked face to the determined tracker for tracking;
wherein the tracker corresponding to the target face pose information is the tracker corresponding to the sample face pose information with the highest similarity to the target face pose information; the face pose information comprises an up-down turning angle, a left-right turning angle, and an in-plane rotation angle of the tracked face; and the similarity is determined based on the Manhattan distance between the target face pose information and the sample face pose information pre-stored in each tracker.
2. The method of claim 1, wherein the preset occlusion condition comprises:
at least two tracked faces overlap; or
the distance between at least two tracked faces is smaller than or equal to a preset distance threshold.
3. The method of claim 1, wherein the tracked faces satisfying the preset occlusion condition comprise an occluded tracked face and the non-occluded tracked face, each provided with a corresponding tracker, and the method further comprises:
if the number of occluded tracked faces is one, assigning the occluded tracked face to an unassigned tracker for tracking.
4. The method of claim 1, wherein the face pose information further comprises predefined face keypoint information for characterizing the face pose;
the similarity between the target face pose information and sample face pose information prestored in each tracker is determined based on the following modes:
determining a Manhattan distance between a target angle vector and a sample angle vector, wherein the target angle vector is formed from the angle information in the target face pose information, and the sample angle vector is formed from the angle information in the sample face pose information in a tracker;
determining an included-angle cosine distance between a target key point vector and a sample key point vector, wherein the target key point vector is formed from the predefined face key point information in the target face pose information, and the sample key point vector is formed from the predefined face key point information in the sample face pose information in a tracker;
and performing a weighted summation of the Manhattan distance and the included-angle cosine distance to obtain a pose distance, wherein the pose distance indicates the similarity between the target face pose information and the sample face pose information: the smaller the pose distance, the greater the similarity.
5. The method of claim 4, wherein the predefined face keypoint information comprises two or more of:
the ratio of the horizontal distance between the centers of the two eyes to the distance between the two eyes;
the ratio of the distance from the tip of the nose to the left eye to the distance from the tip of the nose to the right eye;
the ratio of the distance from the tip of the nose to the center of the two eyes to the distance from the tip of the nose to the center of the two corners of the mouth.
6. An apparatus for face tracking, the apparatus comprising:
the information acquisition module is configured to acquire target face pose information of a non-occluded tracked face in a current image frame if two or more tracked faces meet a preset occlusion condition;
a tracker determination module configured to compare the target face pose information with sample face pose information prestored in a tracker and determine a tracker corresponding to the target face pose information;
a face tracking module configured to assign the non-occluded tracked face to the determined tracker for tracking;
the tracker determination module includes:
the similarity determination submodule is configured to determine similarity between the target face pose information and sample face pose information prestored in each tracker;
a tracker determination sub-module configured to determine a tracker corresponding to sample face pose information having a highest similarity to the target face pose information as the tracker corresponding to the target face pose information;
wherein the face pose information comprises an up-down turning angle, a left-right turning angle, and an in-plane rotation angle of the tracked face; and the similarity is determined based on the Manhattan distance between the target face pose information and the sample face pose information pre-stored in each tracker.
7. The apparatus of claim 6, wherein the preset occlusion condition comprises:
at least two tracked faces overlap; or
the distance between at least two tracked faces is smaller than or equal to a preset distance threshold.
8. The apparatus of claim 6, wherein the tracked faces satisfying the preset occlusion condition comprise an occluded tracked face and the non-occluded tracked face, each provided with a corresponding tracker; the face tracking module is further configured to:
if the number of occluded tracked faces is one, assign the occluded tracked face to an unassigned tracker for tracking.
9. The apparatus of claim 6, wherein the face pose information further comprises predefined face keypoint information for characterizing the face pose;
the similarity determination submodule is specifically configured to:
determine a Manhattan distance between a target angle vector and a sample angle vector, wherein the target angle vector is formed from the angle information in the target face pose information, and the sample angle vector is formed from the angle information in the sample face pose information in a tracker;
determine an included-angle cosine distance between a target key point vector and a sample key point vector, wherein the target key point vector is formed from the predefined face key point information in the target face pose information, and the sample key point vector is formed from the predefined face key point information in the sample face pose information in a tracker;
and perform a weighted summation of the Manhattan distance and the included-angle cosine distance to obtain a pose distance, wherein the pose distance indicates the similarity between the target face pose information and the sample face pose information: the smaller the pose distance, the greater the similarity.
10. The apparatus of claim 9, wherein the predefined face keypoint information comprises two or more of:
the ratio of the horizontal distance between the centers of the two eyes to the distance between the two eyes;
the ratio of the distance from the tip of the nose to the left eye to the distance from the tip of the nose to the right eye;
the ratio of the distance from the tip of the nose to the center of the two eyes to the distance from the tip of the nose to the center of the two corners of the mouth.
11. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
in the current image frame, if two or more tracked faces meet a preset occlusion condition, acquiring target face pose information of the non-occluded tracked faces;
comparing the target face pose information with sample face pose information prestored in a tracker, and determining the tracker corresponding to the target face pose information;
assigning the non-occluded tracked face to the determined tracker for tracking;
wherein the tracker corresponding to the target face pose information is the tracker corresponding to the sample face pose information with the highest similarity to the target face pose information; the face pose information comprises an up-down turning angle, a left-right turning angle, and an in-plane rotation angle of the tracked face; and the similarity is determined based on the Manhattan distance between the target face pose information and the sample face pose information pre-stored in each tracker.
12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN201810283645.9A 2018-04-02 2018-04-02 Face tracking method, device, equipment and storage medium Active CN108629283B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810283645.9A CN108629283B (en) 2018-04-02 2018-04-02 Face tracking method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810283645.9A CN108629283B (en) 2018-04-02 2018-04-02 Face tracking method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN108629283A CN108629283A (en) 2018-10-09
CN108629283B (en) 2022-04-08

Family

ID=63696631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810283645.9A Active CN108629283B (en) 2018-04-02 2018-04-02 Face tracking method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN108629283B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110046548A (en) * 2019-03-08 2019-07-23 深圳神目信息技术有限公司 Tracking, device, computer equipment and the readable storage medium storing program for executing of face
CN112131915B (en) * 2019-06-25 2023-03-24 杭州海康威视数字技术股份有限公司 Face attendance system, camera and code stream equipment
CN110705478A (en) * 2019-09-30 2020-01-17 腾讯科技(深圳)有限公司 Face tracking method, device, equipment and storage medium
CN111291655B (en) * 2020-01-21 2023-06-06 杭州微洱网络科技有限公司 Head posture matching method for measuring 2d image in electronic commerce image
CN113642368B (en) * 2020-05-11 2023-08-18 杭州海康威视数字技术股份有限公司 Face pose determining method, device, equipment and storage medium
CN112330714B (en) * 2020-09-29 2024-01-09 深圳大学 Pedestrian tracking method and device, electronic equipment and storage medium
CN112417198A (en) * 2020-12-07 2021-02-26 武汉柏禾智科技有限公司 Face image retrieval method
CN113031464B (en) * 2021-03-22 2022-11-22 北京市商汤科技开发有限公司 Device control method, device, electronic device and storage medium
CN116152872A (en) * 2021-11-18 2023-05-23 北京眼神智能科技有限公司 Face tracking method, device, storage medium and equipment
CN116820251B (en) * 2023-08-28 2023-11-07 中数元宇数字科技(上海)有限公司 Gesture track interaction method, intelligent glasses and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102496009B (en) * 2011-12-09 2013-09-18 北京汉邦高科数字技术股份有限公司 Multi-face tracking method for intelligent bank video monitoring
CN102637251B (en) * 2012-03-20 2013-10-30 华中科技大学 Face recognition method based on reference features
CN105913028B (en) * 2016-04-13 2020-12-25 华南师范大学 Face + + platform-based face tracking method and device
CN106778585B (en) * 2016-12-08 2019-04-16 腾讯科技(上海)有限公司 A kind of face key point-tracking method and device
CN106682591B (en) * 2016-12-08 2020-04-07 广州视源电子科技股份有限公司 Face recognition method and device
CN107341460B (en) * 2017-06-26 2022-04-22 北京小米移动软件有限公司 Face tracking method and device

Also Published As

Publication number Publication date
CN108629283A (en) 2018-10-09

Similar Documents

Publication Publication Date Title
CN108629283B (en) Face tracking method, device, equipment and storage medium
Engel et al. Direct sparse odometry
US9177384B2 (en) Sequential rolling bundle adjustment
Chu et al. Tracking human under occlusion based on adaptive multiple kernels with projected gradients
WO2022052582A1 (en) Image registration method and device, electronic apparatus, and storage medium
CN110533694B (en) Image processing method, device, terminal and storage medium
CN108829116B (en) Barrier-avoiding method and equipment based on monocular cam
CN113689503B (en) Target object posture detection method, device, equipment and storage medium
TWI738196B (en) Method and electronic device for image depth estimation and storage medium thereof
CN107341460B (en) Face tracking method and device
US20180197308A1 (en) Information processing apparatus and method of controlling the same
US20240242451A1 (en) Method for 3d reconstruction, apparatus, system, and storage medium
US11361548B2 (en) Method and system for multi instance visual tracking based on observer motion modelling
US9946645B2 (en) Information processing apparatus and memory control method
Aleotti et al. Neural disparity refinement for arbitrary resolution stereo
JP2024508024A (en) Image data processing method and device
US20220191542A1 (en) Object Pose Estimation and Tracking Using Machine Learning
Muresan et al. A multi patch warping approach for improved stereo block matching
CN111882494A (en) Pose graph processing method and device, computer equipment and storage medium
CN115953813B (en) Expression driving method, device, equipment and storage medium
US11315256B2 (en) Detecting motion in video using motion vectors
US11120572B2 (en) Method, system and apparatus for associating a target object in images
CN113763468A (en) Positioning method, device, system and storage medium
Ji et al. DRV-SLAM: An Adaptive Real-Time Semantic Visual SLAM Based on Instance Segmentation Toward Dynamic Environments
CN112313707A (en) Tracking method and movable platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant