CN110009662B - Face tracking method and device, electronic equipment and computer readable storage medium - Google Patents

Face tracking method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN110009662B
Authority
CN
China
Prior art keywords
face
information
tracking
detection frame
similarity matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910262510.9A
Other languages
Chinese (zh)
Other versions
CN110009662A (en)
Inventor
杨弋
周舒畅
张一山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Megvii Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd filed Critical Beijing Megvii Technology Co Ltd
Priority to CN201910262510.9A priority Critical patent/CN110009662B/en
Publication of CN110009662A publication Critical patent/CN110009662A/en
Application granted granted Critical
Publication of CN110009662B publication Critical patent/CN110009662B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation
    • G06V40/172: Classification, e.g. identification
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application provides a face tracking method, a face tracking device, electronic equipment and a computer-readable storage medium, and relates to the technical field of image processing. The method comprises the following steps: processing at least one frame image in a video stream to obtain detection frame information of at least one face; determining attribute information corresponding to the at least one face based on the detection frame information; and tracking the at least one face based on the detection frame information of the at least one face and the corresponding attribute information. The embodiment of the application reduces the probability that the tracking trajectories of multiple faces are swapped during tracking, improves the accuracy of face tracking, and can thereby improve the user experience.

Description

Face tracking method and device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for face tracking, an electronic device, and a computer-readable storage medium.
Background
With the development of information technology, target object tracking technology has also developed; it tracks the trajectory of a target object across the frame images of a video stream.
On many intelligent cameras or face capture machines, tracking of a target object is achieved by tracking a detection frame of the target object in each frame image; for example, face tracking is achieved by tracking the face detection frame on each frame image. However, the existing face tracking technology is only applicable to relatively simple scenes, for example, scenes in which each frame image contains only one target face or one face to be tracked. In some complex scenes, for example the scene shown in fig. 1a, when the real trajectories of two faces cross in the video, the tracking trajectories obtained by the existing face tracking technology may be erroneously swapped, as shown in fig. 1b, so that the accuracy of face tracking is low and the user experience is poor.
Therefore, how to track the human face more accurately becomes a key problem.
Disclosure of Invention
The application provides a face tracking method, a face tracking device, electronic equipment and a computer-readable storage medium, which are used for solving the technical problems of low accuracy of tracking a target object and poor user experience.
In a first aspect, a method for face tracking is provided, where the method includes:
processing at least one frame image in a video stream to obtain detection frame information of at least one face;
determining attribute information corresponding to at least one face based on the detection frame information;
and tracking the at least one face based on the detection frame information of the at least one face and the attribute information corresponding to the at least one face.
In a possible implementation manner, tracking at least one face based on detection frame information of the at least one face and attribute information corresponding to the at least one face includes:
matching with the existing tracking track information according to the detection frame information of at least one face and the attribute information of at least one face;
and updating the existing tracking track information based on the matching result so as to realize the tracking processing of at least one face.
In a possible implementation manner, the matching with the existing tracking track information according to the detection frame information of the at least one face and the attribute information of the at least one face, and the updating of the existing tracking track information based on the matching result, include:
calculating a similarity matrix according to the existing tracking track information and the detection frame information and attribute information of at least one face;
and updating the existing tracking track information according to the similarity matrix.
In a possible implementation manner, updating the existing tracking track information according to the similarity matrix includes:
determining elements which are not larger than a preset threshold value in the similarity matrix;
determining a set of matching edges based on elements which are not greater than a preset threshold value in the similarity matrix through a bipartite graph optimal matching algorithm, wherein any matching edge in the set of matching edges represents any group of matched tracking track information and detection frame information and attribute information of the human face;
and updating the existing tracking track information according to the matching edge set.
In one possible implementation, updating the existing tracking trajectory information includes at least one of:
if the face information corresponding to any frame image in the video stream does not contain the existing tracking track information, deleting the tracking track information which is not contained in the face information corresponding to any frame image in the existing tracking track information;
if the existing tracking track information does not contain the face information corresponding to any frame image in the video stream, adding the face information corresponding to any frame image in the existing tracking track information;
the face information includes: the detection frame information of the human face and the attribute information corresponding to the human face.
In a possible implementation manner, the attribute information corresponding to any face includes at least one of the following:
age information; gender information.
In a possible implementation manner, calculating a similarity matrix according to existing tracking trajectory information and detection frame information and attribute information of at least one human face includes:
calculating any element in the similarity matrix according to a specific formula;
determining a similarity matrix according to each element in the calculated similarity matrix;
the specific formula is:
A_ij = (T_i1 - f_j1)² × a + (T_i2 - f_j2)² × b + (T_i3 - f_j3)² × c + (T_i4 - f_j4) × d, wherein T_i1 is the age information corresponding to the face in any one of the existing tracking tracks, and f_j1 is the age information corresponding to the face detection frame; T_i2 is the probability information that the gender of the face in any one of the existing tracking tracks is male or female, and f_j2 is the probability information that the gender corresponding to the face detection frame is male or female; T_i3 - f_j3 represents the Euclidean distance between the center point positions of any one of the existing tracking tracks and the face detection frame; and T_i4 - f_j4 is the feature distance between any one of the existing tracking tracks and the face frame.
In a possible implementation manner, determining attribute information corresponding to at least one face based on the detection frame information includes:
and outputting an attribute feature vector corresponding to at least one face through the trained network model based on the detection frame information.
In a second aspect, an apparatus for face tracking is provided, the apparatus comprising:
the processing module is used for processing at least one frame image in the video stream to obtain detection frame information of at least one human face;
the determining module is used for determining attribute information corresponding to at least one face based on the detection frame information;
and the tracking module is used for tracking at least one face based on the detection frame information of at least one face and the attribute information corresponding to at least one face.
In one possible implementation, the tracking module includes: a matching unit and an updating unit, wherein,
the matching unit is used for matching the existing tracking track information according to the detection frame information of at least one face and the attribute information of at least one face;
and the updating unit is used for updating the existing tracking track information based on the matching result of the matching unit so as to realize the tracking processing of at least one face.
In a possible implementation manner, the matching unit is specifically configured to calculate a similarity matrix according to existing tracking trajectory information and detection frame information and attribute information of at least one human face;
and the updating unit is specifically used for updating the existing tracking track information according to the similarity matrix.
In a possible implementation manner, the updating unit is specifically configured to determine an element in the similarity matrix that is not greater than a preset threshold;
the updating unit is specifically used for determining a set of matching edges based on elements, not larger than a preset threshold value, in the similarity matrix through a bipartite graph optimal matching algorithm, wherein any matching edge in the set of matching edges represents any group of matched tracking track information and detection frame information and attribute information of a human face;
and the updating unit is specifically further used for updating the existing tracking track information according to the matching edge set.
In a possible implementation manner, the updating unit is specifically configured to delete, when the face information corresponding to any frame image in the video stream does not include the existing tracking track information, the tracking track information that is not included in the face information corresponding to that frame image from the existing tracking track information; and/or,
the updating unit is specifically used for adding the face information corresponding to any frame image in the existing tracking track information when the existing tracking track information does not contain the face information corresponding to any frame image in the video stream;
the face information includes: the detection frame information of the human face and the attribute information corresponding to the human face.
In a possible implementation manner, the attribute information corresponding to any face includes at least one of the following:
age information; gender information.
In a possible implementation manner, the matching unit is specifically configured to calculate any element in the similarity matrix according to a specific formula;
the matching unit is specifically used for determining a similarity matrix according to each element in the calculated similarity matrix;
the specific formula is:
A_ij = (T_i1 - f_j1)² × a + (T_i2 - f_j2)² × b + (T_i3 - f_j3)² × c + (T_i4 - f_j4) × d, wherein T_i1 is the age information corresponding to the face in any one of the existing tracking tracks, and f_j1 is the age information corresponding to the face detection frame; T_i2 is the probability information that the gender of the face in any one of the existing tracking tracks is male or female, and f_j2 is the probability information that the gender corresponding to the face detection frame is male or female; T_i3 - f_j3 represents the Euclidean distance between the center point positions of any one of the existing tracking tracks and the face detection frame; and T_i4 - f_j4 is the feature distance between any one of the existing tracking tracks and the face frame.
In a possible implementation manner, the determining module is specifically configured to output, based on the detection frame information and through the trained network model, an attribute feature vector corresponding to the at least one face.
In a third aspect, an electronic device is provided, which includes:
one or more processors;
a memory;
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs are configured to perform the operations corresponding to the face tracking method shown in the first aspect of the present application or in any possible implementation manner of the first aspect.
In a fourth aspect, there is provided a computer readable storage medium storing at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of face tracking as shown in the first aspect or any one of the possible implementations of the first aspect.
The beneficial effects brought by the technical solutions provided by the application are as follows:
compared with the prior art, the method, the device, the electronic equipment and the computer-readable storage medium for tracking the human face have the advantages that at least one frame of image in a video stream is processed to obtain detection frame information of at least one human face, attribute information corresponding to the at least one human face is determined based on the detection frame information, and the at least one human face is tracked based on the detection frame information of the at least one human face and the attribute information corresponding to the at least one human face. When the method and the device are used for tracking at least one face, not only the attribute information of the face in each detection frame needs to be detected according to the detection frame information in the frame image, but also the probability of the tracking track alternation of a plurality of faces when the plurality of faces are tracked can be reduced, the accuracy of tracking the faces is improved, and the user experience can be improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
FIG. 1a is a schematic diagram of real trajectory interlacing of two target objects in a video;
FIG. 1b is a schematic diagram of two target objects with tracking tracks that alternate erroneously;
fig. 1c is a schematic flowchart of a method for tracking a human face according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a face tracking apparatus according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device for face tracking according to an embodiment of the present application;
FIG. 4 is a diagram illustrating an example of a representation of a detection box in an embodiment of the present application;
fig. 5 is a schematic flow chart of face tracking in a certain application scenario.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
The embodiment of the present application provides a method for tracking a human face, as shown in fig. 1c, the method includes:
step S101, processing at least one frame image in the video stream to obtain detection frame information of at least one human face.
For the embodiment of the present application, the detection frame may be identification information for indicating the area of the target object in the image, and may be drawn in an arbitrary shape, for example, a rectangle, a square, or the like. For example, as shown in fig. 4, the square frame in the image is a detection frame indicating the face region of a gorilla in the image.
For the embodiment of the present application, step S101 may specifically include: passing the at least one frame image in the video stream through a preset model (a detection network) to obtain the detection frame information of the at least one face.
For the embodiment of the present application, the detection network may adopt any one of the following network structures:
a Single Shot MultiBox Detector (SSD) network; SSD with a Residual Neural Network (ResNet-18) backbone; SSD with ResNet-50; SSD with ResNet-100; SSD with ShuffleNetV2; a Region-based Convolutional Neural Network (R-CNN); R-CNN with ResNet-18; R-CNN with ResNet-50; R-CNN with ResNet-100; R-CNN with ShuffleNetV2; Faster R-CNN; Faster R-CNN with ResNet-18; Faster R-CNN with ResNet-50; Faster R-CNN with ResNet-100; Faster R-CNN with ShuffleNetV2; YOLO-v1; YOLO-v1 with ResNet-18; YOLO-v1 with ResNet-50; YOLO-v1 with ResNet-100; YOLO-v1 with ShuffleNetV2; YOLO-v2; YOLO-v2 with ResNet-18; YOLO-v2 with ResNet-50; YOLO-v2 with ResNet-100; YOLO-v2 with ShuffleNetV2; YOLO-v3; YOLO-v3 with ResNet-18; YOLO-v3 with ResNet-50; YOLO-v3 with ResNet-100; YOLO-v3 with ShuffleNetV2.
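As an illustration only and not part of the application, the following minimal Python sketch uses torchvision's generic Faster R-CNN with a ResNet-50 FPN backbone as a stand-in for such a detection network (one of the detector/backbone combinations listed above); a real face capture machine would use a detector trained specifically for faces, and the 0.5 score threshold is an arbitrary assumption:

```python
import torch
import torchvision

# Stand-in "detection network": Faster R-CNN + ResNet-50 FPN
# (assumes torchvision >= 0.13 for the weights argument).
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
detector.eval()

def detect_boxes(frame_tensor):
    # frame_tensor: float CHW image tensor with values in [0, 1]
    with torch.no_grad():
        out = detector([frame_tensor])[0]
    keep = out["scores"] > 0.5          # keep confident detections only
    return out["boxes"][keep]           # each box is (x1, y1, x2, y2)
```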
In another possible implementation manner of the embodiment of the present application, the detection frame information includes: at least one of position information of the detection frame in the frame image and a size of the detection frame.
Step S102, determining attribute information corresponding to at least one face based on the detection frame information.
In another possible implementation manner of the embodiment of the present application, step S102 may specifically include: and outputting an attribute feature vector corresponding to at least one face through the trained network model based on the detection frame information.
For the embodiment of the application, the human face is identified in any frame image based on the detection frame information; and determining at least one attribute feature vector corresponding to the face according to the recognized face image information through the trained network model.
For the embodiment of the present application, the attribute information corresponding to any face is a numerical value predicted by a preset model (i.e., the trained network model). In the embodiment of the present application, if an attribute is a discrete value (e.g., gender), the preset model predicts the probability that the attribute belongs to a certain class; if an attribute is a continuous value (e.g., age), the preset model outputs a specific value of the attribute.
For example, for the age attribute, the preset model outputs an age value (unit: years), for example, 5 years; for the gender attribute, the preset model outputs a probability of 0.8 that the person is male and a probability of 0.2 that the person is female.
Because there may be a plurality of pieces of attribute information corresponding to any face, the attribute information corresponding to any face includes an attribute feature vector corresponding to that face.
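For illustration, the detection frame information and attribute information of one face can be gathered in a single record. The following Python sketch is a hedged assumption of this description; the FaceInfo name and its fields are illustrative, not notation defined by the application:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class FaceInfo:
    # Detection frame information: top-left corner and size, in pixels.
    x: float
    y: float
    w: float
    h: float
    # Attribute information predicted from the face inside the frame.
    age: float        # continuous attribute, e.g. 5.0 (years)
    p_female: float   # discrete attribute: probability that the gender is female
    feat: np.ndarray  # face feature vector from a face recognition network

    @property
    def center(self) -> np.ndarray:
        # Center point of the detection frame, used for the distance term.
        return np.array([self.x + self.w / 2.0, self.y + self.h / 2.0])
```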
For the embodiment of the application, the attribute information corresponding to at least one face in any frame of image is determined through the preset model and based on the detection frame information, so that the accuracy of the determined attribute information corresponding to at least one face can be improved, the efficiency of determining the attribute information corresponding to at least one face can also be improved, and the accuracy and the efficiency of face tracking can be further improved.
Step S103, tracking at least one face based on the detection frame information of at least one face and the attribute information corresponding to at least one face.
Compared with the prior art, with the face tracking method provided by the embodiment of the application, at least one frame image in a video stream is processed to obtain detection frame information of at least one face, attribute information corresponding to the at least one face is determined based on the detection frame information, and the at least one face is tracked based on the detection frame information and the corresponding attribute information. When tracking at least one face, the embodiment of the application uses not only the detection frame information in the frame image but also the attribute information of the face in each detection frame, so that the probability that the tracking trajectories of a plurality of faces are swapped during tracking can be reduced, the accuracy of face tracking is improved, and the user experience can be further improved.
In another possible implementation manner of the embodiment of the present application, step S103 may specifically include: step S1031 (not shown in the figure) and step S1032 (not shown in the figure), wherein,
and step S1031, matching the information with the existing tracking track information according to the detection frame information of the at least one face and the attribute information of the at least one face.
And step S1032, updating the existing tracking track information based on the matching result so as to realize the tracking processing of at least one face.
In another possible implementation manner of the embodiment of the application, the attribute information corresponding to any face includes at least one of the following:
age information; gender information; skin color information; hair color information; iris color information; ornament information.
In another possible implementation manner of the embodiment of the present application, step S1031 may specifically include: calculating a similarity matrix according to the existing tracking track information and the detection frame information and attribute information of at least one face; step S1032 may specifically include: and updating the existing tracking track information according to the similarity matrix.
Another possible implementation manner of the embodiment of the present application is that, according to existing tracking trajectory information and detection frame information and attribute information of at least one human face, a similarity matrix is calculated, including: calculating any element in the similarity matrix according to a specific formula; determining a similarity matrix according to each element in the calculated similarity matrix;
wherein, the specific formula is:
A_ij = (T_i1 - f_j1)² × a + (T_i2 - f_j2)² × b + (T_i3 - f_j3)² × c + (T_i4 - f_j4) × d, wherein T_i1 is the age information corresponding to the face in any one of the existing tracking tracks, and f_j1 is the age information corresponding to the face detection frame; T_i2 is the probability information that the gender of the face in any one of the existing tracking tracks is male or female, and f_j2 is the probability information that the gender corresponding to the face detection frame is male or female; T_i3 - f_j3 represents the Euclidean distance between the center point positions of any one of the existing tracking tracks and the face detection frame; and T_i4 - f_j4 is the feature distance between any one of the existing tracking tracks and the face frame.
For example, assume that there are n original tracks (track information corresponding to original faces), denoted T_1, T_2, …, T_n, and that there are m faces in a new frame, denoted f_1, f_2, …, f_m.
The element in the i-th row and j-th column of the similarity matrix A is computed as: the square of the difference between the age attributes of track T_i and face f_j multiplied by a coefficient a, plus the square of the difference between the gender attributes (the predicted probability of being female) of track T_i and face f_j multiplied by a coefficient b, plus the square of the Euclidean distance between the center point positions of track T_i and face f_j multiplied by a coefficient c, plus the feature distance between track T_i and face f_j multiplied by a coefficient d.
The above a, b, c and d are constants selected in advance, and the algorithm used for the "distance of the face feature" is usually determined by the specific face recognition algorithm. The features of each face are typically represented as a high-dimensional vector, and the feature distance is the squared Euclidean distance between two such vectors, or the cosine of the angle between the two vectors in the vector space.
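A minimal sketch of this computation, assuming the hypothetical FaceInfo records above (a track here carries the attributes of its most recent matched face); both distance terms are taken as squared Euclidean distances, which is one consistent reading of the formula, and a, b, c, d are the preselected constants:

```python
import numpy as np

def similarity_matrix(tracks, faces, a, b, c, d):
    # tracks, faces: sequences of FaceInfo-like records.
    A = np.zeros((len(tracks), len(faces)))
    for i, T in enumerate(tracks):
        for j, f in enumerate(faces):
            age_term    = (T.age - f.age) ** 2 * a
            gender_term = (T.p_female - f.p_female) ** 2 * b
            center_term = np.sum((T.center - f.center) ** 2) * c  # squared center distance
            feat_term   = np.sum((T.feat - f.feat) ** 2) * d      # squared feature distance
            A[i, j] = age_term + gender_term + center_term + feat_term
    return A
```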
Another possible implementation manner of the embodiment of the present application, updating existing tracking track information according to the similarity matrix, includes: determining elements which are not larger than a preset threshold value in the similarity matrix; determining a set of matching edges based on elements which are not greater than a preset threshold value in the similarity matrix through a bipartite graph optimal matching algorithm, wherein any matching edge in the set of matching edges represents any group of matched tracking track information and detection frame information and attribute information of the human face; and updating the existing tracking track information according to the matching edge set.
Specifically, a preselected threshold t is used: for every element A_ij in the similarity matrix that is greater than t, track T_i and face f_j can certainly not be matched; for all possibly matching (track, face) pairs (where a pair may match if A_ij <= t, the same definition as in the previous paragraph), a bipartite graph optimal matching algorithm is used to obtain an optimal matching solution.
The output of the bipartite graph matching algorithm is a set of matching edges, each matching edge represents a set of matching tracks and faces, and it is ensured that any face is only matched to at most one track, and any track is only matched to at most one face.
For the embodiment of the application, a bipartite graph (also called a bigraph) is a special model in graph theory. Let G = (V, E) be an undirected graph; if the vertex set V can be partitioned into two mutually disjoint subsets (A, B) such that the two vertices i and j associated with each edge (i, j) in the graph belong to the two different vertex sets respectively (i in A, j in B), then graph G is called a bipartite graph.
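The threshold-then-match step might be sketched as follows; SciPy's linear_sum_assignment (the Hungarian algorithm) is used here as one concrete bipartite graph optimal matching algorithm, a choice the application does not mandate:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_tracks_to_faces(A, threshold):
    if A.size == 0:
        return []
    # Pairs with A_ij > threshold can certainly not match; give them a
    # prohibitively large cost so the assignment avoids them if possible.
    LARGE = 1e9
    cost = np.where(A <= threshold, A, LARGE)
    rows, cols = linear_sum_assignment(cost)
    # Keep only admissible matches; each (i, j) is a matching edge, so any
    # track matches at most one face and any face at most one track.
    return [(i, j) for i, j in zip(rows, cols) if A[i, j] <= threshold]
```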
Another possible implementation manner of the embodiment of the present application, updating the existing tracking trace information, includes at least one of step Sa (not shown in the figure) and step Sb (not shown in the figure), wherein,
step Sa, if the face information corresponding to any frame image in the video stream does not include the existing tracking track information, deleting the tracking track information that is not included in the face information corresponding to any frame image in the existing tracking track information.
Step Sb, if the existing tracking track information does not contain the face information corresponding to any frame image in the video stream, adding the face information corresponding to that frame image to the existing tracking track information.
Wherein, the face information includes: the detection frame information of the human face and the attribute information corresponding to the human face.
For the embodiment of the application, a track that is not matched with any face is considered to correspond to a face that has left the picture, and the track is deleted from the original tracking track information; a face that is not matched with any track is considered to be a new face in this frame, and corresponding face information is added to the existing tracking tracks.
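These deletion and addition rules might be sketched as follows; the dict keyed by track id and the global id counter are illustrative assumptions, not structures defined by the application:

```python
import itertools

_new_track_id = itertools.count()

def update_tracks(tracks, faces, matches):
    # tracks: dict mapping track id -> latest FaceInfo on that track.
    # matches: (i, j) pairs indexing list(tracks) rows and faces columns.
    track_ids = list(tracks.keys())
    matched_faces = set()
    updated = {}
    # A matched track is refreshed with its face from the current frame.
    for i, j in matches:
        updated[track_ids[i]] = faces[j]
        matched_faces.add(j)
    # Unmatched tracks are considered to have left the picture: deleted.
    # Unmatched faces are considered new: each starts a fresh track.
    for j, face in enumerate(faces):
        if j not in matched_faces:
            updated[next(_new_track_id)] = face
    return updated
```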
The target tracking method has been described in detail above; the following summarizes it through an application scenario, as shown in fig. 5:
After any frame image in the video stream is preprocessed, it is passed through the detection network, which outputs the face detection frame information in the frame image; based on the face detection frame information, the attribute information corresponding to each face is obtained through a face attribute network; finally, the face tracking module processes the face frame information in the frame image and the attribute information corresponding to each face to obtain the face tracking information.
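Tying the sketches above together, one possible per-frame loop following the flow of fig. 5 is shown below; detect_boxes and face_attributes stand for the detection network and the face attribute network and are hypothetical wrappers, not APIs defined by the application:

```python
def track_stream(frames, detect_boxes, face_attributes,
                 a, b, c, d, threshold):
    tracks = {}  # track id -> latest FaceInfo
    for frame in frames:
        boxes = detect_boxes(frame)                             # detection network
        faces = [face_attributes(frame, box) for box in boxes]  # attribute network
        A = similarity_matrix(list(tracks.values()), faces, a, b, c, d)
        matches = match_tracks_to_faces(A, threshold)
        tracks = update_tracks(tracks, faces, matches)
        yield dict(tracks)  # face tracking information for this frame
```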
The above embodiment introduces the flow of the face tracking method from the perspective of the flow of the method, and the following introduces the face tracking apparatus from the perspective of the virtual module with reference to the accompanying drawings, which are specifically as follows:
the embodiment of the present application provides an apparatus for face tracking, as shown in fig. 2, the apparatus 20 for face tracking may include a processing module 21, a determining module 22, and a tracking module 23, wherein,
the processing module 21 is configured to obtain detection frame information of at least one human face by processing at least one frame of image in the video stream.
And the determining module 22 is configured to determine attribute information corresponding to at least one human face based on the detection frame information.
And the tracking module 23 is configured to track at least one face based on the detection frame information of the at least one face and the attribute information corresponding to the at least one face.
In another possible implementation manner of the embodiment of the present application, the tracking module 23 includes: a matching unit and an updating unit, wherein,
and the matching unit is used for matching the existing tracking track information according to the detection frame information of at least one face and the attribute information of at least one face.
And the updating unit is used for updating the existing tracking track information based on the matching result of the matching unit so as to realize the tracking processing of at least one face.
In another possible implementation manner of the embodiment of the application, the matching unit is specifically configured to calculate the similarity matrix according to existing tracking track information and detection frame information and attribute information of at least one human face.
And the updating unit is specifically used for updating the existing tracking track information according to the similarity matrix.
In another possible implementation manner of the embodiment of the present application, the updating unit is specifically configured to determine an element in the similarity matrix, where the element is not greater than a preset threshold.
And the updating unit is specifically used for determining a matching edge set based on the elements which are not greater than the preset threshold value in the similarity matrix through a bipartite graph optimal matching algorithm, wherein any matching edge in the matching edge set represents any group of matched tracking track information and detection frame information and attribute information of the human face.
And the updating unit is specifically further used for updating the existing tracking track information according to the matching edge set.
In another possible implementation manner of the embodiment of the application, the updating unit is specifically configured to delete tracking track information that is not included in face information corresponding to any frame image in the existing tracking track information when the face information corresponding to any frame image in the video stream does not include the existing tracking track information; and/or the updating unit is specifically configured to add face information corresponding to any frame image in the existing tracking track information when the existing tracking track information does not include the face information corresponding to any frame image in the video stream.
Wherein, the face information includes: the detection frame information of the human face and the attribute information corresponding to the human face.
In another possible implementation manner of the embodiment of the present application, the attribute information corresponding to any face includes: at least one of age information and gender information.
In another possible implementation manner of the embodiment of the present application, the matching unit is specifically configured to calculate any element in the similarity matrix according to a specific formula.
And the matching unit is specifically used for determining the similarity matrix according to each element in the calculated similarity matrix.
Wherein, the specific formula is:
A_ij = (T_i1 - f_j1)² × a + (T_i2 - f_j2)² × b + (T_i3 - f_j3)² × c + (T_i4 - f_j4) × d, wherein T_i1 is the age information corresponding to the face in any one of the existing tracking tracks, and f_j1 is the age information corresponding to the face detection frame; T_i2 is the probability information that the gender of the face in any one of the existing tracking tracks is male or female, and f_j2 is the probability information that the gender corresponding to the face detection frame is male or female; T_i3 - f_j3 represents the Euclidean distance between the center point positions of any one of the existing tracking tracks and the face detection frame; and T_i4 - f_j4 is the feature distance between any one of the existing tracking tracks and the face frame.
In another possible implementation manner of the embodiment of the present application, the determining module 22 is specifically configured to output, based on the detection frame information and through the trained network model, an attribute feature vector corresponding to the at least one face.
Compared with the prior art, the embodiment of the application provides a face tracking device that processes at least one frame image in a video stream to obtain detection frame information of at least one face, determines attribute information corresponding to the at least one face based on the detection frame information, and then tracks the at least one face based on the detection frame information and the corresponding attribute information. When tracking at least one face, the embodiment of the application uses not only the detection frame information in the frame image but also the attribute information of the face in each detection frame, so that the probability that the tracking trajectories of a plurality of faces are swapped during tracking can be reduced, the accuracy of face tracking is improved, and the user experience can be further improved.
The face tracking apparatus of this embodiment may execute the face tracking method provided in the foregoing method embodiments, and the implementation principles thereof are similar, and are not described herein again.
The above embodiments describe a face tracking method from the perspective of a method flow and a face tracking device from the perspective of a virtual module, and an electronic device is described below with reference to the accompanying drawings from the perspective of a physical device to execute the face tracking method, which is specifically as follows:
an embodiment of the present application provides an electronic device, as shown in fig. 3, an electronic device 3000 shown in fig. 3 includes: a processor 3001 and a memory 3003. The processor 3001 is coupled to the memory 3003, such as via a bus 3002. Optionally, the electronic device 3000 may further comprise a transceiver 3004. It should be noted that the transceiver 3004 is not limited to one in practical applications, and the structure of the electronic device 3000 is not limited to the embodiment of the present application.
The processor 3001 may be a CPU, general purpose processor, DSP, ASIC, FPGA or other programmable logic device, transistor logic device, hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 3001 may also be a combination of computing functions, e.g., comprising one or more microprocessors, a combination of a DSP and a microprocessor, or the like.
Bus 3002 may include a path that conveys information between the aforementioned components. The bus 3002 may be a PCI bus or an EISA bus, etc. The bus 3002 may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 3, but this does not mean only one bus or one type of bus.
Memory 3003 may be, but is not limited to, a ROM or other type of static storage device that can store static information and instructions, a RAM or other type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The memory 3003 is used for storing application program codes for performing the present scheme, and is controlled to be executed by the processor 3001. The processor 3001 is configured to execute application program code stored in the memory 3003 to implement any of the method embodiments shown above.
An embodiment of the present application provides an electronic device, where the electronic device includes: a memory and a processor; at least one program is stored in the memory and, when executed by the processor, implements the following: processing at least one frame image in the video stream to obtain detection frame information of at least one face, determining attribute information corresponding to the at least one face based on the detection frame information, and tracking the at least one face based on the detection frame information and the corresponding attribute information. When tracking at least one face, the embodiment of the application uses not only the detection frame information in the frame image but also the attribute information of the face in each detection frame, so that the probability that the tracking trajectories of a plurality of faces are swapped during tracking can be reduced, the accuracy of face tracking is improved, and the user experience can be further improved.
The electronic device of this embodiment may execute the method for face tracking provided by the above method embodiments, and the implementation principles thereof are similar, and are not described herein again.
The present application provides a computer-readable storage medium having a computer program stored thereon which, when run on a computer, enables the computer to execute the corresponding content of the foregoing method embodiments. Compared with the prior art, at least one frame image in the video stream is processed to obtain detection frame information of at least one face, attribute information corresponding to the at least one face is determined based on the detection frame information, and the at least one face is then tracked based on the detection frame information and the corresponding attribute information. When tracking at least one face, the embodiment of the application uses not only the detection frame information in the frame image but also the attribute information of the face in each detection frame, so that the probability that the tracking trajectories of a plurality of faces are swapped during tracking can be reduced, the accuracy of face tracking is improved, and the user experience can be further improved.
The computer-readable storage medium of this embodiment is suitable for the method for face tracking provided in the foregoing method embodiments, and the implementation principles thereof are similar, and are not described herein again.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed strictly in that order. Unless explicitly stated herein, there is no strict restriction on the execution order of these steps, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments, and their execution order is not necessarily sequential; they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.

Claims (9)

1. A method of face tracking, comprising:
processing at least one frame image in a video stream to obtain detection frame information of at least one face;
determining attribute information corresponding to at least one face based on the detection frame information;
tracking at least one face based on the detection frame information of the at least one face and the attribute information corresponding to the at least one face;
tracking at least one face based on the detection frame information of the at least one face and the attribute information corresponding to the at least one face, including:
calculating a similarity matrix according to the existing tracking track information and the detection frame information and attribute information of at least one face;
updating the existing tracking track information according to the similarity matrix so as to realize the tracking processing of the at least one face;
calculating a similarity matrix according to the existing tracking track information and the detection frame information and attribute information of at least one face, wherein the similarity matrix comprises:
calculating any element in the similarity matrix according to a specific formula;
determining a similarity matrix according to each element in the calculated similarity matrix;
the specific formula is:
A_ij = (T_i1 - f_j1)² × a + (T_i2 - f_j2)² × b + (T_i3 - f_j3)² × c + (T_i4 - f_j4)² × d, wherein T_i1 is the age information corresponding to the face in any one of the existing tracking tracks, and f_j1 is the age information corresponding to the face detection frame; T_i2 is the probability information that the gender of the face in any one of the existing tracking tracks is male or female, and f_j2 is the probability information that the gender corresponding to the face detection frame is male or female; T_i3 - f_j3 represents the Euclidean distance between the center point positions of any one of the existing tracking tracks and the face detection frame; T_i4 - f_j4 is the feature distance between any one of the existing tracking tracks and the face frame; and a, b, c and d are constants which are selected in advance.
2. The method of claim 1, wherein tracking at least one face based on the detection frame information of the at least one face and the attribute information corresponding to the at least one face comprises:
matching with the existing tracking track information according to the detection frame information of the at least one face and the attribute information of the at least one face;
and updating the existing tracking track information based on the matching result so as to realize the tracking processing of the at least one face.
3. The method of claim 1, wherein updating the existing tracking trajectory information according to a similarity matrix comprises:
determining elements which are not larger than a preset threshold value in the similarity matrix;
determining a set of matching edges by a bipartite graph optimal matching algorithm based on elements not greater than a preset threshold in the similarity matrix, wherein any matching edge in the set of matching edges represents any group of matched tracking track information and detection frame information and attribute information of a human face;
and updating the existing tracking track information according to the matching edge set.
4. The method of claim 1, wherein the updating the existing tracking trajectory information comprises at least one of:
if the face information corresponding to any frame image in the video stream does not contain the existing tracking track information, deleting the tracking track information which is not contained in the face information corresponding to any frame image in the existing tracking track information;
if the existing tracking track information does not contain face information corresponding to any frame image in the video stream, adding the face information corresponding to any frame image in the existing tracking track information;
the face information includes: the detection frame information of the human face and the attribute information corresponding to the human face.
5. The method according to any one of claims 1 to 4, wherein the attribute information corresponding to any one face comprises at least one of the following:
age information; gender information.
6. The method according to any one of claims 1 to 4, wherein determining attribute information corresponding to at least one face based on the detection frame information comprises:
and outputting the attribute feature vector corresponding to the at least one face through the trained network model based on the detection frame information.
7. An apparatus for face tracking, comprising:
the processing module is used for processing at least one frame image in the video stream to obtain detection frame information of at least one human face;
the determining module is used for determining attribute information corresponding to at least one face based on the detection frame information;
the tracking module is used for tracking at least one face based on the detection frame information of the at least one face and the attribute information corresponding to the at least one face;
wherein, the tracking module includes: a matching unit and an updating unit, wherein,
the matching unit is specifically used for calculating a similarity matrix according to the existing tracking track information and the detection frame information and attribute information of at least one face;
the updating unit is specifically configured to update the existing tracking trajectory information according to the similarity matrix, so as to perform tracking processing on the at least one face;
the matching unit is specifically used for calculating any element in the similarity matrix according to a specific formula; determining a similarity matrix according to each element in the calculated similarity matrix;
the specific formula is:
A_ij = (T_i1 - f_j1)² × a + (T_i2 - f_j2)² × b + (T_i3 - f_j3)² × c + (T_i4 - f_j4)² × d, wherein T_i1 is the age information corresponding to the face in any one of the existing tracking tracks, and f_j1 is the age information corresponding to the face detection frame; T_i2 is the probability information that the gender of the face in any one of the existing tracking tracks is male or female, and f_j2 is the probability information that the gender corresponding to the face detection frame is male or female; T_i3 - f_j3 represents the Euclidean distance between the center point positions of any one of the existing tracking tracks and the face detection frame; T_i4 - f_j4 is the feature distance between any one of the existing tracking tracks and the face frame; and a, b, c and d are constants which are selected in advance.
8. An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to: a method of performing face tracking according to any of claims 1 to 6.
9. A computer readable storage medium storing at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of face tracking according to any one of claims 1 to 6.
CN201910262510.9A 2019-04-02 2019-04-02 Face tracking method and device, electronic equipment and computer readable storage medium Active CN110009662B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910262510.9A CN110009662B (en) 2019-04-02 2019-04-02 Face tracking method and device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910262510.9A CN110009662B (en) 2019-04-02 2019-04-02 Face tracking method and device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110009662A CN110009662A (en) 2019-07-12
CN110009662B true CN110009662B (en) 2021-09-17

Family

ID=67169613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910262510.9A Active CN110009662B (en) 2019-04-02 2019-04-02 Face tracking method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110009662B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427905B (en) * 2019-08-08 2023-06-20 北京百度网讯科技有限公司 Pedestrian tracking method, device and terminal
CN111178217A (en) * 2019-12-23 2020-05-19 上海眼控科技股份有限公司 Method and equipment for detecting face image
CN111862624B (en) * 2020-07-29 2022-05-03 浙江大华技术股份有限公司 Vehicle matching method and device, storage medium and electronic device
CN113034548B (en) * 2021-04-25 2023-05-26 安徽科大擎天科技有限公司 Multi-target tracking method and system suitable for embedded terminal

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632126A (en) * 2012-08-20 2014-03-12 华为技术有限公司 Human face tracking method and device
CN105488478A (en) * 2015-12-02 2016-04-13 深圳市商汤科技有限公司 Face recognition system and method
CN107316322A (en) * 2017-06-27 2017-11-03 上海智臻智能网络科技股份有限公司 Video tracing method and device and object identifying method and device
CN108230352A (en) * 2017-01-24 2018-06-29 北京市商汤科技开发有限公司 Detection method, device and the electronic equipment of target object
CN108932456A (en) * 2017-05-23 2018-12-04 北京旷视科技有限公司 Face identification method, device and system and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102306290B (en) * 2011-10-14 2013-10-30 刘伟华 Face tracking recognition technique based on video
KR102062310B1 (en) * 2013-01-04 2020-02-11 삼성전자주식회사 Method and apparatus for prividing control service using head tracking in an electronic device
WO2016179808A1 (en) * 2015-05-13 2016-11-17 Xiaoou Tang An apparatus and a method for face parts and face detection
US10997395B2 (en) * 2017-08-14 2021-05-04 Amazon Technologies, Inc. Selective identity recognition utilizing object tracking
CN108491832A (en) * 2018-05-21 2018-09-04 广西师范大学 A kind of embedded human face identification follow-up mechanism and method
CN109522843B (en) * 2018-11-16 2021-07-02 北京市商汤科技开发有限公司 Multi-target tracking method, device, equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632126A (en) * 2012-08-20 2014-03-12 华为技术有限公司 Human face tracking method and device
CN105488478A (en) * 2015-12-02 2016-04-13 深圳市商汤科技有限公司 Face recognition system and method
CN108230352A (en) * 2017-01-24 2018-06-29 北京市商汤科技开发有限公司 Detection method, device and the electronic equipment of target object
CN108932456A (en) * 2017-05-23 2018-12-04 北京旷视科技有限公司 Face identification method, device and system and storage medium
CN107316322A (en) * 2017-06-27 2017-11-03 上海智臻智能网络科技股份有限公司 Video tracing method and device and object identifying method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Ranganatha S. et al., "Color Based New Algorithm for Detection and Single/Multiple Person Face Tracking in Different Background Video Sequence," I.J. Information Technology and Computer Science, Nov. 8, 2018, pp. 39-48. *
Wang Rong et al., "Implementation of Face Detection and Tracking Method Based on OpenCV," Science Technology and Engineering, vol. 14, no. 24, Aug. 2014, pp. 115-118. *

Also Published As

Publication number Publication date
CN110009662A (en) 2019-07-12

Similar Documents

Publication Publication Date Title
CN110009662B (en) Face tracking method and device, electronic equipment and computer readable storage medium
US11288838B2 (en) Image processing method and apparatus
Jampani et al. Video propagation networks
US9542621B2 (en) Spatial pyramid pooling networks for image processing
US10891465B2 (en) Methods and apparatuses for searching for target person, devices, and media
US20200356818A1 (en) Logo detection
CN108875487B (en) Training of pedestrian re-recognition network and pedestrian re-recognition based on training
CN110765860A (en) Tumble determination method, tumble determination device, computer apparatus, and storage medium
CN109492576B (en) Image recognition method and device and electronic equipment
CN110910422A (en) Target tracking method and device, electronic equipment and readable storage medium
CN113313053B (en) Image processing method, device, apparatus, medium, and program product
JP2022554068A (en) Video content recognition method, apparatus, program and computer device
CN109063776B (en) Image re-recognition network training method and device and image re-recognition method and device
KR20220076398A (en) Object recognition processing apparatus and method for ar device
CN112381071A (en) Behavior analysis method of target in video stream, terminal device and medium
CN112070071B (en) Method and device for labeling objects in video, computer equipment and storage medium
WO2023109361A1 (en) Video processing method and system, device, medium and product
CN114359564A (en) Image recognition method, image recognition device, computer equipment, storage medium and product
KR101942646B1 (en) Feature point-based real-time camera pose estimation method and apparatus therefor
CN109635749B (en) Image processing method and device based on video stream
JP2010257267A (en) Device, method and program for detecting object area
CN112257689A (en) Training and recognition method of face recognition model, storage medium and related equipment
CN109600627B (en) Video identification method and device
CN114241411B (en) Counting model processing method and device based on target detection and computer equipment
CN115984671A (en) Model online updating method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant