CN114333039B

CN114333039B - Method, device and medium for clustering human images

Info

Publication number: CN114333039B
Application number: CN202210200556.XA
Authority: CN
Inventors: 黄攀
Original assignee: Jinan Boguan Intelligent Technology Co Ltd
Current assignee: Jinan Boguan Intelligent Technology Co Ltd
Priority date: 2022-03-03
Filing date: 2022-03-03
Publication date: 2022-07-08
Anticipated expiration: 2042-03-03
Also published as: CN114333039A

Abstract

The application discloses a method, a device and a medium for portrait clustering, and relates to the field of image processing. The method comprises: extracting features from each snapshot collected by an image acquisition device, and clustering the extracted face features and human body features separately; generating a face clustering set and a human body clustering set when the confidence of the clustering features is greater than or equal to a first threshold; when a preset requirement is met, clustering the snapshots that have not yet been clustered to determine a new face clustering set and a new human body clustering set, and merging each new clustering set into the corresponding face clustering set or human body clustering set; and finally performing fusion clustering on the merged face clustering set and the merged human body clustering set, the result serving as reference data for face recognition. In this way, after the first clustering is completed, snapshots whose sets have not been determined are clustered again when the preset requirement is met, which improves the clustering recall rate and thereby the accuracy of portrait clustering.

Description

Method, device and medium for clustering human images

Technical Field

The present application relates to the field of image processing technologies, and in particular, to a method, an apparatus, and a medium for clustering human images.

Background

Portrait clustering is derived from face recognition and human body image search, and is the core of portrait big data algorithms. It is mainly applied to generating basic population data, generating clues for solving cases, and the like. A large number of face features and human body features need to be clustered so that snapshots of the same person fall into the same class, forming a complete one-person-one-file database.

The mainstream portrait clustering algorithms are based on comparing feature vectors, including face features and human body features. Face clustering and human body clustering are first carried out separately, and the faces and human bodies are then associated for fusion clustering. Because a person's orientation changes frequently while moving, the orientations captured by different cameras differ. The feature comparison algorithm has a limited precision range, and when the orientation changes too much the recall rate may drop, which lowers the accuracy of portrait clustering. For example, frontal face/body features may fail to match side-view face/body features of the same person during clustering. When a case is being investigated, face features and body features missed during clustering make the portrait clustering inaccurate, which affects the accuracy of face recognition and may ultimately prevent the case from being solved.

Therefore, how to improve the accuracy of portrait clustering is a problem that needs to be solved urgently by those skilled in the art.

Disclosure of Invention

The application aims to provide a method, a device and a medium for portrait clustering, which are used for improving the accuracy of portrait clustering.

In order to solve the above technical problem, the present application provides a method for clustering human images, including:

acquiring each snapshot acquired by image acquisition equipment;

extracting features of the snap shots, wherein the feature extraction at least comprises face feature extraction and human body feature extraction;

clustering the extracted human face features and clustering human body features;

respectively generating a face clustering set and a human body clustering set under the condition that the confidence of the clustering features is greater than or equal to a first threshold; wherein each element in the set is a corresponding snapshot;

when the preset requirements are met, clustering the snapshot images to be clustered except the clustered snapshot images so as to obtain a new face clustering set and a new human body clustering set;

merging the new face cluster set into the face cluster set and merging the new body cluster set into the body cluster set;

and performing fusion clustering on the combined face clustering set and the combined human body clustering set to serve as reference data for face recognition.

Preferably, the acquiring of each snapshot acquired by the image acquisition device includes:

receiving a live image acquired by the image acquisition device;

detecting and tracking each target by using an algorithm, and caching snapshot pictures of each target;

under the condition that the human body orientation change of each target in the live image is judged, capturing and caching a plurality of to-be-determined capturing pictures corresponding to the human body orientation change of each target from the live image;

acquiring the definition of each snapshot to be determined;

and selecting the snapshot image to be determined with the highest definition as the snapshot image corresponding to each human body orientation of each target.

Preferably, the method further comprises the following steps:

acquiring a unique ID value generated for the same target in each snapshot image;

adding a mark of the corresponding image acquisition device to an ID value corresponding to each snapshot of each image acquisition device;

the preset requirements are as follows:

and the ID value corresponding to each snapshot to be clustered is the same as the ID values corresponding to the elements in the face clustering set and the human body clustering set.

Preferably, when the preset requirement is met, clustering the snap shots to be clustered except the already clustered snap shots so as to obtain a new face clustering set and a new human body clustering set comprises:

dividing the snapshot images to be clustered into a face candidate anchor point set and a human body candidate anchor point set according to the ID values corresponding to the snapshot images to be clustered and the characteristic extraction results; the ID values corresponding to elements in the same anchor point set are the same, and the ID values are the same as the ID values of one or more members in the face clustering set or the human body clustering set which are clustered correspondingly;

comparing the characteristic vector distance between each element in the face candidate anchor point set and the element with the same ID value in the face clustering set, and comparing the characteristic vector distance between each element in the human body candidate anchor point set and the element with the same ID value in the human body clustering set;

and determining a set corresponding to each snapshot to be clustered so as to acquire a new face clustering set and a new human body clustering set when the confidence of the clustering features is greater than or equal to a second threshold and smaller than the first threshold.

Preferably, the feature extraction further comprises: extracting the gender and the color attribute of the clothes of each target;

after feature vector distance comparison is performed between each element in the face candidate anchor set and an element in the face clustering set having the same ID value, and feature vector distance comparison is performed between each element in the human body candidate anchor set and an element in the human body clustering set having the same ID value, the method further includes:

and carrying out consistency judgment on the sex and/or the clothes color attribute of each element in the face candidate anchor point set and the element with the same ID value in the face clustering set, and carrying out consistency judgment on the sex and/or the clothes color attribute of each element in the human body candidate anchor point set and the element with the same ID value in the human body clustering set.

after clustering elements in the face candidate anchor point set and the human body candidate anchor point set, if the snapshot pictures to be clustered still exist, clustering the snapshot pictures to be clustered with the elements in the face anchor point set and the elements in the human body anchor point set;

and under the condition that the confidence of the cluster characteristics is greater than or equal to a third threshold and smaller than the first threshold, determining a set corresponding to each snapshot to be clustered so as to acquire a new face clustering set and a new human body clustering set.

Preferably, performing fusion clustering on the merged face clustering set and the merged human body clustering set includes:

extracting human body features corresponding to all elements containing human body features from the combined human face clustering set and extracting human body features corresponding to all elements from the combined human body clustering set;

matching the human body features corresponding to each element containing human body features in the human face clustering set with the human body features corresponding to each element in the human body clustering set;

confirming that the matching is successful under the condition that the confidence of the clustering features is greater than a fourth threshold;

respectively counting the elements in the human body clustering set which have the maximum matching success frequency with all elements including human body features in the human face clustering set and the matching success frequency reaches a threshold value;

and respectively carrying out fusion clustering on each element containing human body features in the face clustering set and the corresponding element in the human body clustering set with the maximum matching success frequency and the matching success frequency reaching the threshold value.

In order to solve the above technical problem, the present application further provides a portrait clustering apparatus, including:

the first acquisition module is used for acquiring each snapshot acquired by the image acquisition equipment;

the extraction module is used for extracting the features of the snap shots, wherein the feature extraction at least comprises face feature extraction and human body feature extraction;

the first clustering module is used for clustering the extracted human face features and clustering the human body features;

the generating module is used for respectively generating a face clustering set and a human body clustering set under the condition that the confidence coefficient of the clustering features is greater than or equal to a first threshold value; wherein each element in the set is a corresponding snapshot;

the second clustering module is used for clustering the snapshot images to be clustered except the clustered snapshot images so as to acquire a new face clustering set and a new human body clustering set when the preset requirements are met;

an incorporating module for incorporating the new face cluster set into the face cluster set and incorporating the new body cluster set into the body cluster set;

and the fusion clustering module is used for performing fusion clustering on the combined face clustering set and the combined human body clustering set to serve as reference data for face recognition.

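In order to solve the above technical problem, the present application further provides another portrait clustering apparatus, including:

a memory for storing a computer program;

and a processor for implementing the steps of the portrait clustering method when executing the computer program.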

In order to solve the above technical problem, the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the method for clustering the portraits.

The method for portrait clustering provided by the present application first acquires each snapshot acquired by the image acquisition device; extracts features from the snapshots, and clusters the extracted face features and human body features separately; generates a face clustering set and a human body clustering set when the confidence of the clustering features is greater than or equal to a first threshold; when the preset requirement is met, clusters the snapshots to be clustered, other than the already clustered snapshots, to obtain a new face clustering set and a new human body clustering set; merges the new face clustering set into the face clustering set and the new human body clustering set into the human body clustering set; and performs fusion clustering on the merged face clustering set and the merged human body clustering set, the result serving as reference data for human image recognition. In this way, after the first clustering is completed, snapshots whose sets have not been determined are clustered again when the preset requirement is met, which improves the clustering recall rate and thereby the accuracy of portrait clustering.

In addition, the application also provides a device for clustering the portrait and a computer readable storage medium, which have the same beneficial effects as the method for clustering the portrait.

Drawings

In order to more clearly illustrate the embodiments of the present application, the drawings needed for the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.

Fig. 1 is a flowchart of a method for clustering human images according to this embodiment;

Fig. 2 is a main flowchart of the primary clustering;

Fig. 3 is a detailed flowchart for obtaining a fused clustering result;

Fig. 4 is a structural diagram of a portrait clustering apparatus according to an embodiment of the present application;

Fig. 5 is a structural diagram of an apparatus for clustering portraits according to another embodiment of the present application;

Fig. 6 is a flowchart of multi-clustering provided in the embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the present application.

The core of the application is to provide a method, a device and a medium for portrait clustering, which are used for improving the accuracy of portrait clustering.

In generating basic population data and clues for solving cases, a large number of face features and human body features generally need to be clustered, with the same person classified into the same class, to form a complete one-person-one-file database.

In order that those skilled in the art will better understand the disclosure, the following detailed description will be given with reference to the accompanying drawings. Fig. 1 is a flowchart of a method for clustering human images according to this embodiment. As shown in fig. 1, the method includes:

S10: acquiring each snapshot acquired by the image acquisition device.

Because a person's orientation changes frequently during movement, the orientations captured by the image acquisition devices differ. Common image acquisition devices include cameras, video cameras, mobile phones and the like; the image acquisition device selected in this application is a camera. Deep-learning human body target detection is performed on the live video of each camera, and a tracking algorithm is applied to each detected target to form a tracking trajectory. The tracking algorithm may be, for example, a Kernelized Correlation Filter (KCF); the specific tracking algorithm is not limited herein.

Because a person's orientation changes frequently during movement, an orientation-change judgment needs to be made for each human body target. The orientation can be judged by a deep-learning human body orientation model, which classifies it into four orientations: front, left, right and back. If the orientation model finds that the orientation has changed, a snapshot algorithm is started. The specific snapshot algorithm is not limited; the one adopted in this embodiment is as follows. After the orientation changes, the snapshot is selected before the next orientation change. The current target is evaluated by a dedicated image definition (sharpness) selection model, and when the selection threshold is met, the current frame is captured and the snapshot is cached. Target tracking ends when the target leaves the picture or the tracking time exceeds a preset time. In addition, after an orientation change is detected, the orientation continues to be judged, and if it returns to the original orientation within N frames, the snapshot process for that orientation change is abandoned. It should be noted that, to improve the accuracy of portrait clustering, a unique track identity (ID) value is generated for each track of a target, and when a snapshot is cached the corresponding ID value is recorded. After the ID values corresponding to the snapshots are obtained, all snapshots of the targets and their corresponding ID values may be submitted to a database for storage.

It should also be noted that when a human body target first appears, a default snapshot is cached; a new snapshot cache is added only if the orientation changes, otherwise the snapshot is selected from the initial snapshot cache. In addition, each image acquisition device assigns its own tracking IDs, and the devices are independent of one another. When the snapshots of all image acquisition devices are pooled, ID values may coincide. To avoid this, in practice the prefix of the image acquisition device is added to each ID so that the IDs of different devices can be distinguished. Assuming the image acquisition devices are cameras, the ID values of snapshots from different cameras can be distinguished by unique prefix names, such as IPC1-ID1 and IPC2-ID1.
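The N-frame rule described above can be sketched as a small debounce over per-frame orientation labels. This is only an illustration under stated assumptions: the function name `confirmed_orientation_changes`, the string labels and the exact counting of frames are hypothetical and are not specified by the application.

```python
from typing import List, Tuple


def confirmed_orientation_changes(labels: List[str], n_frames: int) -> List[Tuple[int, str]]:
    """Return (frame_index, orientation) for every orientation change that persists.

    labels[i] is the orientation predicted for frame i ("front", "left",
    "right" or "back"). A tentative change is abandoned if the label returns
    to the previous stable orientation before n_frames frames have elapsed.
    """
    changes: List[Tuple[int, str]] = []
    if not labels:
        return changes
    stable = labels[0]
    pending_start = None  # frame at which a tentative change began
    for i, label in enumerate(labels[1:], start=1):
        if pending_start is None:
            if label == stable:
                continue
            pending_start = i
        elif label == stable:
            pending_start = None  # reverted in time: abandon this change
            continue
        if i - pending_start + 1 >= n_frames:
            stable = label  # the new orientation held long enough
            changes.append((i, label))
            pending_start = None
    return changes
```

In this sketch a brief flicker to "left" that returns to "front" is ignored, while a change that holds for `n_frames` frames is reported at the frame where it is confirmed.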

S11: performing feature extraction on each snapshot, wherein the feature extraction at least comprises face feature extraction and human body feature extraction.

The snapshots are obtained in the above step. To realize portrait clustering, feature extraction is first performed on each snapshot. In implementation, a specially trained deep learning model can be used to extract the face and human body feature vectors; the feature extraction algorithm is not limited, as long as it can extract features from the snapshots, although a high-accuracy algorithm is preferred. Feature vectors are extracted from all snapshots of each target. Since portrait clustering is usually implemented through the association between the face and the human body, the feature extraction at least includes face feature extraction and human body feature extraction. In addition, to improve the accuracy of clustering, the extracted features may also include attributes such as gender and jacket color. Likewise, after feature extraction is complete, the extracted feature data may be saved in a database, with the ID as the key, so that the snapshot, feature vectors and attribute values of the same target are stored in association.
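The ID-keyed storage described above might look like the following in-memory sketch. The names `SnapshotRecord` and `FeatureStore` and all field names are assumptions made for illustration; the application does not specify a schema or a database.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SnapshotRecord:
    snapshot_path: str
    track_id: str                               # e.g. "IPC1-ID1"
    body_feature: List[float]
    face_feature: Optional[List[float]] = None  # None when no face was extracted
    gender: Optional[str] = None
    jacket_color: Optional[str] = None


class FeatureStore:
    """Keeps the snapshots, feature vectors and attributes of a target together, keyed by ID."""

    def __init__(self) -> None:
        self._by_id: Dict[str, List[SnapshotRecord]] = defaultdict(list)

    def save(self, record: SnapshotRecord) -> None:
        self._by_id[record.track_id].append(record)

    def by_id(self, track_id: str) -> List[SnapshotRecord]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._by_id.get(track_id, []))
```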

S12: clustering the extracted human face features and clustering the human body features.

First, the data set is divided into two sets: a first image set containing face features and a second image set without face features. The face features and human body features are then clustered separately to generate a face clustering set and a human body clustering set, each of which contains a number of target classes. The clustering algorithm is based on cosine distance and may be the K-means clustering algorithm, a Conditional Random Field (CRF) algorithm, mean shift clustering, or the like. The cosine distance may be replaced by other distances, such as the Euclidean distance, Hamming distance or Manhattan distance. The specific clustering algorithm and feature distance formula are not limited in the present application.
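Because the distance formula is left open, one way to keep it swappable is to treat the metric as a plain function. The sketch below gives pure-Python versions of the four distances named above; the function names and the `DISTANCES` lookup table are illustrative assumptions.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as maximally distant here.
    return 1.0 - dot / norm if norm else 1.0


def euclidean_distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a: Vector, b: Vector) -> float:
    return float(sum(abs(x - y) for x, y in zip(a, b)))


def hamming_distance(a: Vector, b: Vector) -> float:
    # Number of positions at which the two vectors differ.
    return float(sum(x != y for x, y in zip(a, b)))


DISTANCES: Dict[str, Callable[[Vector, Vector], float]] = {
    "cosine": cosine_distance,
    "euclidean": euclidean_distance,
    "manhattan": manhattan_distance,
    "hamming": hamming_distance,
}
```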

S13: respectively generating a face clustering set and a human body clustering set under the condition that the confidence of the clustering features is greater than or equal to a first threshold; wherein each element in the set is a corresponding snapshot.

The extracted face features are clustered to generate a face clustering set A, and the human body features are clustered to generate a human body clustering set B. The face clustering set A comprises a plurality of subsets (subclasses) Ai, i = 1, 2, 3, ..., and the human body clustering set B comprises a plurality of subsets (subclasses) Bi, i = 1, 2, 3, .... Different subsets in A, and different subsets in B, contain snapshots of different people. For the first clustering, a relatively high value is generally selected for the confidence of the clustering features, for example a first threshold of 95%; the specific value of the first threshold is determined according to actual conditions and is not limited herein. For example, suppose the confidence used in the first clustering is n1, and that after clustering the face clustering set A includes two face subsets A1 and A2 and the human body clustering set B includes two human body subsets B1 and B2, where each subset contains different snapshots of the same person. Suppose further that A1 contains two face pictures, denoted A11 and A12; A2 contains three face pictures, denoted A21, A22 and A23; B1 contains two human body pictures, denoted B11 and B12; and B2 contains three human body pictures, denoted B21, B22 and B23.
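As a rough illustration of clustering under a high first threshold, the sketch below uses a simple greedy threshold grouping on cosine similarity. This is a stand-in chosen for brevity, not K-means or any algorithm prescribed by the application; treating singletons as "still to be clustered" and the names `first_pass` and `first_threshold` are likewise assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def first_pass(features: Dict[str, Vector],
               first_threshold: float = 0.95) -> Tuple[List[List[str]], List[str]]:
    """Group snapshots whose similarity to a group's first member meets the threshold.

    Returns (clusters, pending): groups with at least two members, and the
    names of snapshots that matched nothing and remain to be clustered.
    """
    groups: List[List[str]] = []
    for name, feat in features.items():
        for group in groups:
            if cosine_similarity(feat, features[group[0]]) >= first_threshold:
                group.append(name)
                break
        else:
            groups.append([name])  # no existing group was similar enough
    clusters = [g for g in groups if len(g) > 1]
    pending = [g[0] for g in groups if len(g) == 1]
    return clusters, pending
```

With a strict threshold, a side-view snapshot of a person already present in a cluster can easily end up in `pending`, which is the recall problem the later steps address.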

S14: when the preset requirement is met, clustering the snapshots to be clustered, other than the already clustered snapshots, to determine a new face clustering set and a new human body clustering set.

In the above steps, the face features and human body features are clustered to generate a face clustering set and a human body clustering set respectively. The feature comparison algorithm has a limited precision range, and when the orientation changes too much the recall rate may drop. For example, frontal face/body features may fail to match several side-view face/body features of the same person during clustering. Therefore, to improve the recall rate of portrait clustering, the snapshots that remain unclustered may be clustered again: when the preset requirement is met, the snapshots to be clustered, other than the already clustered snapshots, are clustered to determine a new face clustering set and a new human body clustering set. In the present application, the preset requirement is preferably that a snapshot to be clustered has the same ID value as an already clustered element.

The above steps perform the first clustering; this step can be regarded as the second clustering. It should be noted that in the second clustering the confidence threshold for the clustering features is lowered relative to the first clustering, for example to 80%; the setting of the confidence threshold is not limited herein.
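The second clustering can be pictured as follows: a leftover snapshot is compared only with clustered members that share its track ID, and it is accepted at the lower threshold. The data layout (tuples of name, track ID, feature) and the name `second_pass` are hypothetical. Note one simplification: only the lower bound is checked, since a similarity at or above the first threshold would normally have been absorbed in the first clustering.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Item = Tuple[str, str, Vector]  # (snapshot name, track ID, feature vector)


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def second_pass(pending: List[Item], clusters: Dict[str, List[Item]],
                second_threshold: float = 0.80) -> Dict[str, str]:
    """Map each accepted leftover snapshot to the subset it joins."""
    assignments: Dict[str, str] = {}
    for name, track_id, feat in pending:
        best_subset, best_sim = None, second_threshold
        for subset, members in clusters.items():
            for _, member_id, member_feat in members:
                if member_id != track_id:
                    continue  # preset requirement: same track ID only
                sim = cosine_similarity(feat, member_feat)
                if sim >= best_sim:
                    best_subset, best_sim = subset, sim
        if best_subset is not None:
            assignments[name] = best_subset
    return assignments
```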

S15: merging the new face cluster set into the face cluster set and merging the new body cluster set into the body cluster set.

In step S13, a face clustering set and a human body clustering set are obtained through the first clustering; in step S14, a new face clustering set and a new human body clustering set are obtained through the second clustering. Each new clustering set obtained through the second clustering is merged with the snapshot subset having the same ID in the corresponding first clustering result, yielding a merged face clustering set and a merged human body clustering set.

S16: performing fusion clustering on the merged face clustering set and the merged human body clustering set, the result serving as reference data for face recognition.
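Merging can be as simple as a per-subset union. The sketch below assumes that both the first and second clustering results are dictionaries from a subset name (such as "A1") to a list of snapshot names; this layout and the name `merge_cluster_sets` are illustrative only.

```python
from typing import Dict, List


def merge_cluster_sets(existing: Dict[str, List[str]],
                       new: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a merged copy; the inputs are left unchanged."""
    merged = {subset: list(members) for subset, members in existing.items()}
    for subset, members in new.items():
        bucket = merged.setdefault(subset, [])
        for member in members:
            if member not in bucket:  # avoid duplicating a snapshot
                bucket.append(member)
    return merged
```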

Portrait clustering is realized by associating faces with human bodies. A face snapshot may contain both face features and human body features, whereas a human body snapshot contains only human body features. Therefore, the targets containing human body features are extracted from the face cluster Ai, the human body features of each target are extracted from the human body cluster Bi, and the human body features in Ai are compared with those of Bi; a comparison that meets the threshold r is considered a successful match, and r can be set flexibly. The B subclass that has the largest number of successful matches with Ai and whose coincidence rate reaches a threshold q is identified and denoted Bj; q can likewise be set flexibly. Finally, Ai and Bj are merged and denoted ABi. In this way the face features and human body features of the same person are fused, and portrait clustering is achieved.

In the method for portrait clustering, each snapshot acquired by the image acquisition device is first acquired; features are then extracted from the snapshots, and the extracted face features and human body features are clustered separately; a face clustering set and a human body clustering set are generated when the confidence of the clustering features is greater than or equal to a first threshold; when the preset requirement is met, the snapshots to be clustered, other than the already clustered snapshots, are clustered to obtain a new face clustering set and a new human body clustering set, which are merged into the face clustering set and the human body clustering set respectively; and fusion clustering is performed on the merged face clustering set and the merged human body clustering set, the result serving as reference data for face recognition. In this way, after the first clustering is completed, snapshots whose sets have not been determined are clustered again when the preset requirement is met, which improves the clustering recall rate and thereby the accuracy of portrait clustering.
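The matching and counting just described can be sketched as below. Several details are assumptions made for illustration: each face cluster is represented only by the body feature vectors of its members that have them, a member counts as matched to Bj if it meets r against any member of Bj, and the "coincidence rate" is taken to be the fraction of such members of Ai that matched. The application does not define these precisely.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fuse_face_and_body(face_clusters: Dict[str, List[Vector]],
                       body_clusters: Dict[str, List[Vector]],
                       r: float, q: float) -> Dict[str, str]:
    """Map each face cluster Ai to the body cluster Bj it should be fused with."""
    fused: Dict[str, str] = {}
    for a_name, a_feats in face_clusters.items():
        best_name, best_hits = None, 0
        for b_name, b_feats in body_clusters.items():
            # Count members of Ai that match at least one member of this Bj.
            hits = sum(
                1 for fa in a_feats
                if any(cosine_similarity(fa, fb) >= r for fb in b_feats)
            )
            if hits > best_hits:
                best_name, best_hits = b_name, hits
        if best_name is not None and best_hits / len(a_feats) >= q:
            fused[a_name] = best_name  # Ai and Bj together form ABi
    return fused
```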

Because a person's orientation changes frequently during movement, the orientations captured by the image acquisition device differ, and a snapshot captured while the orientation is changing may be blurred. Blurred snapshots reduce the number of effective snapshots available for portrait clustering and lower its accuracy, so suitable snapshots need to be selected. In implementation, as a preferred embodiment, acquiring each snapshot acquired by the image acquisition device includes:

receiving a live image acquired by an image acquisition device;

detecting and tracking each human body target by using an algorithm, and caching a target snapshot picture;

under the condition that the human body orientation change of each target in the live image is judged, capturing and caching a plurality of to-be-determined capturing images corresponding to the human body orientation change of each target from the live image;

acquiring the definition of each snapshot to be determined;

and selecting the to-be-determined snapshot with the highest definition as the snapshot corresponding to each human body orientation of each target.

After the orientation changes, the snapshot is selected before the next orientation change. During this interval, a plurality of to-be-determined snapshots corresponding to the orientation change of each target are captured and cached, and the to-be-determined snapshot with the highest definition is then selected as the snapshot for that orientation of the target. For example, if 5 images are captured and cached between one orientation change and the next, and the 2nd image has the highest definition, the 2nd image is selected as the snapshot for that interval. When the orientation is judged to have changed again, the same method is used to obtain the snapshot for the next interval, so that snapshots of the person in multiple orientations during movement are obtained. It should be noted that the number of to-be-determined snapshots is not limited and may be chosen according to actual conditions. When a human body target first appears, it is detected and tracked by an algorithm and a target snapshot is cached; a new snapshot cache is added later if the orientation changes, otherwise the snapshot is selected from the initial snapshot cache.

Selecting the to-be-determined snapshot with the highest definition as the snapshot for each orientation of each target avoids the drop in portrait clustering recall caused by unclear snapshots.
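Choosing the sharpest of the cached candidates can be sketched as follows. The application uses a dedicated definition model and does not name a metric; the variance of a 4-neighbour Laplacian used here is a common sharpness heuristic substituted purely for illustration, and images are assumed to be grayscale 2D lists.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]  # grayscale pixel rows


def laplacian_variance(gray: Image) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels (higher = sharper)."""
    if not gray:
        return 0.0
    height, width = len(gray), len(gray[0])
    responses: List[float] = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((v - mean) ** 2 for v in responses) / len(responses)


def pick_sharpest(candidates: Sequence[Image]) -> int:
    """Return the index of the candidate with the highest sharpness score."""
    return max(range(len(candidates)), key=lambda i: laplacian_variance(candidates[i]))
```

Applied to the example above, the five cached candidates for one orientation interval would be passed to `pick_sharpest`, and the returned index identifies the snapshot that is kept.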

In order to cluster the snapshots to be clustered, other than the already clustered snapshots, the snapshots to be clustered may be clustered according to their ID values. In implementation, the unique track ID value generated for the same target in each snapshot is obtained, and a mark of the corresponding image acquisition device is added to the ID value of each snapshot of that device, so that the ID values of snapshots from different image acquisition devices can be distinguished;

the preset requirements are as follows:

and the ID value corresponding to each snapshot to be clustered is the same as the ID value corresponding to the elements in the face clustering set and the human body clustering set.

In this embodiment, because the identical target in each snapshot corresponds to a unique track ID value, the snapshot to be clustered may be clustered with elements having the same ID value as elements in the face clustering set and the human body clustering set. It should be noted that, in implementation, in order to distinguish the IDs of different image capturing devices, a prefix of the image capturing device may be added to the naming of the ID. Assuming that the adopted image acquisition devices are cameras, the ID values of the snap shots of different cameras can be distinguished by non-repeated prefix names, such as IPC1-ID1 and IPC2-ID 1. And inquiring whether the ID values of the sub-class members Ai in the face cluster and the sub-class members Bi in the human body cluster have other snapshot pictures with the same ID value. Then, for each subclass member Ai and Bi, inquiring all snapshot pictures with the same ID value according to the ID value, checking all snapshot pictures with the same ID value, and screening out the snapshot pictures which are not aggregated into A and B. The method for finding the same tracking ID is that each subset has several ID values counted. As exemplified in the above embodiments, A11 and A12 in A1 are the same ID and are denoted by IPC1-ID1, A21, A22 and A23 in A2 have two tracking IDs respectively, wherein the IDs of A21 and A22 are the same and are denoted by IPC2-ID1 and the ID of A23 is denoted by IPC3-ID 1; the same applies to B1 and B2. For each ID value, such as IPC1-ID1, pictures of the same ID in the unclustered snapshot are left. The captured images acquired by different cameras can be processed separately by adding prefixes of the cameras in front of the ID values, so that the situation that the ID values are overlapped when the captured images of all the cameras are put together is prevented. 
In addition, the naming method of the prefix of the image capturing device and the sequence of the prefix and the ID value of the image capturing device are not limited as long as the snap shots captured by different image capturing devices can be distinguished.
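The ID lookup described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function names `track_key` and `unclustered_by_id`, the dictionary layout, and the snapshot names C1, C2 and X1 are assumptions introduced for the example, while the prefix format follows the IPC1-ID1 pattern used in the text.

```python
from collections import defaultdict


def track_key(camera: str, track_id: int) -> str:
    """Build a camera-prefixed ID such as 'IPC1-ID1' (illustrative naming scheme)."""
    return f"{camera}-ID{track_id}"


def unclustered_by_id(clusters, all_snapshots):
    """Group not-yet-clustered snapshots by the ID values that occur in the clusters.

    clusters: dict mapping subclass name -> list of snapshot names
    all_snapshots: dict mapping snapshot name -> camera-prefixed ID value
    """
    clustered = {name for members in clusters.values() for name in members}
    cluster_ids = {all_snapshots[name] for name in clustered}
    leftovers = defaultdict(list)
    for name, key in all_snapshots.items():
        # Keep only snapshots that are unclustered but share an ID with a cluster member.
        if name not in clustered and key in cluster_ids:
            leftovers[key].append(name)
    return dict(leftovers)


snapshots = {
    "A11": track_key("IPC1", 1), "A12": track_key("IPC1", 1),
    "A21": track_key("IPC2", 1), "A22": track_key("IPC2", 1),
    "A23": track_key("IPC3", 1),
    "C1": track_key("IPC1", 1), "C2": track_key("IPC1", 1),  # hypothetical leftovers
    "X1": track_key("IPC4", 1),                              # ID not present in any cluster
}
face_clusters = {"A1": ["A11", "A12"], "A2": ["A21", "A22", "A23"]}
leftovers = unclustered_by_id(face_clusters, snapshots)
print(leftovers)
```

Because the camera prefix is part of the key, track 1 of IPC1 and track 1 of IPC2 never collide even when all snapshots are held in one dictionary.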

Since each snapshot corresponds to an ID value, clustering the snapshots to be clustered whose ID values are the same as those of elements in the face clustering set and the human body clustering set increases the recall rate of portrait clustering after the first clustering.

In the above embodiment, the snapshots to be clustered are clustered when the preset requirement is met, that is, when the ID value corresponding to each snapshot to be clustered is the same as the ID value corresponding to elements in the face clustering set and the human body clustering set. In implementation, as a preferred embodiment, when the preset requirement is met, clustering the snapshots to be clustered other than the already clustered snapshots to determine a new face clustering set and a new human body clustering set includes:

dividing the snap shots to be clustered into a human face candidate anchor point set and a human body candidate anchor point set according to the ID values corresponding to the snap shots to be clustered and the characteristic extraction results; the ID values of elements in the same anchor point set are the same, and the ID values of the elements are the same as the ID values of one or more members in the corresponding clustered human face clustering set or human body clustering set;

and determining the set corresponding to each snapshot to be clustered when the confidence of the clustering features is greater than or equal to a second threshold and smaller than the first threshold.

Continuing the example of the above embodiment, for each ID value, such as IPC1-ID1, the remaining pictures with the same ID are looked up among the unclustered snapshots. Assume that 5 pictures are found, 2 with face features and 3 with human body features only; they are marked IPC1-ID1-Ci and IPC1-ID1-Di respectively, and all have the ID value IPC1-ID1.

Feature vector distance comparison is performed between each element in the face candidate anchor point set and the elements with the same ID value in the face clustering set, and between each element in the human body candidate anchor point set and the elements with the same ID value in the human body clustering set. That is, IPC1-ID1-C1 and IPC1-ID1-C2 are each feature-clustered with A11 and A12, with a confidence of n2. For the elements IPC1-ID1-Di, a subset with the same ID, IPC1-ID1, is first searched for in the B set. If such a subset exists, an operation similar to that described above for IPC1-ID1-Ci is performed. If it is not found, IPC1-ID1-Di is set aside and returned to the set of snapshots to be clustered. In implementation, the confidence of the clustering features in the second clustering is greater than or equal to the second threshold and smaller than the first threshold, that is, the confidence value is lowered relative to the first clustering. The specific value of the confidence threshold is not limited; in the above embodiment the confidence of the first clustering is chosen as 95%, and the confidence of the second clustering may be chosen as 80%.

Determining the new face clustering set and the new human body clustering set for the snapshots to be clustered from the ID values and the feature extraction results realizes a second clustering of these snapshots and further improves the recall rate of portrait clustering.
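The second clustering pass can be illustrated with the sketch below. It assumes that the clustering confidence is a cosine similarity between feature vectors, which the text does not specify; the function names and the two-dimensional feature vectors are hypothetical, and only the 95% and 80% values come from the example above.

```python
import math

FIRST_THRESHOLD = 0.95   # first-pass confidence from the example in the text
SECOND_THRESHOLD = 0.80  # relaxed second-pass confidence from the example


def cosine(u, v):
    """Cosine similarity, used here as an assumed stand-in for the confidence."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def second_pass(candidates, members, threshold=SECOND_THRESHOLD):
    """Compare each candidate only with cluster members that share its ID value.

    candidates, members: dict mapping name -> (id_value, feature_vector)
    Returns (accepted, pool): names that join the cluster, and names returned
    to the set of snapshots to be clustered.
    """
    accepted, pool = [], []
    for name, (key, feat) in candidates.items():
        same_id = [f for k, f in members.values() if k == key]
        if same_id and max(cosine(feat, f) for f in same_id) >= threshold:
            accepted.append(name)
        else:
            pool.append(name)  # no subset with this ID, or confidence too low
    return accepted, pool


members = {"A11": ("IPC1-ID1", [1.0, 0.0]), "A12": ("IPC1-ID1", [0.9, 0.1])}
candidates = {
    "C1": ("IPC1-ID1", [0.8, 0.5]),
    "C2": ("IPC1-ID1", [0.0, 1.0]),
    "D1": ("IPC9-ID1", [1.0, 0.0]),  # hypothetical ID with no matching subset
}
accepted, pool = second_pass(candidates, members)
print(accepted, pool)
```

Candidates that would have reached the first threshold are already clustered in the first pass, so in practice the accepted confidences fall between the second and the first threshold.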

On the basis of the above embodiment, although the set corresponding to a snapshot to be clustered is found through feature vector distance comparison, the face features or human body features of two persons of different genders may be similar, as may those of two persons wearing different clothes. Therefore, in implementation, in order to improve the accuracy of portrait clustering, the feature extraction includes, in addition to extracting the face features and the human body features: extracting the gender and the clothes color attribute of each target;

after feature vector distance comparison is performed on each element in the face candidate anchor point set and an element with the same ID value in the face clustering set, and feature vector distance comparison is performed on each element in the human body candidate anchor point set and an element with the same ID value in the human body clustering set, the method further comprises the following steps:

and performing consistency determination on the gender and/or clothes color attributes between each element in the face candidate anchor point set and the elements with the same ID value in the face clustering set, and performing consistency determination on the gender and/or clothes color attributes between each element in the human body candidate anchor point set and the elements with the same ID value in the human body clustering set.

In implementation, attributes of the target other than gender and/or clothes color may also be extracted, which is not limited here; this embodiment takes the gender and/or clothes color attributes as an example. The face features and human body features may be compared first and the consistency of the gender and/or clothes color attributes determined afterwards, or the consistency determination may be performed first and the feature comparison afterwards; the order is not limited here. The face feature comparison, the human body feature comparison and the consistency determination of the gender/clothes color attributes may together be called confidence screening. In the example of the above embodiment, IPC1-ID1-C1 and IPC1-ID1-C2 are each feature-clustered with A11 and A12, and are then each screened against A11 and A12 by gender and/or clothes color. When the consistency of the gender and/or clothes color attributes is determined between each element in the face candidate anchor point set and the elements with the same ID value in the face clustering set, or between each element in the human body candidate anchor point set and the elements with the same ID value in the human body clustering set, the confidence selected is higher than that used for feature comparison. The specific confidence value is not limited; it is usually chosen as 99%, that is, the two are determined to be the same person when the gender and the clothes color match completely.

In this method, the consistency of the gender and/or clothes color attributes is determined in addition to clustering by face features and human body features, which avoids inaccurate clustering caused by similar face features or human body features of two persons of different genders or of two persons wearing different clothes.
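The confidence screening described above can be sketched as a feature check combined with an exact attribute match. The attribute keys, the similarity values and the function names below are assumptions chosen for illustration; the exact match mirrors the statement that gender and clothes color must match completely.

```python
def attributes_consistent(a, b, keys=("gender", "clothes_color")):
    """True when every listed attribute matches exactly."""
    return all(a[k] == b[k] for k in keys)


def confidence_screen(candidate, members, similarity, threshold=0.80):
    """Return the members that match the candidate on features and attributes.

    candidate: attribute dict of the candidate anchor
    members: dict mapping member name -> attribute dict
    similarity: dict mapping member name -> precomputed feature similarity
                between that member and the candidate (assumed given)
    """
    return [
        name for name, attrs in members.items()
        if similarity[name] >= threshold and attributes_consistent(candidate, attrs)
    ]


members = {
    "A11": {"gender": "female", "clothes_color": "red"},
    "A12": {"gender": "female", "clothes_color": "blue"},
}
c1 = {"gender": "female", "clothes_color": "red"}
matches = confidence_screen(c1, members, {"A11": 0.86, "A12": 0.91})
print(matches)
```

Here A12 is rejected despite its higher feature similarity because the clothes color differs. Since both checks must pass, the order in which they are applied does not change the result, which is consistent with the order not being limited in the text.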

On the basis of the above embodiment, it may happen that the ID value of the same person changes during walking, for example, the ID value of one person is replaced by the ID value of another person. In order to avoid this, in an implementation, as a preferred embodiment, when a preset requirement is met, clustering the snap shots to be clustered except the snap shots already clustered to obtain a new face clustering set and a new human body clustering set includes:

after clustering the elements in the face candidate anchor point set and the human body candidate anchor point set, if snapshots to be clustered still remain, clustering the snapshots to be clustered with each element in the face anchor point set and each element in the human body anchor point set;

and determining a set corresponding to each snapshot to be clustered so as to acquire a new face clustering set and a new human body clustering set when the confidence of the clustering features is greater than or equal to a third threshold and smaller than a first threshold.

After the face feature/human body feature comparison and the consistency determination of the gender and/or clothes color attributes in the above embodiment, if snapshots to be clustered still remain, they are clustered with the elements in the face anchor point set and the elements in the human body anchor point set. For the screened anchor point set, each member can serve as a new clustering core, and clustering is performed again around each new core member. The clustering results of members with the same ID can be merged, and finally the new clustering results are merged into the subclasses corresponding to Ai and Bi. In the above embodiment, after IPC1-ID1-C1 and IPC1-ID1-C2 have each been subjected to feature clustering and to gender and clothes color screening against A11 and A12, any element that fails to reach the corresponding confidence value in feature comparison, gender or clothes color is screened out; for example, IPC1-ID1-C1 is screened out. IPC1-ID1-C2 then serves as the new anchor point set for a third clustering. For example, IPC1-ID1-C2 and IPC1-ID1-Di each newly gather 3 snapshots, and IPC1-ID1-Ci and IPC1-ID1-Di are then merged with A1 and B1 respectively. This completes the third clustering for one ID value of one subset of A. Similar operations are performed in a loop for the other ID values, such as IPC2-ID1 and IPC3-ID1. It should be noted that the confidence of the clustering features in the third clustering is greater than or equal to the third threshold and smaller than the first threshold, that is, it is lower than the confidence of the clustering features in the first clustering; the specific value of the confidence threshold is not limited here.
The snapshots are thus clustered three times. In implementation, the number of clustering passes is not limited, and the additional passes improve the recall rate of portrait clustering. Furthermore, effective data can be provided when portrait clustering is needed for solving cases, population census and the like.

After the face feature clustering, the human body feature clustering and the consistency determination of the gender/clothes color attributes, if elements to be clustered still remain, they are clustered a third time, which improves the recall rate of portrait clustering.
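The third pass can be sketched as below, under the assumption that each remaining snapshot joins the surviving anchor it resembles most, provided the similarity reaches the third threshold. The text does not give a value for the third threshold or a rule for choosing among several anchors, so the value 0.8, the best-anchor rule, the score table and the snapshot names E1 and E2 are all hypothetical.

```python
def third_pass(anchors, remaining, similarity, third_threshold):
    """Cluster remaining snapshots around anchors that survived screening.

    anchors: names of the new clustering cores
    remaining: names of snapshots still to be clustered
    similarity: callable (anchor_name, snapshot_name) -> float
    Returns (gathered, leftover): anchor -> newly gathered snapshots, and the
    snapshots that matched no anchor.
    """
    gathered = {a: [] for a in anchors}
    leftover = []
    for name in remaining:
        best = max(anchors, key=lambda a: similarity(a, name), default=None)
        if best is not None and similarity(best, name) >= third_threshold:
            gathered[best].append(name)
        else:
            leftover.append(name)
    return gathered, leftover


# Hypothetical precomputed similarities between the anchor C2 and two leftovers.
scores = {("C2", "E1"): 0.9, ("C2", "E2"): 0.5}


def lookup(anchor, name):
    return scores.get((anchor, name), 0.0)


gathered, leftover = third_pass(["C2"], ["E1", "E2"], lookup, third_threshold=0.8)
print(gathered, leftover)
```

The snapshots gathered around each anchor would then be merged into the subclass (Ai or Bi) that shares the anchor's ID value.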

In the above embodiment, the face clustering result and the human body clustering result are obtained, and fusion clustering is then performed according to the association between face and human body. In implementation, different persons with the same clothing and the same stature may be successfully matched, so verification needs to be performed again to improve the accuracy of portrait clustering. As a preferred embodiment, performing fusion clustering on the combined face clustering set and the combined human body clustering set includes:

extracting human body features corresponding to all elements including human body features from the combined human face clustering set and extracting human body features corresponding to all elements from the combined human body clustering set;

matching the human body features corresponding to each element containing the human body features in the human face clustering set with the human body features corresponding to each element in the human body clustering set;

confirming that the matching is successful under the condition that the confidence coefficient of the cluster features is larger than a fourth threshold;

for the elements containing human body features in the face clustering set, respectively determining the elements in the human body clustering set that have the largest number of successful matches with them and whose number of successful matches reaches a threshold;

and performing fusion clustering on each element containing human body features in the face clustering set with the corresponding element in the human body clustering set that has the largest number of successful matches and whose number of successful matches reaches the threshold.

During fusion clustering of the combined face clustering set and the combined human body clustering set, the confidence of the clustering features must be greater than a fourth threshold. The specific value of the fourth threshold is not limited. Fig. 2 is a main flow chart of primary clustering. As can be seen from Fig. 2, it mainly includes S17: acquiring a face clustering result; S18: acquiring a human body clustering result; S19: acquiring a fusion clustering result. In implementation, the order of S17 and S18 is not limited. Fig. 3 is a specific flowchart for obtaining the fusion clustering result. As shown in Fig. 3, the flow includes the following steps.

S191: acquiring a new face clustering result Ai and a new human body clustering result Bi;

S192: extracting the objects containing human body features from Ai, and extracting the human body features of each object from Bi;

S193: comparing the human body features in Ai with the human body features in Bi, where a match is considered successful when a threshold r is met, and r can be set flexibly;

S194: finding the B subclass that has the most successful matches with Ai and whose coincidence rate reaches a threshold, and recording it as Bj;

S195: combining Ai and Bj, and marking the result as ABi.
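Steps S192 to S194 can be sketched as a vote over the B subclasses. The sketch rests on several assumptions not stated in the text: features are reduced to scalars for brevity, similarity is `1 - |a - b|`, and the coincidence rate is taken to be the fraction of pairwise comparisons that succeed. The function names and sample values are hypothetical.

```python
def sim(a, b):
    """Toy similarity between scalar stand-ins for human body features."""
    return 1.0 - abs(a - b)


def pick_body_subclass(ai_body_feats, b_subclasses, similarity, r, min_rate):
    """Return the B subclass with the most successful matches, or None.

    ai_body_feats: human body features of the elements of Ai that have them
    b_subclasses: dict mapping subclass name -> list of human body features
    r: threshold above which one comparison counts as a successful match
    min_rate: required coincidence rate (assumed: hits / comparisons)
    """
    stats = {}
    for name, feats in b_subclasses.items():
        if not ai_body_feats or not feats:
            continue
        hits = sum(1 for a in ai_body_feats for b in feats if similarity(a, b) > r)
        stats[name] = (hits, hits / (len(ai_body_feats) * len(feats)))
    if not stats:
        return None
    best = max(stats, key=lambda n: stats[n][0])  # most successful matches (S194)
    hits, rate = stats[best]
    return best if hits > 0 and rate >= min_rate else None


b = {"B1": [0.51, 0.49], "B2": [0.90, 0.10]}
chosen = pick_body_subclass([0.50, 0.52], b, sim, r=0.95, min_rate=0.5)
print(chosen)
```

Requiring both the largest match count and a minimum rate means that a single chance match with one member of a large B subclass is not enough to trigger the merge of S195.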

This completes the current feature clustering operation. Since such a clustering operation may be performed once a day, the result of each clustering must be merged with the historical cluster set to update the database. Therefore, after S195, the flow further includes S196: clustering ABi with the historical class results. The merging method is as follows: the feature mean distance between ABi and each clustered class in the database is calculated, and ABi is merged into the closest class to complete the clustering. Here, the feature mean is the mean of all relevant features in each class.
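The merge of S196 can be sketched as a nearest-centroid assignment. Euclidean distance between feature means is an assumption, since the text only speaks of a feature mean distance, and the names `feature_mean`, `merge_into_history`, P1 and P2 are hypothetical.

```python
import math


def feature_mean(features):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def merge_into_history(new_cluster, history):
    """Merge new_cluster into the historical class with the closest feature mean.

    new_cluster: list of feature vectors (the fused class ABi)
    history: dict mapping class name -> list of feature vectors; updated in place
    Returns the name of the class that received the new features.
    """
    centre = feature_mean(new_cluster)
    nearest = min(
        history,
        key=lambda name: math.dist(centre, feature_mean(history[name])),
    )
    history[nearest].extend(new_cluster)
    return nearest


history = {"P1": [[0.0, 0.0], [0.2, 0.0]], "P2": [[1.0, 1.0]]}
nearest = merge_into_history([[0.1, 0.1], [0.3, 0.1]], history)
print(nearest, len(history[nearest]))
```

The sketch always merges into the closest class, as the text describes; how a person with no existing historical class would be handled is not covered by the text and is left out here.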

In the method for fusion clustering of the new face clustering set and the new human body clustering set provided by this embodiment, each element containing human body features in the face clustering set is fused with the element in the human body clustering set that has the largest number of successful matches and whose number of successful matches reaches the threshold. This effectively avoids the situation in which people with the same clothing but different statures are successfully matched, and improves the accuracy of portrait clustering.

In the above embodiments, the method for clustering the portrait is described in detail, and the present application also provides embodiments corresponding to the device for clustering the portrait. It should be noted that the present application describes the embodiments of the apparatus portion from two perspectives, one from the perspective of the function module and the other from the perspective of the hardware.

Fig. 4 is a structural diagram of a portrait clustering device according to an embodiment of the present application. This embodiment is described from the perspective of functional modules, and the device includes:

the acquisition module 10 is used for acquiring various snapshot images acquired by the image acquisition equipment;

the extraction module 11 is configured to perform feature extraction on each snapshot, where the feature extraction at least includes face feature extraction and human body feature extraction;

the first clustering module 12 is used for clustering the extracted human face features and clustering the human body features;

the generating module 13 is configured to generate a face clustering set and a human body clustering set respectively when the confidence of the clustering features is greater than or equal to a first threshold; wherein each element in the set is a corresponding snapshot;

the second clustering module 14 is configured to cluster the snap shots to be clustered except the clustered snap shots so as to obtain a new face clustering set and a new human body clustering set when a preset requirement is met;

a merging module 15, configured to merge the new face cluster set into the face cluster set and merge the new human body cluster set into the human body cluster set;

and the fusion clustering module 16 is configured to perform fusion clustering on the new face clustering set and the new human body clustering set to serve as reference data for human image recognition.

Since the embodiments of the apparatus portion and the method portion correspond to each other, please refer to the description of the embodiments of the method portion for the embodiments of the apparatus portion, which is not repeated here.

The device for clustering the human images provided by the embodiment clusters the snap shots through the acquisition module, the extraction module, the first clustering module, the generation module, the second clustering module, the merging module and the fusion clustering module. Therefore, after the device completes the first clustering, the snapshot images of the undetermined corresponding set are clustered again when the preset requirements are met, the clustering recall rate is improved, and the accuracy of portrait clustering is further improved.

Fig. 5 is a structural diagram of a portrait clustering device according to another embodiment of the present application. This embodiment is described from the perspective of hardware. As shown in Fig. 5, the device for portrait clustering includes:

a memory 20 for storing a computer program;

a processor 21 for implementing the steps of the method of portrait clustering as mentioned in the above embodiments when executing the computer program.

The device for clustering the human images provided by the embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, or a desktop computer.

The processor 21 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like. The processor 21 may be implemented in hardware using at least one of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 21 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 21 may be a Graphics Processing Unit (GPU) which is responsible for rendering and drawing the content required to be displayed by the display screen. In some embodiments, the processor 21 may further include an Artificial Intelligence (AI) processor for processing computational operations related to machine learning.

The memory 20 may include one or more computer-readable storage media, which may be non-transitory. Memory 20 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In this embodiment, the memory 20 is at least used for storing the following computer program 201, wherein after being loaded and executed by the processor 21, the computer program can implement the relevant steps of the method for clustering human images disclosed in any one of the foregoing embodiments. In addition, the resources stored in the memory 20 may also include an operating system 202, data 203, and the like, and the storage manner may be a transient storage manner or a permanent storage manner. Operating system 202 may include, among others, Windows, Unix, Linux, and the like. The data 203 may include, but is not limited to, data related to the above-mentioned methods of portrait clustering, and the like.

In some embodiments, the device for portrait clustering may further include a display 22, an input/output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.

Those skilled in the art will appreciate that the structure shown in Fig. 5 does not constitute a limitation on the device for portrait clustering, which may include more or fewer components than those shown.

The device for portrait clustering provided by the embodiment of the present application comprises a memory and a processor. When the processor executes the program stored in the memory, the above method of portrait clustering can be implemented, with the same effects as described above.

Finally, the application also provides a corresponding embodiment of the computer readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps as set forth in the above-mentioned method embodiments.

It is to be understood that if the method in the above embodiments is implemented in the form of software functional units and sold or used as a stand-alone product, it can be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the present application, in whole or in part, may be embodied in the form of a software product, which is stored in a storage medium and performs all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

The computer-readable storage medium provided by the present application implements the above method of portrait clustering, and its effects are the same as those described above.

In order to enable those skilled in the art to better understand the technical solution of the present application, the present application is further described in detail below with reference to Fig. 6, which is a flowchart of multiple clustering provided in an embodiment of the present application. The method mainly comprises the following steps:

S20: associating an ID with the snapshot targets of multiple human body orientations;

S21: clustering the targets associated by track ID multiple times;

S22: fusion clustering.

Wherein, S20 specifically includes:

S201: forming a target track by a tracking algorithm;

S202: maintaining one ID value for the same track;

S203: judging whether the human body orientation changes;

S204: generating a snapshot when the human body orientation changes;

S205: the target is displayed;

S206: saving the n snapshot results of the same ID and the corresponding ID value;

S207: extracting the feature vector, the gender and the clothes color attribute of each snapshot.
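Steps S203 and S204 can be illustrated with a small sketch that emits one snapshot per orientation change along a track. The orientation labels, the per-frame input format and the decision to also snapshot the first frame are assumptions; how the orientation is estimated is not covered in this passage.

```python
def snapshots_on_orientation_change(track):
    """Select the frames of one track at which a snapshot would be generated.

    track: list of (frame_index, orientation) pairs for a single track ID,
           in temporal order
    Returns the frame indices of the first frame and of every frame whose
    orientation differs from that of the preceding frame.
    """
    shots, previous = [], None
    for frame, orientation in track:
        if orientation != previous:
            shots.append(frame)
            previous = orientation
    return shots


track = [(0, "front"), (1, "front"), (2, "side"), (3, "back"), (4, "back")]
shots = snapshots_on_orientation_change(track)
print(shots)
```

All frames selected this way would be saved under the same ID value (S206), which is what later allows a side or back view to be pulled into the cluster formed around the frontal view.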

S21 specifically includes:

S211: classifying the data set into a first image class and a second image class;

S212: face clustering and human body clustering;

S213: judging whether each member class contains all snapshots with the same ID value;

S214: acquiring the snapshots that have not entered the current class;

S215: confidence screening;

S216: selecting new anchor point snapshots;

S217: clustering again and merging the result into the current class.

The fusion clustering of step S22 has been described in detail above and is not repeated here.

During portrait clustering, changes in human body orientation easily reduce the recall rate. By associating IDs along the track and clustering multiple times using the associated targets with multiple human body orientations, the recall rate is effectively improved, so that effective data is provided for applications that require portrait clustering, such as solving cases or population census.

The method, the device and the medium for clustering the portrait provided by the application are described in detail above. The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, it is possible to make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.

It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

Claims

1. A method of portrait clustering, comprising:

acquiring each snapshot acquired by image acquisition equipment;

respectively generating a face clustering set and a human body clustering set under the condition that the confidence of the clustering features is greater than or equal to a first threshold; wherein each element in the set is the corresponding snapshot;

when the preset requirements are met, clustering the snap shots to be clustered except the snap shots already clustered so as to obtain a new face clustering set and a new human body clustering set;

performing fusion clustering on the combined face clustering set and the combined human body clustering set to serve as reference data for face recognition;

wherein the preset requirements are:

the ID value corresponding to each snapshot to be clustered is the same as the ID value corresponding to the elements in the face clustering set and the human body clustering set;

correspondingly, the clustering the snap shots to be clustered except the snap shots already clustered so as to obtain a new face clustering set and a new human body clustering set comprises: dividing the snap shots to be clustered into a face candidate anchor point set and a human body candidate anchor point set according to the ID values corresponding to the snap shots to be clustered and the characteristic extraction results; the ID values corresponding to elements in the same anchor point set are the same, and the ID values are the same as the ID values of one or more members in the face clustering set or the human body clustering set which are clustered correspondingly;

and determining a set corresponding to each snapshot to be clustered under the condition that the confidence of the cluster characteristics is greater than or equal to a second threshold and smaller than the first threshold so as to acquire a new face cluster set and a new human body cluster set.

2. The method for clustering human images according to claim 1, wherein the acquiring each snapshot captured by the image capturing device comprises:

receiving a live image acquired by the image acquisition device;

detecting and tracking each target by using an algorithm, and caching the snapshot image of each target;

under the condition that the human body orientation change of each target in the live-action image is judged, capturing and caching a plurality of to-be-determined capturing pictures corresponding to the human body orientation change of each target from the live-action image;

acquiring the definition of each snapshot to be determined;

3. The method of portrait clustering according to claim 1, wherein the feature extraction further comprises: extracting the gender and the clothes color attribute of each target;

after the feature vector distance comparison is performed between each element in the face candidate anchor set and the element in the face clustering set having the same ID value, and the feature vector distance comparison is performed between each element in the human body candidate anchor set and the element in the human body clustering set having the same ID value, the method further includes:

4. The method for clustering human images according to claim 1 or 3, wherein the clustering the snap shots to be clustered except the snap shots already clustered when a preset requirement is met so as to obtain a new face clustering set and a new human body clustering set comprises:

after clustering elements in the face candidate anchor point set and the human body candidate anchor point set, if the snapshot pictures to be clustered still exist, clustering the snapshot pictures to be clustered with each element in the face anchor point set and each element in the human body anchor point set;

and under the condition that the confidence of the cluster features is greater than or equal to a third threshold and smaller than the first threshold, determining a set corresponding to each snapshot to be clustered so as to acquire a new face clustering set and a new human body clustering set.

5. The method for clustering human images according to claim 1, wherein the fused clustering of the combined face cluster set and the combined human body cluster set comprises:

confirming that the matching is successful under the condition that the confidence coefficient of the clustering features is greater than a fourth threshold value;

6. An apparatus for clustering portraits, comprising:

the generating module is used for respectively generating a face clustering set and a human body clustering set under the condition that the confidence coefficient of the clustering features is greater than or equal to a first threshold value; wherein each element in the set is the corresponding snapshot;

the second clustering module is used for clustering the snap shots to be clustered except the clustered snap shots so as to acquire a new face clustering set and a new human body clustering set when the preset requirements are met;

the fusion clustering module is used for performing fusion clustering on the combined face clustering set and the combined human body clustering set to serve as reference data for face recognition;

wherein the preset requirements are:

7. An apparatus for clustering portraits, comprising:

a memory for storing a computer program;

a processor for implementing the steps of the method of portrait clustering according to any of the claims 1 to 5 when executing the computer program.

8. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the method for portrait clustering according to any one of claims 1 to 5.