CN114612985B - Portrait query method, system, equipment and storage medium - Google Patents
- Publication number
- CN114612985B CN114612985B CN202210264194.0A CN202210264194A CN114612985B CN 114612985 B CN114612985 B CN 114612985B CN 202210264194 A CN202210264194 A CN 202210264194A CN 114612985 B CN114612985 B CN 114612985B
- Authority
- CN
- China
- Prior art keywords
- portrait
- queried
- database
- face
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/732—Query formulation
- G06F16/7335—Graphical querying, e.g. query-by-region, query-by-sketch, query-by-trajectory, GUIs for designating a person/face/object as a query predicate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/735—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/738—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a portrait query method, system, equipment and storage medium. A new query framework is designed that further mines face information, adds constraints among the query results of different persons, seeks to reduce the influence of clothing changes, and improves portrait query accuracy. Specifically: 1) during face retrieval, high-confidence faces already retrieved from the database are used for repeated iterative retrieval, further exploiting face retrieval performance and yielding a new query set, which preserves accuracy while increasing the number of reliable samples retrieved; 2) adding constraints among the query results of different persons shrinks the database to be searched and further improves retrieval performance; 3) body appearance features are enhanced with KNN feature expansion to improve the search result, and a Top-λ similarity is designed to eliminate, as far as possible, the influence of different clothing on feature similarity.
Description
Technical Field
The present invention relates to the field of portrait query technologies, and in particular, to a portrait query method, system, device, and storage medium.
Background
Portrait query refers to retrieving a person from a database given only a single portrait picture of that person containing only face information, where the person's face may not even be visible in the database pictures. At present, person retrieval research in academia and industry focuses mainly on face recognition and pedestrian re-identification, and the two problems are generally treated separately. At a higher level, however, the two tasks are really the same thing — confirming the identity (ID) of a person — and differ only in the information used: one uses the face, the other uses body information. Restricting the available information to either the face or the body greatly limits the generality of an algorithm; portrait query, which must use face and body information simultaneously, is precisely the application with broader prospects.
In many cases only one portrait picture of a person can be obtained: for example, one may want to trace a person through city-wide surveillance from a single photograph or portrait; or the film industry may want to directly retrieve all shots of an actor in a movie or television series. In these scenarios the person's frontal face is sometimes not visible in the database to be searched (e.g., surveillance video or movie frames), and the same person's clothing, environment, and even apparent age may change. Searching by face or clothing alone is then clearly insufficient, and designing an effective portrait query system that exploits both face and body information to fit such practical scenarios is considerably more challenging.
The portrait query system consists mainly of two parts: face retrieval and body retrieval. Face retrieval must, while guaranteeing accuracy, find as many database images of the same person containing a face as possible from the portrait picture; body retrieval then performs a second search using features such as body clothing of those samples, recalling more pictures of the person, with or without a visible face. Existing methods generally focus on designing better convolutional neural networks (Convolutional Neural Network, CNN) to extract more robust face and body features, while neglecting the design of the overall retrieval flow. In practice, a good retrieval framework often achieves twice the result with half the effort, reaching better performance on the basis of the same features.
Based on the above description, the prior art mainly has the following technical problems:
1) In the face retrieval part, existing methods query the database directly with only the query sample (the portrait of the face to be queried). However, the face of the query sample and the faces in the database images follow different data distributions, which degrades performance. Face retrieval is the foundation of the whole portrait query; if the potential of face recognition is not fully exploited, the performance of the entire system suffers.
2) Existing methods search the whole database directly, with no constraints among the retrieval results of different query samples; the mutual exclusivity of person identities — the same person cannot belong to two identities at once — is not considered. The retrieval system therefore cannot use global information to optimize the result of a single query sample, and the retrieval results are accordingly limited.
3) In the body retrieval part, existing methods make no special provision for the appearance interference caused by a person changing clothes; they directly perform the secondary search using the body clothing of all results from the first-step face retrieval. A change of clothing, however, greatly affects appearance, and using clothing indiscriminately for body retrieval severely limits the performance of the secondary search.
Disclosure of Invention
The invention aims to provide a portrait query method, system, equipment and storage medium that improve retrieval efficiency and portrait query accuracy.
The invention aims at realizing the following technical scheme:
a portrait inquiry method, comprising:
face feature extraction is carried out on all portraits of persons to be queried in the video to be queried, and face feature and body feature extraction is carried out on the images in the database;
carrying out iterative retrieval on the portrait of the person currently to be queried in the video using the face features to obtain a face matching result; at each retrieval after the first, using the query set obtained by the previous retrieval to search the database for images whose average face-feature similarity to all images in that query set exceeds a threshold, merging the current retrieval result with the previous query set to form the current query set, and determining the face matching result of the current person's portrait from the query set obtained at the last retrieval; at the first retrieval, the query set contains only the portrait of the person currently to be queried, and the threshold decreases gradually with each retrieval;
reducing the database using the face matching results of the portraits of the other persons to be queried, together with the images of all other persons appearing in the same frame as images in the face matching result of the current person's portrait, to obtain a reduced database;
and respectively enhancing the body features of the images in the reduced database and in the face matching result, taking the face matching result as a query library, and retrieving in the reduced database using the similarity of the enhanced body features to obtain the retrieval result of the portrait of the person currently to be queried.
A portrait inquiry system, comprising:
the feature extraction unit is used for respectively extracting face features from all portraits of persons to be queried in the video to be queried, and respectively extracting face features and body features from the images in the database;
the face iterative retrieval and face matching result acquisition unit is used for iteratively retrieving the portrait of the person currently to be queried in the video using the face features to obtain a face matching result; at each retrieval after the first, the query set obtained by the previous retrieval is used to search the database for images whose average face-feature similarity to all images in that query set exceeds a threshold, the current retrieval result is merged with the previous query set to form the current query set, and the face matching result of the current person's portrait is determined from the query set obtained at the last retrieval; at the first retrieval, the query set contains only the portrait of the person currently to be queried, and the threshold decreases gradually with each retrieval;
the database reduction unit is used for reducing the database using the face matching results of the portraits of the other persons to be queried, together with the images of all other persons appearing in the same frame as images in the face matching result of the current person's portrait, to obtain a reduced database;
the body retrieval and joint retrieval result generation unit is used for respectively enhancing the body features of the images in the reduced database and in the face matching result, taking the face matching result as a query library, and retrieving in the reduced database using the similarity of the enhanced body features to obtain the retrieval result of the portrait of the person currently to be queried.
A processing apparatus, comprising: one or more processors; a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the aforementioned methods.
A readable storage medium storing a computer program which, when executed by a processor, implements the method described above.
According to the technical scheme provided by the invention, a new query framework is designed that further mines face information, adds constraints among the query results of different persons, relieves the influence of clothing changes, and improves portrait query accuracy. Specifically: 1) during face retrieval, high-confidence faces already retrieved from the database are used for repeated iterative retrieval, further exploiting face retrieval performance and yielding a new query set, which preserves accuracy while increasing the number of reliable samples retrieved; 2) constraints among the query results of different persons shrink the gallery to be searched and further improve retrieval performance; 3) enhancing the body appearance features largely eliminates the effect of different clothing on feature similarity.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a portrait inquiry method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating the calculation of body Top- λ similarity between two different database images and a query library according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a portrait inquiry system according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a processing apparatus according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
The terms that may be used herein will first be described as follows:
the terms "comprises," "comprising," "includes," "including," "has," "having" or other similar referents are to be construed to cover a non-exclusive inclusion. For example: including a particular feature (e.g., a starting material, component, ingredient, carrier, formulation, material, dimension, part, means, mechanism, apparatus, step, procedure, method, reaction condition, processing condition, parameter, algorithm, signal, data, product or article of manufacture, etc.), should be construed as including not only a particular feature but also other features known in the art that are not explicitly recited.
The following describes in detail a portrait query method, system, equipment and storage medium. Details not described in the embodiments of the present invention belong to the prior art known to those skilled in the art. Where specific conditions are not noted in the examples, they follow conditions conventional in the art or suggested by the manufacturer.
Example 1
In order to solve the three technical problems in the prior art, the embodiment of the invention provides a portrait query method that adopts a more effective query flow framework, fully mines the face information, and lets the results of different persons' portraits constrain one another for further adjustment. Meanwhile, during body retrieval, a special design addresses the clothing-change problem so that its influence on body retrieval is reduced as far as possible, yielding a better and more robust portrait query result. Fig. 1 shows a flowchart of the portrait query method provided in an embodiment of the present invention, which mainly includes the following steps:
step 1, face feature extraction is respectively carried out on all the portrait of the person to be queried in the video to be queried, and face feature and body feature extraction is respectively carried out on the images in the database.
In the embodiment of the invention, for a video to be searched, assume it contains M portraits of persons to be queried, denoted Q = {q_1, q_2, …, q_M}, and the database contains N images, denoted G = {g_1, g_2, …, g_N}, where q_i denotes the i-th portrait of a person to be queried and g_l denotes the l-th image in the database.
In the embodiment of the invention, a face detection network performs face detection on the M person portraits to be queried, and a face feature extraction network extracts the corresponding face features from all face detection results; since all person portraits are queried against the database, all database images are processed through the same networks to obtain face features. In addition, all database images undergo body detection through a body detection network, and a body feature extraction network extracts the corresponding body features from all body detection results. These face features and body features are stored for use by subsequent steps. All four networks are convolutional neural networks (Convolutional Neural Network, CNN).
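The step-1 pipeline above can be sketched structurally as follows. This is a minimal sketch, not the patent's implementation: the patent only specifies that the four networks are CNNs, so the detector and embedding functions here are hypothetical stand-ins, as are all names.

```python
import numpy as np

# Hypothetical stand-ins for the four CNNs named in the text; a real
# system would plug in trained face/body detection and embedding models.
def detect_faces(img):  return [img]       # face crops (dummy: whole image)
def face_embed(crop):   return np.ones(4)  # face feature vector (dummy)
def detect_bodies(img): return [img]       # body crops (dummy: whole image)
def body_embed(crop):   return np.ones(4)  # body feature vector (dummy)

def extract_all_features(queries, database):
    """Step 1: face features for every query portrait; face AND body
    features for every database image, stored for the later steps."""
    q_face  = [face_embed(c) for q in queries  for c in detect_faces(q)]
    db_face = [face_embed(c) for g in database for c in detect_faces(g)]
    db_body = [body_embed(c) for g in database for c in detect_bodies(g)]
    return q_face, db_face, db_body
```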
Step 2, iterative retrieval is carried out on the portrait of the person currently to be queried using the face features to obtain a face matching result. At each retrieval after the first, the query set obtained by the previous retrieval is used to search the database for images whose average face-feature similarity to all images in that query set exceeds a threshold; the current retrieval result is merged with the previous query set to form the current query set, and the face matching result of the current person's portrait is determined from the query set obtained at the last retrieval. At the first retrieval, the query set contains only the portrait of the person currently to be queried, and the threshold decreases gradually with each retrieval.
In the embodiment of the invention, the number of iterative retrievals is set to T, and the threshold decreases from one retrieval to the next: th_1 ≥ th_2 ≥ … ≥ th_T, where th_t denotes the threshold of the t-th retrieval, t = 1, 2, …, T.
Take the portrait q_i of the i-th person to be queried as the portrait currently to be queried. At the first retrieval, the query set is Q_i^(0) = {q_i}; images whose face-feature similarity to q_i exceeds the threshold th_1 are retrieved from the database as the first retrieval result, recorded as {g_1, g_2, …, g_n}, where n denotes the number of images retrieved the first time and each g denotes an image retrieved from the database. The first retrieval result is merged with the portrait q_i to obtain the query set Q_i^(1) = {q_i, g_1, g_2, …, g_n}.
At the second retrieval, images whose average face-feature similarity to all images in the query set Q_i^(1) exceeds the threshold th_2 are retrieved from the database as the second retrieval result; the second retrieval result is merged with Q_i^(1) to obtain the query set Q_i^(2).
And so on for T iterations of retrieval; the face matching result M_i is determined from the query set Q_i^(T) obtained at the last iteration.
Taking the current retrieval as the t-th, denote the query set obtained by the previous retrieval as Q_i^(t-1). The images in the database whose average face-feature similarity to all images in the query set exceeds the threshold are found via:
s̄_j^(t) = (1 / |Q_i^(t-1)|) · Σ_{q ∈ Q_i^(t-1)} s(q, g_j)
where |Q_i^(t-1)| denotes the number of images in the query set Q_i^(t-1) obtained by the previous retrieval, q ranges over each image of Q_i^(t-1) (e.g., when t = 2, each image of {q_i, g_1, g_2, …, g_n} enters the above calculation as q), g_j denotes the j-th image in the database, s(·,·) is the face-feature similarity function, and s̄_j^(t) denotes the average face-feature similarity, computed at the t-th retrieval, between g_j and all images in Q_i^(t-1).
The images with s̄_j^(t) > th_t form the t-th retrieval result, which is merged with the query set Q_i^(t-1) obtained by the previous retrieval to give the t-th query set Q_i^(t).
In the embodiment of the invention, the results obtained at each iterative retrieval are images from the database, arranged in descending order of their average face-feature similarity to all images in the query set; when merging, the current retrieval result is appended to the end of the query set obtained by the previous retrieval.
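The iterative face retrieval of step 2 can be sketched as follows. This is a minimal sketch under stated assumptions: face features are taken to be L2-normalized so that cosine similarity is a dot product, and the function name and threshold values are illustrative, not from the patent.

```python
import numpy as np

def iterative_face_retrieval(query_feat, db_feats, thresholds):
    """Iteratively grow the query set: at each round, retrieve database
    images whose AVERAGE face-feature similarity to the current query
    set exceeds a gradually decreasing threshold (th_1 >= ... >= th_T)."""
    query_set = [query_feat]                 # Q^(0): only the query portrait
    retrieved = []                           # indices retrieved so far, in order
    in_set = np.zeros(len(db_feats), dtype=bool)
    for th in thresholds:
        Q = np.stack(query_set)              # (|Q|, d)
        sims = db_feats @ Q.T                # cosine sims (features normalized)
        avg_sim = sims.mean(axis=1)          # average over the query set
        avg_sim[in_set] = -np.inf            # never re-retrieve an image
        order = np.argsort(-avg_sim)         # descending similarity
        hits = [j for j in order if avg_sim[j] > th]
        for j in hits:                       # append to the end of the set
            query_set.append(db_feats[j])
            retrieved.append(j)
            in_set[j] = True
    return retrieved                         # face matching result M_i
```

With a high first threshold only the near-certain match is pulled in; it then drags in slightly-less-similar images at the relaxed second threshold, which is the intended iterative mining behaviour.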
Step 3, the database is reduced using the face matching results of the portraits of the other persons to be queried, together with the images of all other persons appearing in the same frame as images in the face matching result of the current person's portrait, obtaining a reduced database.
The principle of this step rests on two assumptions: 1) Exclusion Constraint (EC): if the face matching result M_i belongs with high confidence to the current query portrait q_i, then its confidence of belonging to any other query portrait q_j (j ≠ i) is low; 2) Intra-Frame Constraint (IFC): other persons appearing in the same video frame as an image in M_i are unlikely to belong to the query portrait q_i. The upper right of Fig. 1 illustrates the intra-frame constraint: several persons may appear simultaneously in one image, and if the first person already belongs to q_i, the others appearing with them are unlikely to belong to q_i and should not participate in the subsequent ranking (in practice they may be placed at the end of the ranking result). Specifically, when computing the ranking result L_i for the query portrait q_i, by the exclusion constraint M_i should be at the very front of L_i and each M_j (j ≠ i) at the very rear; by the intra-frame constraint, F_i — the set of all images of other persons in the same frames as images of M_i — should also be at the end of the list L_i.
Based on the above principle, the final reduced database is expressed as:
U_i = G \ ( (∪_{j=1, j≠i}^{M} M_j) ∪ F_i )
where the i-th query portrait q_i is the portrait currently to be queried; M_j denotes the face matching result of another query portrait q_j, j = 1, 2, …, M, j ≠ i, and M denotes the number of portraits of persons to be queried; F_i denotes all images of all other persons appearing in the same frame as images in the face matching result M_i of the i-th query portrait q_i; and G denotes the full database.
With these two constraints, a clear performance improvement is obtained over the original face retrieval, even though only face information is used.
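The reduction above can be sketched directly from the set expression U_i = G \ (∪_{j≠i} M_j ∪ F_i). This is a minimal sketch in which images are represented by integer database indices; all names are hypothetical.

```python
def reduce_database(n_db, match_results, same_frame, i):
    """Shrink the search space for person i using two constraints:
    EC  - images matched to any OTHER person j cannot belong to i;
    IFC - images of other people co-occurring in the same frame as an
          image of M_i cannot belong to i.
    match_results[j] lists database indices in M_j; same_frame maps an
    index to the indices of other persons' images in the same frame."""
    excluded = set()
    for j, M_j in enumerate(match_results):
        if j != i:
            excluded.update(M_j)                 # exclusion constraint
    for img in match_results[i]:
        excluded.update(same_frame.get(img, []))  # intra-frame constraint
    # U_i = G minus (other persons' matches union F_i)
    return [g for g in range(n_db) if g not in excluded]
```

The excluded images need not be discarded outright; as the text notes, they can simply be appended to the end of the ranking result.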
Step 4, the body features of the images in the reduced database and in the face matching result are respectively enhanced; taking the face matching result as a query library, retrieval is performed in the reduced database using the similarity of the enhanced body features to obtain the retrieval result of the portrait of the person currently to be queried.
Body feature enhancement may use KNN feature expansion: for each body feature, the enhanced feature is obtained by weighted fusion of its K nearest-neighbour features, making the enhanced body feature more robust. This operation yields enhanced body features for all images in the query library and in the reduced database U_i.
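A minimal sketch of the KNN feature expansion described above. The patent does not fix the neighbourhood size or the fusion weights, so the uniform weighting and the choice of k here are assumptions, as is the use of cosine similarity over L2-normalized features.

```python
import numpy as np

def knn_feature_expansion(feats, k=3):
    """Replace each body feature by the (uniformly weighted) mean of its
    k nearest neighbours, itself included, then re-normalize.  Pulling a
    feature toward its neighbours makes it more robust to per-image noise."""
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sims = feats @ feats.T                        # cosine similarities
    nn = np.argsort(-sims, axis=1)[:, :k]         # k nearest neighbours per row
    expanded = feats[nn].mean(axis=1)             # weighted fusion (uniform)
    return expanded / np.linalg.norm(expanded, axis=1, keepdims=True)
```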
Then, for a single image u in the reduced database, the enhanced body-feature similarity to each image in the query library is computed, the Top-λ similarities are selected (i.e., the λ largest similarities; the specific λ can be set according to practice or experience), and the mean of those λ largest similarities is taken as the body-feature similarity between image u and the query library. This reduces the impact on body retrieval when clothing differs among the images of M_i.
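The Top-λ similarity just described can be sketched as follows, again assuming L2-normalized body features so that cosine similarity is a dot product; the function name is illustrative.

```python
import numpy as np

def top_lambda_similarity(u_feat, query_feats, lam=3):
    """Body similarity between one database image u and the query library:
    instead of averaging over ALL query images (some of which may show
    different clothing), average only the lam highest similarities."""
    sims = np.sort(query_feats @ u_feat)[::-1]   # similarities, descending
    return float(sims[:lam].mean())
```

Averaging only the best λ matches lets an image of the person in one outfit score highly against the query-library images in that same outfit, instead of being dragged down by images in different clothing, which is the point of distinguishing Top-λ similarity from the conventional average similarity.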
After the body-feature similarities between all images in the reduced database and the query library are computed, the reduced-database images are sorted by this similarity and placed after the face matching result, forming the final retrieval result of the portrait of the person currently to be queried. The images removed by the reduction are appended in their original database order, i.e., in ascending order of their original indices.
Fig. 2 shows a schematic diagram of the body Top-λ similarity calculation between two different database images and the query library, with λ = 3: the 3 largest similarity values are selected and their mean is taken as the body-feature similarity between a single image and the query library. The similarity computed this way is called the Top-λ similarity, to distinguish it from the conventional average similarity.
It should be noted that each image in Fig. 1 and Fig. 2 comes from an existing public data set, so there is no face-privacy issue; the image contents, the number of retrieval results in each part, and the contents referred to in Fig. 2 are examples.
Compared with the traditional scheme, the scheme provided by the embodiment of the invention mines the face information in the gallery more effectively, exploits the mutual exclusivity of person identities to shrink the database, and improves retrieval efficiency and performance. In addition, a reasonable design of the body retrieval scheme reduces the influence of clothing on body retrieval, finally yielding a superior and more robust portrait query result.
Example two
The invention also provides a portrait inquiry system, which is mainly realized based on the method provided by the previous embodiment, as shown in fig. 3, and mainly comprises:
the feature extraction unit is used for extracting face features of all the figures to be queried in the video to be queried respectively and extracting face features and body features of images in the database respectively;
the face iterative retrieval and face matching result acquisition unit is used for iteratively retrieving the portrait of the person currently to be queried in the video using the face features to obtain a face matching result; at each retrieval after the first, the query set obtained by the previous retrieval is used to search the database for images whose average face-feature similarity to all images in that query set exceeds a threshold, the current retrieval result is merged with the previous query set to form the current query set, and the face matching result of the current person's portrait is determined from the query set obtained at the last retrieval; at the first retrieval, the query set contains only the portrait of the person currently to be queried, and the threshold decreases gradually with each retrieval;
the database reduction unit is used for reducing the database using the face matching results of the portraits of the other persons to be queried, together with the images of all other persons appearing in the same frame as images in the face matching result of the current person's portrait, to obtain a reduced database;
the body retrieval and joint retrieval result generation unit is used for respectively enhancing the body features of the images in the reduced database and in the face matching result, taking the face matching result as a query library, and retrieving in the reduced database using the similarity of the enhanced body features to obtain the retrieval result of the portrait of the person currently to be queried.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division into functional modules is illustrated; in practical applications, the above functions may be allocated to different functional modules as required, i.e. the internal structure of the system may be divided into different functional modules to perform all or part of the functions described above.
Since the main technical details of each unit in the above system have been described in the first embodiment, they are not repeated here.
Embodiment 3
The present invention also provides a processing apparatus, as shown in fig. 4, which mainly includes: one or more processors; a memory for storing one or more programs; wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods provided by the foregoing embodiments.
Further, the processing device further comprises at least one input device and at least one output device; in the processing device, the processor, the memory, the input device and the output device are connected through buses.
In the embodiment of the invention, the specific types of the memory, the input device and the output device are not limited; for example:
the input device can be a touch screen, an image acquisition device, a physical key or a mouse and the like;
the output device may be a display terminal;
the memory may be random access memory (Random Access Memory, RAM) or non-volatile memory (non-volatile memory), such as disk memory.
Embodiment 4
The invention also provides a readable storage medium storing a computer program which, when executed by a processor, implements the method provided by the foregoing embodiments.
The readable storage medium according to the embodiment of the present invention may be provided as a computer-readable storage medium in the aforementioned processing apparatus, for example, as the memory in the processing apparatus. The readable storage medium may be any of various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
The foregoing is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any change or substitution easily conceivable by those skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A portrait inquiry method, comprising:
extracting face features from all portraits of persons to be queried in the video to be queried respectively, and extracting face features and body features from the images in the database respectively;
performing iterative retrieval on the portrait of the person currently to be queried in the video to be queried by using the face features to obtain a face matching result, wherein: in the current retrieval, the query set obtained in the previous retrieval is used to retrieve from the database images whose average face-feature similarity with all images in the query set exceeds a threshold, and the obtained current retrieval result is merged with the query set obtained in the previous retrieval to form the query set of the current retrieval; the face matching result of the portrait of the person currently to be queried is determined from the query set obtained in the final retrieval; in the first retrieval, the query set contains only the portrait of the person currently to be queried in the video to be queried, and the threshold is gradually reduced at each retrieval;
reducing the database by using the face matching results of the portraits of persons other than the person currently to be queried, together with the images of all other persons in the same frame as the images in the face matching result of the portrait of the person currently to be queried, to obtain a reduced database;
and enhancing the body features of the images in the reduced database and in the face matching result respectively, taking the face matching result as a query library, and retrieving in the reduced database by using the similarity of the enhanced body features, to obtain the retrieval result of the portrait of the person currently to be queried.
2. The portrait query method according to claim 1, wherein extracting face features and body features from the images in the database respectively comprises:
for a video to be retrieved containing M portraits of persons to be queried, denoted Q = {q_1, q_2, …, q_M}, let the database include N images, denoted G = {g_1, g_2, …, g_N}, wherein q_i represents the i-th portrait of a person to be queried and g_l represents the l-th image in the database;
performing face detection on the M portraits of persons to be queried respectively through a face detection network, and extracting corresponding face features from all face detection results through a face feature extraction network;
performing face detection on the N images in the database respectively through a face detection network, and extracting corresponding face features from all face detection results through a face feature extraction network; and performing body detection on the N images in the database respectively through a body detection network, and extracting corresponding body features from all body detection results through a body feature extraction network.
3. The portrait query method according to claim 1, wherein performing iterative retrieval by using the face features on the portrait of the person currently to be queried in the video to be queried to obtain a face matching result comprises:
setting the iterative search times as T, gradually reducing the threshold value of each search, and representing the threshold value during iterative search: th (th) 1 ≥th 2 ≥…≥th T Where th represents a threshold, t=1, 2,..;
taking the portrait q_i of the i-th person to be queried as the portrait of the person currently to be queried, in the first retrieval the query set is Q_i^0 = {q_i}; images whose face-feature similarity with the portrait q_i of the i-th person to be queried exceeds the threshold th_1 are retrieved from the database and recorded as the first retrieval result {g_1, g_2, …, g_n}, wherein n represents the number of images retrieved in the first retrieval and each g represents an image retrieved from the database; the first retrieval result is merged with the portrait q_i of the i-th person to be queried to obtain the query set Q_i^1 = {q_i, g_1, g_2, …, g_n};
in the second retrieval, images whose average face-feature similarity with all images in the query set Q_i^1 obtained in the first retrieval exceeds the threshold th_2 are retrieved from the database as the second retrieval result; the second retrieval result is merged with the query set Q_i^1 to obtain the query set Q_i^2;
and so on; after T iterative retrievals, the face matching result M_i is determined from the query set Q_i^T finally obtained: M_i = Q_i^T;
the retrieval result obtained in each iterative retrieval consists of images from the database, arranged in descending order of average face-feature similarity with all images in the query set; when merging, the retrieval result of the current retrieval is placed at the end of the query set obtained in the previous retrieval.
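The iterative retrieval of claims 3 and 4 can be sketched as follows. This is a minimal illustration only, assuming L2-normalised feature vectors and cosine similarity; the function name, the list-based query set, and the `-1` marker for the query portrait itself are illustrative choices, not part of the claims:

```python
import numpy as np

def iterative_face_retrieval(q_feat, gallery_feats, thresholds):
    """Iterative face retrieval with a non-increasing threshold schedule
    th_1 >= th_2 >= ... >= th_T.

    q_feat:        (d,) L2-normalised face feature of the person to be queried.
    gallery_feats: (N, d) L2-normalised face features of the database images.
    Returns gallery indices of the final query set; -1 marks the query
    portrait itself at the head of the set.
    """
    query_feats = [q_feat]            # first query set: the portrait only
    matched = [-1]
    remaining = set(range(len(gallery_feats)))
    for th in thresholds:
        # average cosine similarity of every gallery image to the query set
        avg_sim = (gallery_feats @ np.stack(query_feats).T).mean(axis=1)
        # this round's results, in descending order of average similarity,
        # are appended to the end of the previous query set
        hits = sorted((j for j in remaining if avg_sim[j] > th),
                      key=lambda j: -avg_sim[j])
        for j in hits:
            query_feats.append(gallery_feats[j])
            matched.append(j)
            remaining.discard(j)
    return matched
```

With thresholds [0.9, 0.5], an image at similarity 0.8 to the portrait is missed in the first round but recovered in the second once the threshold drops, which is the intended effect of the decreasing schedule.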
4. The portrait query method according to claim 1 or 3, wherein, in the current retrieval, retrieving from the database, by using the query set obtained in the previous retrieval, images whose average face-feature similarity with all images in the query set exceeds a threshold, and merging the obtained current retrieval result with the query set obtained in the previous retrieval as the query set of the current retrieval comprises:
taking the portrait q_i of the i-th person to be queried as the portrait of the person currently to be queried, and denoting the current retrieval as the t-th retrieval, the query set obtained in the previous retrieval is denoted Q_i^(t-1); the images in the database whose average face-feature similarity with all images in the query set exceeds the threshold are determined by computing:
s̄_j^t = (1 / |Q_i^(t-1)|) · Σ_{q ∈ Q_i^(t-1)} s(q, g_j)
wherein |Q_i^(t-1)| represents the number of images in the query set Q_i^(t-1) obtained in the previous retrieval, q represents a single image in the query set Q_i^(t-1), g_j represents the j-th image in the database, s(·, ·) is a face-feature similarity calculation function, and s̄_j^t represents the average face-feature similarity, computed in the t-th retrieval, between the j-th database image g_j and all images in the query set Q_i^(t-1) obtained in the previous retrieval;
the t-th retrieval result is filtered out through the threshold th_t and merged with the query set Q_i^(t-1) obtained in the previous retrieval as the query set Q_i^t of the t-th retrieval.
5. A portrait query method according to claim 1 or 3, wherein the database is reduced by using the face matching results of the portraits of persons other than the person currently to be queried, together with the images of all other persons in the same frame as the images in the face matching result of the portrait of the person currently to be queried; the reduced database is expressed as:
U_i = G \ ( (∪_{j≠i} M_j) ∪ F_i )
wherein the portrait q_i of the i-th person to be queried is the portrait of the person currently to be queried; M_j represents the face matching result of the portrait q_j of a person other than the person currently to be queried, j = 1, 2, …, M, j ≠ i; M represents the number of portraits of persons to be queried; G represents the database; and F_i represents the images of all other persons in the same frame as the images in the face matching result M_i of the portrait q_i of the i-th person to be queried.
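The set difference of claim 5 amounts to excluding the other identities' matched images and the co-occurring crops; a minimal sketch (the dict-based bookkeeping and argument names are illustrative choices):

```python
def reduce_database(gallery_ids, match_results, i, same_frame_others):
    """Reduced database for identity i:
    U_i = G minus (union of M_j for all j != i) minus F_i.

    gallery_ids:       iterable of database image ids (G).
    match_results:     dict mapping identity -> set of matched ids (M_j).
    same_frame_others: F_i, ids of other persons' images that share a
                       frame with images in M_i.
    """
    excluded = set(same_frame_others)
    for j, matched in match_results.items():
        if j != i:
            excluded |= set(matched)
    return [g for g in gallery_ids if g not in excluded]
```

The mutual exclusivity of person identities is what makes the exclusion safe: an image already matched to another person's face cannot depict the person currently being queried.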
6. The portrait query method according to claim 1 or 2, wherein enhancing the body features of the images in the reduced database and in the face matching result respectively comprises:
taking the portrait q_i of the i-th person to be queried as the portrait of the person currently to be queried, the corresponding face matching result M_i is taken as the query library, and the corresponding reduced database is denoted U_i;
enhancing the body features using KNN feature expansion, comprising: for each body feature, obtaining the enhanced body feature by weighted fusion of its K nearest-neighbour features.
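Claim 6 does not fix the weighting scheme of the KNN feature expansion. The sketch below uses a simple convex combination of each feature with the mean of its K nearest neighbours; the weight `alpha` and the final re-normalisation are assumptions added for illustration:

```python
import numpy as np

def knn_expand(feats, k, alpha=0.5):
    """KNN feature expansion: enhance each body feature by weighted fusion
    with its K nearest-neighbour features (cosine similarity on
    L2-normalised features)."""
    sim = feats @ feats.T
    np.fill_diagonal(sim, -np.inf)        # a feature is not its own neighbour
    nn = np.argsort(-sim, axis=1)[:, :k]  # indices of the K nearest neighbours
    fused = alpha * feats + (1 - alpha) * feats[nn].mean(axis=1)
    # re-normalise so cosine similarities stay comparable afterwards
    return fused / np.linalg.norm(fused, axis=1, keepdims=True)
```

Pulling each feature toward its neighbours makes images of the same person more similar even when clothing or pose varies slightly, which is the stated goal of the enhancement step.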
7. The portrait query method according to claim 6, wherein taking the face matching result as the query library and retrieving in the reduced database using the similarity of the enhanced body features to obtain the retrieval result of the portrait of the person currently to be queried comprises:
for a single image u in the reduced database, calculating the similarities of its enhanced body feature to those of all images in the query library, selecting the Top-λ similarities, i.e. the λ highest similarities, and taking their average value as the body-feature similarity between the single image u and the query library;
after the body-feature similarities between all images in the reduced database and the query library have been calculated, sorting all images in the reduced database in descending order of body-feature similarity and appending them, in that order, after the face matching result, which keeps its original order, to form the final retrieval result of the portrait of the person currently to be queried.
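The Top-λ scoring and final ordering of claim 7 can be sketched as follows, assuming enhanced, L2-normalised body features; the return convention (an index order plus the scores) is an illustrative choice:

```python
import numpy as np

def body_retrieval(query_feats, reduced_feats, lam):
    """Score each image u in the reduced database by the mean of its
    Top-lambda body-feature similarities to the query library, then
    return the reduced-database indices sorted by that score in
    descending order, together with the scores."""
    sims = reduced_feats @ query_feats.T       # (|U_i|, |M_i|) similarities
    lam = min(lam, sims.shape[1])              # guard against a small library
    top = np.sort(sims, axis=1)[:, -lam:]      # the lambda highest per row
    scores = top.mean(axis=1)
    order = np.argsort(-scores)                # descending body similarity
    return [int(j) for j in order], scores
```

The final retrieval result is then the face matching result in its original order, followed by the reduced-database images in the returned order.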
8. A portrait inquiry system, characterised in that it is implemented on the basis of the method of any one of claims 1 to 7, the system comprising:
a feature extraction unit, configured to extract face features from all portraits of persons to be queried in the video to be queried respectively, and to extract face features and body features from the images in the database respectively;
a face iterative retrieval and face matching result acquisition unit, configured to perform iterative retrieval on the portrait of the person currently to be queried in the video to be queried by using the face features to obtain a face matching result, wherein: in the current retrieval, the query set obtained in the previous retrieval is used to retrieve from the database images whose average face-feature similarity with all images in the query set exceeds a threshold, and the obtained current retrieval result is merged with the query set obtained in the previous retrieval to form the query set of the current retrieval; the face matching result of the portrait of the person currently to be queried is determined from the query set obtained in the final retrieval; in the first retrieval, the query set contains only the portrait of the person currently to be queried in the video to be queried, and the threshold is gradually reduced at each retrieval;
a database reduction unit, configured to reduce the database by using the face matching results of the portraits of persons other than the person currently to be queried, together with the images of all other persons in the same frame as the images in the face matching result of the portrait of the person currently to be queried, to obtain a reduced database;
a body retrieval and joint retrieval result generation unit, configured to enhance the body features of the images in the reduced database and in the face matching result respectively, take the face matching result as a query library, and retrieve in the reduced database by using the similarity of the enhanced body features, to obtain the retrieval result of the portrait of the person currently to be queried.
9. A processing apparatus, comprising: one or more processors; a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.
10. A readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the method according to any one of claims 1-7 is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210264194.0A CN114612985B (en) | 2022-03-17 | 2022-03-17 | Portrait query method, system, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114612985A CN114612985A (en) | 2022-06-10 |
CN114612985B true CN114612985B (en) | 2024-04-02 |
Family
ID=81865260
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210264194.0A Active CN114612985B (en) | 2022-03-17 | 2022-03-17 | Portrait query method, system, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114612985B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109766470A (en) * | 2019-01-15 | 2019-05-17 | 北京旷视科技有限公司 | Image search method, device and processing equipment |
CN110532432A (en) * | 2019-08-21 | 2019-12-03 | 深圳供电局有限公司 | A kind of personage's trajectory retrieval method and its system, computer readable storage medium |
CN111126249A (en) * | 2019-12-20 | 2020-05-08 | 深圳久凌软件技术有限公司 | Pedestrian re-identification method and device combining big data and Bayes |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7298931B2 (en) * | 2002-10-14 | 2007-11-20 | Samsung Electronics Co., Ltd. | Image retrieval method and apparatus using iterative matching |
Non-Patent Citations (2)
Title |
---|
Person portrait retrieval based on news context; Wang Taifeng; Yuan Pingbo; Jia Jimin; Yu Nenghai; Journal of Shandong University (Natural Science); 2006-06-30 (Issue 03); full text *
A survey of unsupervised image retrieval based on deep features; Zhang Hao; Wu Jianxin; Journal of Computer Research and Development; 2018-09-15 (Issue 09); full text *
Also Published As
Publication number | Publication date |
---|---|
CN114612985A (en) | 2022-06-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5358083B2 (en) | Person image search device and image search device | |
EP2224372A2 (en) | Grouping images by location | |
US8520909B2 (en) | Automatic and semi-automatic image classification, annotation and tagging through the use of image acquisition parameters and metadata | |
US20080002864A1 (en) | Using background for searching image collections | |
Zuo et al. | Multi-strategy tracking based text detection in scene videos | |
Mei et al. | Probabilistic multimodality fusion for event based home photo clustering | |
Meng et al. | Object instance search in videos via spatio-temporal trajectory discovery | |
CN103577818A (en) | Method and device for recognizing image characters | |
CN111339369A (en) | Video retrieval method, system, computer equipment and storage medium based on depth features | |
CN103745223A (en) | Face detection method and apparatus | |
CN114612985B (en) | Portrait query method, system, equipment and storage medium | |
Dias et al. | Temporal web image retrieval | |
CN112509009A (en) | Target tracking method based on natural language information assistance | |
CN114333039B (en) | Method, device and medium for clustering human images | |
Yang et al. | Segmentation and recognition of multi-model photo event | |
Kuzovkin et al. | Context-aware clustering and assessment of photo collections | |
CN111178409B (en) | Image matching and recognition system based on big data matrix stability analysis | |
CN113673550A (en) | Clustering method, clustering device, electronic equipment and computer-readable storage medium | |
CN110580503A (en) | AI-based double-spectrum target automatic identification method | |
Kohara et al. | Visual analysis of tag co-occurrence on nouns and adjectives | |
Chu et al. | Travelmedia: An intelligent management system for media captured in travel | |
Abe et al. | Clickable real world: Interaction with real-world landmarks using mobile phone camera | |
Chu et al. | Travel video scene detection by search | |
Zinger et al. | Clustering and semantically filtering web images to create a large-scale image ontology | |
Rho | Efficient object-based distributed image search in wireless visual sensor networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||