CN111401158B - Difficult sample discovery method and device and computer equipment - Google Patents

Difficult sample discovery method and device and computer equipment

Info

Publication number
CN111401158B
CN111401158B (application CN202010138382.XA)
Authority
CN
China
Prior art keywords
sample image
model
sample
key point
difficult
Prior art date
Legal status
Active
Application number
CN202010138382.XA
Other languages
Chinese (zh)
Other versions
CN111401158A (en)
Inventor
蔡中印
陆进
陈斌
宋晨
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010138382.XA
Publication of CN111401158A
Priority to PCT/CN2020/118113 (WO2021174820A1)
Application granted
Publication of CN111401158B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/178 Human faces, e.g. facial parts, sketches or expressions estimating age from face image; using age information for improving recognition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides a difficult sample discovery method, which comprises the following steps: acquiring a first sample set; identifying the sample images whose attributes are unlabeled based on a preset face attribute model to obtain the attributes of the unlabeled sample images; selecting the sample images meeting a preset condition according to the attributes of the sample images; sorting the image quality of the sample images meeting the preset condition based on a preset quality sorting model, and outputting a quality sorting result; performing key point marking, based on a first key point marking model and a second key point marking model, on the first sample images sorted at preset positions in the quality sorting result to obtain marked second and third sample images; calculating the unitized pixel deviation of the key points in the marked second and third sample images; and, if the unitized pixel deviation is greater than or equal to a preset value, taking the first sample image as a difficult sample image. The embodiment of the invention can improve the efficiency of discovering difficult samples.

Description

Difficult sample discovery method and device and computer equipment
Technical Field
The embodiments of the invention relate to the technical field of image processing, and in particular to a difficult sample discovery method and apparatus, and to computer equipment.
Background
Most work on face key points focuses on better neural network structures (such as HRNet), training methods (such as unsupervised learning), loss functions (such as wing loss), and the like, with training and test sets drawn from public datasets that are already annotated. However, the public datasets are few and their coverage does not fully contain actual service scenes. In actual use, a key point annotation set for the actual service scene therefore needs to be built; a model trained on open-source datasets can only adapt to the simpler scenes in the actual service, and often performs poorly under large pose angles, strong backlight, blur, occlusion and similar conditions. To make a key point labeling model annotate images in complex scenes accurately, the key point model must be trained with a large number of sample images shot in complex scenes. In the prior art, however, judging whether a sample image was shot in a complex scene (i.e., is a difficult sample) is usually done manually, so the judgment is inefficient and requires a large amount of labor cost.
Disclosure of Invention
In view of the above, an object of the embodiments of the present invention is to provide a difficult sample discovery method, apparatus, computer device and computer-readable storage medium, so as to solve the problems that discovering difficult samples is inefficient and requires a large amount of manpower.
In order to achieve the above object, an embodiment of the present invention provides a method for finding a difficult sample, including:
acquiring a first sample set, wherein the first sample set comprises a plurality of sample images with unlabeled attributes;
identifying sample images with unlabeled attributes based on a preset face attribute model to obtain attributes of the unlabeled sample images, wherein the face attribute model is used for labeling the attributes of the sample images;
selecting sample images meeting preset conditions according to the attributes of the sample images;
sorting the image quality of the sample images meeting the preset conditions based on a preset quality sorting model, and outputting a quality sorting result, wherein the quality sorting model is a model which is obtained through training and is used for identifying the image quality;
performing key point marking on a first sample image sorted at a preset position in the quality sorting result based on a first key point marking model to obtain a marked second sample image, and performing key point marking on the first sample image sorted at the preset position in the quality sorting result based on a second key point marking model to obtain a marked third sample image, wherein the first key point marking model and the second key point marking model are used for key point positioning on images;
calculating the unitized pixel deviation of the key points in the marked second sample image and third sample image; and
if the unitized pixel deviation is greater than or equal to a preset value, taking the first sample image as a difficult sample image.
Optionally, the sample image is a face image, and the attributes of the sample image include at least one of deflection angle, blurriness, expression, backlight intensity, occlusion, glasses, mask, hat, bangs, and age;
the preset condition is any one of the following conditions:
the deflection angle is greater than a first preset value, the blurriness is greater than a second preset value, the backlight intensity is greater than a third preset value, the expression is a preset expression, sunglasses are worn, a mask is worn, a hat is worn, bangs are present, or the age is greater than a fourth preset value or less than a fifth preset value.
Optionally, the method for finding a difficult sample further comprises:
labeling the difficult sample image through a third key point labeling model to obtain a first difficult sample image containing key points, and labeling the difficult sample image through a fourth key point labeling model to obtain a second difficult sample image containing key points;
Inputting the first difficult sample image and the second difficult sample image into a face detection frame model to obtain a third difficult sample image and a fourth difficult sample image containing a face frame;
and publishing the third difficult sample image and the fourth difficult sample image to a labeling website so that labeling personnel classify the third difficult sample image and the fourth difficult sample image, wherein the classification categories comprise three classes: the prediction result of the third key point labeling model is accurate while that of the fourth key point labeling model is inaccurate; the prediction results of both the third and fourth key point labeling models are inaccurate; and the prediction results of the third key point labeling model, the fourth key point labeling model and the face frame are all inaccurate.
Optionally, the method for finding a difficult sample further comprises:
and receiving the classification result, and when the classification result is that the prediction result of the third key point labeling model is accurate while that of the fourth key point labeling model is inaccurate, taking the difficult sample image corresponding to the classification result as a training sample image of the fourth key point labeling model.
Optionally, the method for finding a difficult sample further comprises:
receiving the classification result, and when the classification result is that the prediction results of both the third and fourth key point labeling models are inaccurate, publishing the difficult sample image corresponding to the classification result to a labeling website so that labeling personnel correct the key points in the labeled difficult sample image;
and receiving the difficult sample image corrected by the labeling personnel, and taking the corrected difficult sample image as a training sample image of the third key point labeling model and the fourth key point labeling model.
Optionally, the method for finding a difficult sample further comprises:
receiving the classification result, and when the classification result is that the prediction results of the third key point labeling model, the fourth key point labeling model and the face frame are all inaccurate, publishing the difficult sample image corresponding to the classification result to a labeling website so that labeling personnel correct the face detection frame in the labeled difficult sample image;
and receiving the difficult sample image corrected by the labeling personnel, and taking the corrected difficult sample image as a training sample image of the face detection frame model.
Optionally, the calculating the unitized pixel deviation of the key point in the second sample image and the third sample image after labeling is calculating the unitized pixel deviation of the centers of the left eye, the right eye and the mouth in the second sample image and the third sample image after labeling, including:
calculating a first unitized pixel deviation of the left eye in the second sample image and the third sample image after labeling;
Calculating a second unitized pixel deviation of the right eye in the second sample image and the third sample image after labeling;
calculating a third unitized pixel deviation of the mouth center in the second sample image and the third sample image after labeling;
and taking an average value of the first unitized pixel deviation, the second unitized pixel deviation and the third unitized pixel deviation as the unitized pixel deviation.
In order to achieve the above object, an embodiment of the present invention further provides a device for finding a difficult sample, including:
the acquisition module is used for acquiring a first sample set, wherein the first sample set comprises a plurality of sample images with unlabeled attributes;
the identification module is used for identifying each sample image with unlabeled attributes based on a preset face attribute model to obtain the attributes of each unlabeled sample image, wherein the face attribute model is used for labeling the attributes of the sample images;
the selecting module is used for selecting sample images meeting preset conditions according to the attributes of the sample images;
the sorting module is used for sorting the image quality of the sample images meeting the preset conditions based on a preset quality sorting model and outputting a quality sorting result, wherein the quality sorting model is a model which is obtained through training and used for identifying the image quality;
the labeling module is used for performing key point marking on the first sample images sorted at the preset positions in the quality sorting result based on the first key point marking model to obtain a marked second sample image, and performing key point marking on the first sample images sorted at the preset positions in the quality sorting result based on the second key point marking model to obtain a marked third sample image, wherein the first key point marking model and the second key point marking model are used for key point positioning on the images;
the calculation module is used for calculating the unitized pixel deviation of the key points in the marked second sample image and third sample image; and
a determination module, used for taking the first sample image as a difficult sample image if the unitized pixel deviation is greater than or equal to a preset value.
To achieve the above object, an embodiment of the present invention further provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the difficult sample discovery method as described above when executing the computer program.
To achieve the above object, an embodiment of the present invention also provides a computer-readable storage medium having stored therein a computer program executable by at least one processor to cause the at least one processor to perform the steps of the difficult sample discovery method as described above.
The difficult sample discovery method, apparatus, computer device and computer-readable storage medium provided by the embodiments of the invention work as follows: a first sample set is acquired, the first sample set comprising a plurality of sample images with unlabeled attributes; the sample images with unlabeled attributes are identified based on a preset face attribute model to obtain the attributes of the unlabeled sample images, the face attribute model being used for labeling the attributes of sample images; sample images meeting a preset condition are selected according to the attributes of the sample images; the image quality of the sample images meeting the preset condition is sorted based on a preset quality sorting model, and a quality sorting result is output; key point marking is performed on the first sample images sorted at preset positions in the quality sorting result based on a first key point marking model to obtain a marked second sample image, and on the same first sample images based on a second key point marking model to obtain a marked third sample image, the first and second key point marking models being used for key point positioning on images; the unitized pixel deviation of the key points in the marked second and third sample images is calculated; and if the unitized pixel deviation is greater than or equal to a preset value, the first sample image is taken as a difficult sample image. By combining the face attribute model and the quality sorting model, the embodiments of the invention can automatically select a first sample image from a large number of sample images, and then further analyze the first sample image through the first and second key point marking models to judge whether it is a difficult sample image. Since the embodiments of the invention do not require sample images to be judged manually, the efficiency of discovering difficult samples can be improved and labor costs reduced.
Drawings
FIG. 1 is a flow chart illustrating steps of an embodiment of a method for finding a difficult sample according to the present invention.
Fig. 2 is a detailed flowchart of a step of calculating unitized pixel deviation of centers of left eye, right eye and mouth of the second sample image and the third sample image after labeling according to an embodiment of the present invention.
FIG. 3 is a flow chart illustrating steps of another embodiment of the method for finding a difficult sample according to the present invention.
Fig. 4 is a schematic program module diagram of a difficult sample discovery apparatus according to an embodiment of the invention.
Fig. 5 is a schematic hardware structure of a computer device according to an embodiment of the invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
Advantages of the invention are further illustrated in the following description, taken in conjunction with the accompanying drawings and detailed description.
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining", depending on the context.
In the description of the present invention, it should be understood that the numerical references before the steps do not identify the order in which the steps are performed, but are merely used to facilitate description of the present invention and to distinguish between each step, and thus should not be construed as limiting the present invention.
Referring to fig. 1, a flowchart of a method for finding a difficult sample according to a first embodiment of the present invention is shown. It will be appreciated that the flow charts in the method embodiments are not intended to limit the order in which the steps are performed. An exemplary description will be made below with a difficult sample discovery apparatus (hereinafter referred to as "discovery apparatus") as the execution subject; the discovery apparatus may be applied to a computer device with a data transmission function, such as a mobile phone, a tablet computer, a laptop computer, or a server. The method comprises the following steps:
step S10, a first sample set is acquired, wherein the first sample set comprises a plurality of sample images with unlabeled attributes.
Specifically, the first sample set includes face sample images with various attributes, such as face images shot at various angles, face images of various degrees of sharpness, face images under various illumination conditions, face images with various expressions, face images with and without glasses, face images with and without masks, face images with and without bangs, face images with and without hats, and face images of various age groups. A sample image with unlabeled attributes is a sample image that has not been annotated, for example by manual labeling.
Step S11, identifying each sample image with unlabeled attributes based on a preset face attribute model to obtain the attributes of each unlabeled sample image, wherein the face attribute model is used for labeling the attributes of the sample images.
Specifically, the face attribute model can identify the attributes of a sample image and annotate the recognition result in the sample image. The sample image is preferably a face image, and its attributes may include: the deflection angle of the sample image (the left-right yaw angle of the face), the blurriness of the sample image, the expression in the sample image, the backlight intensity of the sample image, whether the face in the sample image is occluded, whether the person in the sample image wears glasses, a mask or a hat, whether the person has bangs, and the person's age. The face attribute model may be obtained by training a deep neural network in advance with a large number of training sample images; the deep neural network may be, but is not limited to, a convolutional neural network.
In an embodiment, the face attribute model may be composed of a plurality of independent models, for example a deflection angle recognition model, a blurriness recognition model, a backlight intensity recognition model, an expression recognition model, and the like; the sample image passes through each of these models in turn so that its various attributes are recognized. In another embodiment, the face attribute model may instead be a cascade of models, for example a cascade of the deflection angle recognition model, blurriness recognition model, backlight intensity recognition model, expression recognition model, and the like, through which the sample image passes to recognize its various attributes.
In this embodiment, by inputting each unlabeled sample image into the face attribute model, attribute identification can be performed on each unlabeled sample image, so as to obtain the attribute of each unlabeled sample image.
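As a concrete illustration, a minimal Python sketch of this attribute-labeling step (the independent-models variant) is given below. The wrapper class and the image-to-value predictor interface are illustrative assumptions, not part of the patented models:

```python
from typing import Any, Callable, Dict, List

class AttributeModel:
    """Wraps one trained attribute recognizer (e.g. yaw angle, blurriness, backlight)."""
    def __init__(self, name: str, predictor: Callable):
        self.name = name
        self.predictor = predictor  # any callable: image -> attribute value

    def predict(self, image) -> Any:
        return self.predictor(image)

def label_attributes(image, models: List[AttributeModel]) -> Dict[str, Any]:
    """Run every attribute model on one unlabeled sample image."""
    return {m.name: m.predict(image) for m in models}
```

The cascade variant described above would simply chain the models so that each stage receives the image (and, if needed, earlier results) from the previous stage.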
And step S12, selecting sample images meeting preset conditions according to the attributes of the sample images.
Specifically, the preset condition is any one of the following conditions: the deflection angle is greater than a first preset value, the blurriness is greater than a second preset value, the backlight intensity is greater than a third preset value, the expression is a preset expression, sunglasses are worn, a mask is worn, a hat is worn, bangs are present, or the age is greater than a fourth preset value or less than a fifth preset value. The first, second, third, fourth and fifth preset values may be preset by a user or by system default, and are not limited in this embodiment.
In this embodiment, after the attribute of each sample image is identified by the face attribute model, it may be determined whether the attribute of each sample image satisfies any one of the above conditions, and if one of the conditions is satisfied, it indicates that the sample image is a sample image that conforms to a preset condition.
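A minimal sketch of this "any one condition" filter follows; the attribute keys and the concrete threshold numbers standing in for the first to fifth preset values are illustrative assumptions only:

```python
def meets_preset_condition(attrs: dict,
                           yaw_thresh: float = 30.0,        # first preset value (assumed)
                           blur_thresh: float = 0.5,        # second preset value (assumed)
                           backlight_thresh: float = 0.7,   # third preset value (assumed)
                           age_upper: int = 60,             # fourth preset value (assumed)
                           age_lower: int = 10) -> bool:    # fifth preset value (assumed)
    """True if the sample image's attributes satisfy any one of the preset conditions."""
    return any([
        abs(attrs.get("yaw", 0.0)) > yaw_thresh,            # deflection angle
        attrs.get("blur", 0.0) > blur_thresh,               # blurriness
        attrs.get("backlight", 0.0) > backlight_thresh,     # backlight intensity
        attrs.get("expression") == "preset_expression",
        attrs.get("sunglasses", False),
        attrs.get("mask", False),
        attrs.get("hat", False),
        attrs.get("bangs", False),
        attrs.get("age", 30) > age_upper,
        attrs.get("age", 30) < age_lower,
    ])

# Usage (labeled_samples: list of (image, attrs) pairs from the attribute step):
# candidates = [img for img, attrs in labeled_samples if meets_preset_condition(attrs)]
```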
And step S13, sorting the image quality of the sample image meeting the preset condition based on a preset quality sorting model, and outputting a quality sorting result, wherein the quality sorting model is a model which is obtained through training and is used for identifying the image quality.
In particular, the quality sorting model is used to score the quality of images. The quality sorting model can be obtained by training a deep neural network model with a large number of sample images comprising high-definition reference images and snapshots of different persons, together with users' quality scores for those images. After training is complete, the quality sorting model can score the images input to it.
In this embodiment, the scoring value of each sample image meeting the preset condition may be obtained by inputting each sample image meeting the preset condition into the quality sorting model. After the scoring values of the sample images meeting the preset conditions are obtained, the scoring values of the sample images meeting the preset conditions can be ranked, and after the ranking is completed, a quality ranking result can be output. In this embodiment, the ranking may be performed from a large score value to a small score value, or may be performed from a small score value to a large score value, which is not limited in this embodiment.
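The scoring-and-sorting step can be sketched as below, assuming the quality sorting model is available as a callable returning one score per image and that lower scores mean lower quality; ascending order is used here, matching the example in the next paragraph in which the worst-quality images rank first:

```python
def rank_by_quality(candidates, quality_model, top_k: int = 50):
    """Score each candidate image and return the images ranked within the preset
    position range (here: the top_k lowest-scoring, i.e. worst-quality, images)."""
    scored = sorted(candidates, key=quality_model)  # quality_model: image -> score
    return scored[:top_k]                           # the "first sample images"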
Step S14, performing key point marking on the first sample images sorted at the preset positions in the quality sorting result based on a first key point marking model to obtain marked second sample images, and performing key point marking on the first sample images sorted at the preset positions in the quality sorting result based on a second key point marking model to obtain marked third sample images, wherein the first key point marking model and the second key point marking model are used for key point positioning on the images.
Specifically, the preset position is a preset range of sorting positions, for example the top 50 positions when sorting in ascending order; that is, the first sample images are the sample images ranked in the top 50 among the sample images meeting the preset condition.
In this embodiment, the first and second key point marking models are key point marking models obtained with different network structures or different training standards. For example, the first key point marking model is trained with a VGG network structure using a 72-point landmark (key point) standard, and the second key point marking model is trained with a ResNet network structure using a 106-point landmark standard. It will be appreciated that these network structures and landmark standards are exemplary only and are not limiting in this embodiment. For example, the first key point marking model may also adopt a ResNet network structure, or be trained with a 68-point landmark standard; the second key point marking model may also adopt a VGG network, or be trained with a 150-point landmark standard.
After the first sample image is input into the first key point marking model, the key point coordinates of each input image can be predicted through the first key point marking model, so that a second sample image after marking is obtained. Wherein the first sample image comprises at least one image.
After the first sample image is input into the second key point marking model, the key point coordinates of each input image can be predicted through the second key point marking model, so that a third sample image after marking is obtained.
The key point coordinates are the marks that the first and second key point marking models attach to the first sample image; that is, the key points of the sample image are located by coordinate points.
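Since the two models use different landmark standards (e.g. 72 points versus 106 points), their outputs must be reduced to comparable points before the deviation in the next step can be computed. The sketch below reduces a landmark array to the three points compared later (left eye, right eye, mouth center); the index groups are hypothetical placeholders, as the real indices depend on the landmark scheme actually used:

```python
import numpy as np

# Hypothetical index groups for one landmark standard (NOT the real indices).
LEFT_EYE_IDX  = list(range(13, 21))
RIGHT_EYE_IDX = list(range(34, 42))
MOUTH_IDX     = list(range(46, 64))

def three_points(landmarks: np.ndarray) -> np.ndarray:
    """Reduce an (N, 2) landmark array to left eye, right eye and mouth center,
    each taken as the mean position of its landmark group -> (3, 2) array."""
    return np.stack([landmarks[LEFT_EYE_IDX].mean(axis=0),
                     landmarks[RIGHT_EYE_IDX].mean(axis=0),
                     landmarks[MOUTH_IDX].mean(axis=0)])
```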
Step S15, calculating unitized pixel deviation of the key points in the second sample image and the third sample image after labeling.
Specifically, the second sample image and the third sample image marked by the first keypoint model and the second keypoint model each include a plurality of keypoints, which are preferably left eye, right eye, and mouth center.
Predicting the coordinate values of the key points of each first sample image (taking the three points of the left eye, the right eye and the mouth center as an example) through the first key point marking model gives the second sample image; predicting the coordinate values of the same three points of each first sample image through the second key point marking model gives the third sample image. The unitized pixel deviation is then calculated from the landmark predicted values (left eye, right eye and mouth center) of the second sample image and the third sample image. Specifically, for each pair of corresponding points the unitized pixel deviation can be calculated by the following formula:

$d = \sqrt{\left(\frac{\Delta x}{w}\right)^{2} + \left(\frac{\Delta y}{h}\right)^{2}}$

wherein Δx is the x-axis difference between the corresponding points of the second sample image and the third sample image, Δy is the y-axis difference between the corresponding points, w is the width of the face frame in the first sample image, and h is the height of the face frame in the first sample image.
In an embodiment, referring to fig. 2, the calculating the unitized pixel deviation of the key point in the second sample image and the third sample image after labeling is calculating the unitized pixel deviation of the left eye, the right eye and the mouth center in the second sample image and the third sample image after labeling, including:
Step S20, calculating a first unitized pixel deviation of the left eye in the second sample image and the third sample image after labeling;
step S21, calculating a second unitized pixel deviation of the right eye in the second sample image and the third sample image after labeling;
step S22, calculating a third unitized pixel deviation of the mouth center in the second sample image and the third sample image after labeling;
step S23, taking an average value of the first unitized pixel deviation, the second unitized pixel deviation and the third unitized pixel deviation as the unitized pixel deviation.
Specifically, after the first unitized pixel deviation, the second unitized pixel deviation, and the third unitized pixel deviation of the three points in the center of the left eye, the right eye, and the mouth are obtained, respectively, the average value of the unitized pixel deviations of the three points may be used as the final unitized pixel deviation value.
In another embodiment of the present invention, the maximum of the calculated first, second and third unitized pixel deviations may be used as the final unitized pixel deviation value, or the median of the calculated first, second and third unitized pixel deviations may be used as the final unitized pixel deviation.
And S16, if the unitized pixel deviation is greater than or equal to a preset value, the first sample image is taken as a difficult sample image.
Specifically, the preset value is a preset standard unitized pixel deviation value, and the value can be set and modified according to actual situations.
When the calculated unitized pixel deviation value is greater than or equal to the preset value, the first sample image can be judged to be a difficult sample image; when the calculated unitized pixel deviation value is less than the preset value, the first sample image can be judged not to be a difficult sample image.
Note that, the difficult sample image in the present embodiment refers to a face image captured under a complex scene, for example, a face image captured under strong light, a blocked face image, or the like.
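A sketch of steps S15 and S16 under the per-point formula reconstructed above follows. The preset value of 0.05 is an illustrative assumption, and the max/median variants of the averaging step are noted in a comment:

```python
import numpy as np

def unitized_deviation(p2: np.ndarray, p3: np.ndarray, w: float, h: float) -> float:
    """p2, p3: (3, 2) arrays of left eye, right eye and mouth center predicted
    by the two key point marking models; w, h: face-frame width and height."""
    dx = (p2[:, 0] - p3[:, 0]) / w
    dy = (p2[:, 1] - p3[:, 1]) / h
    per_point = np.sqrt(dx ** 2 + dy ** 2)  # first, second and third deviations
    return float(per_point.mean())          # step S23; per-point max or median also possible

def is_difficult_sample(p2, p3, w, h, preset_value: float = 0.05) -> bool:
    """Step S16: difficult if the unitized pixel deviation reaches the preset value."""
    return unitized_deviation(p2, p3, w, h) >= preset_value
```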
According to the difficult sample discovery method provided by the embodiment of the invention, the face attribute model and the quality sorting model are combined, so that a first sample image can be automatically selected from a large number of sample images, and then the first sample image is further analyzed through the first key point model and the second key point model, so that whether the first sample image is a difficult sample image can be judged. According to the embodiment of the invention, the sample image is not required to be judged manually, so that the finding efficiency of the difficult sample can be improved, and the labor cost can be reduced.
Referring to fig. 3, a flowchart of steps of another embodiment of the method for finding a difficult sample according to the present invention is shown. The embodiments of the present invention are based on the above-described embodiments. In this embodiment, the execution sequence of the steps in the flowchart shown in fig. 3 may be changed, and some steps may be omitted according to different requirements. The following description is also exemplarily made with a difficult-sample discovery apparatus (hereinafter, "discovery apparatus" for short) as an execution subject. The method comprises the following steps:
and S30, marking the difficult sample image through a third key point marking model to obtain a first difficult sample image containing key points, and marking the difficult sample image through a fourth key point marking model to obtain a second difficult sample image containing key points.
Specifically, the third key point marking model is a face key point marking model used specifically for data cleaning: it has a larger and deeper network structure, runs more slowly than the actually deployed online model, and has higher precision. The fourth key point model is the small face key point marking model that is actually deployed. The networks of the third and fourth key point marking models may be trained with a ResNet or VGG network structure.
After the difficult sample image is marked by the third key point marking model and the fourth key point marking model, the correspondingly marked first difficult sample image and second difficult sample image are obtained; the marked images contain the coordinates of the left eye, the right eye and the mouth center.
Step S31, inputting the first difficult sample image and the second difficult sample image into a face detection frame model to obtain a third difficult sample image and a fourth difficult sample image including a face frame.
Specifically, the face detection frame model is a model for marking a face frame; it is prior art and is not described in detail in this embodiment.
By inputting the first and second difficult sample images into the face detection frame model, the third difficult sample image (corresponding to the first difficult sample image) and the fourth difficult sample image (corresponding to the second difficult sample image), each containing a face frame, can be output.
Step S32, publishing the third difficult sample image and the fourth difficult sample image to a labeling website so that labeling personnel classify the third difficult sample image and the fourth difficult sample image, wherein the classification categories comprise three classes: the prediction result of the third key point labeling model is accurate while that of the fourth key point labeling model is inaccurate; the prediction results of both the third and fourth key point labeling models are inaccurate; and the prediction results of the third key point labeling model, the fourth key point labeling model and the face frame are all inaccurate.
Specifically, since the classification results of different labeling personnel may differ somewhat, the third and fourth difficult sample images may be classified by a plurality of labeling personnel, and the classification result given identically by the majority of them is then selected as the classification result for the third and fourth difficult sample images.
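This multi-annotator consensus can be sketched as a simple majority vote; the category identifiers below are shorthand for the three classes, not names used by the patent:

```python
from collections import Counter
from typing import List

# Shorthand identifiers for the three classification categories of step S32.
CATEGORIES = ("model3_accurate_model4_wrong",
              "both_models_wrong",
              "models_and_face_frame_wrong")

def consensus(labels: List[str]) -> str:
    """labels: one category per labeling person; returns the majority classification."""
    return Counter(labels).most_common(1)[0][0]
```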
Step S33, receiving the classification result, and when the classification result is that the prediction result of the third key point labeling model is accurate while that of the fourth key point labeling model is inaccurate, taking the difficult sample image corresponding to the classification result as a training sample image of the fourth key point labeling model.
Specifically, when the classification result is that the third key point mark model predicts the result accurately and the fourth key point mark model predicts the result incorrectly, the difficult sample image corresponding to the classification result can be used as the training sample image of the fourth key point mark model, and then the difficult sample image can be used for carrying out iterative training on the fourth key point mark model again so as to improve the marking accuracy of the fourth key point mark model.
Step S34, receiving the classification result, and when the classification result is that the prediction results of both the third and fourth key point labeling models are inaccurate, publishing the difficult sample image corresponding to the classification result to the labeling website so that labeling personnel correct the key points in the marked difficult sample image.
Step S35, receiving the difficult sample image corrected by the labeling personnel, and taking the corrected difficult sample image as a training sample image of the third key point labeling model and the fourth key point labeling model.
Specifically, when the classification result is that the prediction results of both the third and fourth key point labeling models are inaccurate, the difficult sample image corresponding to the classification result can be published to the labeling website so that labeling personnel correct it manually. After the correction is finished, the corrected landmark (key point) results can be drawn on the face picture again so that other labeling personnel can judge whether the correction is accurate, thereby improving correction accuracy.
After the final correction is finished, the corrected difficult sample image is taken as a training sample picture of the third and fourth key point labeling models, so that the corrected difficult sample image can be used to iteratively retrain both models and improve their marking accuracy.
Step S36, receiving the classification result, and when the classification result is that the prediction results of the third key point labeling model, the fourth key point labeling model and the face frame are all inaccurate, publishing the difficult sample image corresponding to the classification result to the labeling website so that labeling personnel correct the face detection frame in the marked difficult sample image.
And step S37, receiving the difficult sample image corrected by the labeling personnel, and taking the difficult sample image corrected by the labeling personnel as a training sample image of the face detection frame model.
Specifically, when the classification result is that the prediction results of the third key point labeling model, the fourth key point labeling model and the face frame are all inaccurate, the difficult sample picture corresponding to the classification result can be published to the labeling website so that labeling personnel manually correct the face frame in the difficult sample picture. After the face frame is corrected, the difficult sample image corrected by the labeling personnel is used as a training sample image of the face detection frame model, so that the corrected difficult sample image can be used to iteratively retrain the face detection frame model and improve its detection accuracy.
In the difficult sample discovery method provided by this embodiment of the invention, labeling personnel classify the third and fourth difficult sample images, making it possible to judge whether the third and fourth key point labeling models predict the key points in the sample images accurately; when a prediction result is inaccurate, the corresponding difficult sample image is used as a training sample image to iteratively retrain the corresponding model, improving the model's accuracy.
Referring to fig. 4, a program module diagram of a difficult sample discovery apparatus 400 (hereinafter referred to as "discovery apparatus" 400) according to an embodiment of the invention is shown. The discovery apparatus 400 may be applied to a computer device with a data transmission function, such as a mobile phone, a tablet computer, a laptop computer, or a server. In this embodiment, the discovery apparatus 400 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to complete the present invention and implement the above-described difficult sample discovery method. A program module in the embodiments of the present invention refers to a series of computer program instruction segments capable of performing particular functions, and is better suited than the program itself for describing the execution of the difficult sample discovery method in a storage medium. The following description will specifically describe the functions of each program module of this embodiment:
An obtaining module 401 is configured to obtain a first sample set, where the first sample set includes a plurality of sample images with unlabeled attributes.
Specifically, the first sample set includes face sample images with various attributes, such as face images shot at various angles, face images of various degrees of sharpness, face images under various illumination conditions, face images with various expressions, face images with and without glasses, face images with and without masks, face images with and without bangs, face images with and without hats, and face images of various age groups. A sample image with unlabeled attributes is a sample image that has not been annotated, for example by manual labeling.
The identifying module 402 is configured to identify each sample image with an unlabeled attribute based on a preset face attribute model, so as to obtain an attribute of each unlabeled sample image, where the face attribute model is used to label the attribute of the sample image.
Specifically, the face attribute model can identify the attributes of a sample image and annotate the recognition result in the sample image. The sample image is preferably a face image, and its attributes may include: the deflection angle of the sample image (the left-right yaw angle of the face), the blurriness of the sample image, the expression in the sample image, the backlight intensity of the sample image, whether the face in the sample image is occluded, whether the person in the sample image wears glasses, a mask or a hat, whether the person has bangs, and the person's age. The face attribute model may be obtained by training a deep neural network in advance with a large number of training sample images; the deep neural network may be, but is not limited to, a convolutional neural network.
In an embodiment, the face attribute model may be composed of a plurality of independent models, for example a deflection angle recognition model, a blurriness recognition model, a backlight intensity recognition model, an expression recognition model, and the like; the sample image passes through each of these models in turn so that its various attributes are recognized. In another embodiment, the face attribute model may instead be a cascade of models, for example a cascade of the deflection angle recognition model, blurriness recognition model, backlight intensity recognition model, expression recognition model, and the like, through which the sample image passes to recognize its various attributes.
In this embodiment, by inputting each unlabeled sample image into the face attribute model, attribute identification can be performed on each unlabeled sample image, so as to obtain the attribute of each unlabeled sample image.
And the selecting module 403 is configured to select a sample image that meets a preset condition according to an attribute of each sample image.
Specifically, the preset condition is any one of the following conditions: the deflection angle is greater than a first preset value, the blurriness is greater than a second preset value, the backlight intensity is greater than a third preset value, the expression is a preset expression, sunglasses are worn, a mask is worn, a hat is worn, bangs are present, or the age is greater than a fourth preset value or less than a fifth preset value. The first, second, third, fourth and fifth preset values may be preset by a user or by system default, and are not limited in this embodiment.
In this embodiment, after the attribute of each sample image is identified by the face attribute model, it may be determined whether the attribute of each sample image satisfies any one of the above conditions, and if one of the conditions is satisfied, it indicates that the sample image is a sample image that conforms to a preset condition.
The sorting module 404 is configured to sort the image quality of the sample images meeting the preset condition based on a preset quality sorting model and output a quality sorting result, where the quality sorting model is a model obtained through training and used for identifying image quality.
In particular, the quality sorting model is used to score the quality of images. The quality sorting model can be obtained by training a deep neural network model with a large number of sample images comprising high-definition reference images and snapshots of different persons, together with users' quality scores for those images. After training is complete, the quality sorting model can score the images input to it.
In this embodiment, the scoring value of each sample image meeting the preset condition may be obtained by inputting each sample image meeting the preset condition into the quality sorting model. After the scoring values of the sample images meeting the preset conditions are obtained, the scoring values of the sample images meeting the preset conditions can be ranked, and after the ranking is completed, a quality ranking result can be output. In this embodiment, the ranking may be performed from a large score value to a small score value, or may be performed from a small score value to a large score value, which is not limited in this embodiment.
The labeling module 405 is configured to perform key point marking on the first sample images sorted at the preset positions in the quality sorting result based on a first key point marking model to obtain a marked second sample image, and to perform key point marking on the first sample images sorted at the preset positions in the quality sorting result based on a second key point marking model to obtain a marked third sample image, where the first key point marking model and the second key point marking model are used for key point positioning on the images.
Specifically, the preset position is a preset range of sorting positions, for example the top 50 positions when sorting in ascending order; that is, the first sample images are the sample images ranked in the top 50 among the sample images meeting the preset condition.
In this embodiment, the first and second key point marking models are key point marking models obtained with different network structures or different training standards. For example, the first key point marking model is trained with a VGG network structure using a 72-point landmark (key point) standard, and the second key point marking model is trained with a ResNet network structure using a 106-point landmark standard. It will be appreciated that these network structures and landmark standards are exemplary only and are not limiting in this embodiment. For example, the first key point marking model may also adopt a ResNet network structure, or be trained with a 68-point landmark standard; the second key point marking model may also adopt a VGG network, or be trained with a 150-point landmark standard.
After the first sample image is input into the first key point marking model, the key point coordinates of each input image can be predicted through the first key point marking model, so that a second sample image after marking is obtained. Wherein the first sample image comprises at least one image.
After the first sample image is input into the second key point marking model, the key point coordinates of each input image can be predicted through the second key point marking model, so that a third sample image after marking is obtained.
The key point coordinates are the marks that the first and second key point marking models attach to the first sample image; that is, the key points of the sample image are located by coordinate points.
A calculating module 406, configured to calculate a unitized pixel deviation of the labeled second sample image and the keypoint in the third sample image.
Specifically, the second sample image and the third sample image marked by the first keypoint model and the second keypoint model each include a plurality of keypoints, which are preferably left eye, right eye, and mouth center.
Predicting the coordinate values of the key points of each first sample image (taking the three points of the left eye, the right eye and the mouth center as an example) through the first key point marking model gives the second sample image; predicting the coordinate values of the same three points of each first sample image through the second key point marking model gives the third sample image. The unitized pixel deviation is then calculated from the landmark predicted values (left eye, right eye and mouth center) of the second sample image and the third sample image. Specifically, for each pair of corresponding points the unitized pixel deviation can be calculated by the following formula:

$d = \sqrt{\left(\frac{\Delta x}{w}\right)^{2} + \left(\frac{\Delta y}{h}\right)^{2}}$

wherein Δx is the x-axis difference between the corresponding points of the second sample image and the third sample image, Δy is the y-axis difference between the corresponding points, w is the width of the face frame in the first sample image, and h is the height of the face frame in the first sample image.
In an embodiment, the calculating module 406 is further configured to calculate a first unitized pixel deviation of the left eye in the marked second and third sample images; calculate a second unitized pixel deviation of the right eye in the marked second and third sample images; calculate a third unitized pixel deviation of the mouth center in the marked second and third sample images; and take the average of the first, second and third unitized pixel deviations as the unitized pixel deviation.
Specifically, after the first unitized pixel deviation, the second unitized pixel deviation, and the third unitized pixel deviation of the three points in the center of the left eye, the right eye, and the mouth are obtained, respectively, the average value of the unitized pixel deviations of the three points may be used as the final unitized pixel deviation value.
In another embodiment of the present invention, the maximum of the calculated first, second and third unitized pixel deviations may be used as the final unitized pixel deviation value, or the median of the calculated first, second and third unitized pixel deviations may be used as the final unitized pixel deviation.
A module 407, configured to take the first sample image as a difficult sample image if the unitized pixel deviation is greater than or equal to a preset value.
Specifically, the preset value is a preset standard unitized pixel deviation value, which can be set and modified according to the actual situation.
When the calculated unitized pixel deviation value is greater than or equal to the preset value, the first sample image is judged to be a difficult sample image; when the calculated unitized pixel deviation value is less than the preset value, the first sample image is judged not to be a difficult sample image.
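Putting the pieces together, a hedged end-to-end usage sketch using the illustrative helpers above; the keypoint coordinates, face frame size, and the threshold 0.05 are made-up example values, since the patent leaves the preset value configurable:

```python
# Illustrative predicted keypoints from the two models (assumed values).
second_img = {"left_eye": (120.0, 140.0), "right_eye": (180.0, 138.0), "mouth_center": (150.0, 200.0)}
third_img  = {"left_eye": (123.0, 141.0), "right_eye": (179.0, 143.0), "mouth_center": (152.0, 207.0)}
w, h = 200.0, 220.0  # face frame width and height in the first sample image

dev_left  = unitized_pixel_deviation(second_img["left_eye"],  third_img["left_eye"],  w, h)
dev_right = unitized_pixel_deviation(second_img["right_eye"], third_img["right_eye"], w, h)
dev_mouth = unitized_pixel_deviation(second_img["mouth_center"], third_img["mouth_center"], w, h)

deviation = aggregate_deviations(dev_left, dev_right, dev_mouth, mode="mean")

PRESET_VALUE = 0.05  # illustrative threshold only
is_difficult = deviation >= PRESET_VALUE  # True: treat the first sample image as a difficult sample
```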
Note that the difficult sample image in this embodiment refers to a face image captured in a complex scene, for example, a face image captured under strong light, a partially occluded face image, or the like.
According to the difficult sample discovery method provided by this embodiment of the invention, combining the face attribute model and the quality sorting model allows a first sample image to be selected automatically from a large number of sample images; the first sample image is then further analyzed through the first and second keypoint models to judge whether it is a difficult sample image. Since this embodiment requires no manual judgment of sample images, the discovery efficiency of difficult samples can be improved and labor cost can be reduced.
In another embodiment of the present invention, the discovery apparatus 400 further includes:
the difficult sample labeling module is used for labeling the difficult sample image through a third key point labeling model to obtain a first difficult sample image containing key points, and labeling the difficult sample image through a fourth key point labeling model to obtain a second difficult sample image containing key points.
Specifically, the third keypoint marking model is a model dedicated to data cleaning for marking face keypoints: it has a larger and deeper network structure, runs slower than the model actually deployed online, and is more precise. The fourth keypoint model is the small model actually deployed online for marking face keypoints. The third and fourth keypoint marking models may be trained with a resnet network structure or a vgg network structure.
After the difficult sample image is marked by the third and fourth keypoint marking models, the correspondingly labeled first and second difficult sample images are obtained. Each labeled image contains the coordinates of the left eye, the right eye, and the mouth center.
The input module is used for inputting the first difficult sample image and the second difficult sample image into a face detection frame model so as to obtain a third difficult sample image and a fourth difficult sample image containing a face frame.
Specifically, the face detection frame model is a model for marking a face frame; it is an existing technique and is not described further in this embodiment.
By inputting the first and second difficult sample images into the face detection frame model, a third difficult sample image (corresponding to the first difficult sample image) and a fourth difficult sample image (corresponding to the second difficult sample image), each containing a face frame, can be output.
The issuing module is used for issuing the third difficult sample image and the fourth difficult sample image to a labeling website so that labeling personnel classify them. The classification categories comprise three categories: the third keypoint marking model's prediction result is accurate but the fourth keypoint marking model's prediction result is inaccurate; both the third and fourth keypoint marking models' prediction results are inaccurate; and the third and fourth keypoint marking models' prediction results and the face frame are all inaccurate.
Specifically, since the classification results of different labeling personnel may differ somewhat, the third and fourth difficult sample images may be classified by a plurality of labeling personnel, and the classification on which the majority of them agree may be selected as the classification result of the third and fourth difficult sample images.
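A minimal sketch of this majority-agreement selection; the function name and the no-majority policy are assumptions for the example:

```python
from collections import Counter

def consensus_classification(annotator_labels):
    """Pick the category that most labeling personnel agree on for one
    difficult sample image. Returns None when no strict majority exists,
    so the image can be sent back for further review (an assumed policy).
    annotator_labels must be non-empty."""
    label, votes = Counter(annotator_labels).most_common(1)[0]
    return label if votes > len(annotator_labels) / 2 else None

# e.g. consensus_classification(["cat1", "cat1", "cat2"]) -> "cat1"
```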
The receiving module is used for receiving the classification result and, when the classification result is that the third keypoint marking model's prediction result is accurate and the fourth keypoint marking model's prediction result is inaccurate, taking the difficult sample image corresponding to the classification result as a training sample image for the fourth keypoint marking model.
Specifically, when the classification result is that the third keypoint marking model predicted accurately and the fourth keypoint marking model predicted inaccurately, the difficult sample image corresponding to the classification result can be used as a training sample image for the fourth keypoint marking model, which can then be iteratively retrained with that image to improve its marking accuracy.
In an embodiment, the receiving module is further configured to receive the classification result and, when the classification result is that both the third and fourth keypoint marking models' predictions are inaccurate, issue the difficult sample image corresponding to the classification result to the labeling website so that labeling personnel correct the keypoints in the marked difficult sample image; and to receive the difficult sample image corrected by the labeling personnel and take it as a training sample image for the third and fourth keypoint marking models.
Specifically, when the classification result is that both the third and fourth keypoint marking models' prediction results are inaccurate, the difficult sample image corresponding to the classification result can be issued to the labeling website so that labeling personnel manually correct it. After a labeling person finishes the correction, the corrected landmark (keypoint) result can be drawn on the face picture again so that different labeling personnel can judge whether the correction is accurate, thereby improving correction accuracy.
After the final correction is finished, the corrected difficult sample image is taken as a training sample picture for the third and fourth keypoint marking models, so that both models can be iteratively retrained with the corrected difficult sample image to improve their marking accuracy.
In an embodiment, the receiving module is further configured to receive the classification result and, when the classification result is that the third keypoint marking model, the fourth keypoint marking model, and the face frame are all inaccurate, issue the difficult sample image corresponding to the classification result to the labeling website so that labeling personnel correct the face detection frame in the marked difficult sample image; and to receive the difficult sample image corrected by the labeling personnel and take it as a training sample image for the face detection frame model.
Specifically, when the classification result is that the third keypoint marking model, the fourth keypoint marking model, and the face frame are all inaccurate, the difficult sample picture corresponding to the classification result can be issued to the labeling website so that labeling personnel manually correct the face frame in the picture. After the correction of the face frame is completed, the corrected difficult sample image is taken as a training sample image for the face detection frame model, which can then be iteratively retrained with the corrected image to improve its detection accuracy.
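Summarizing the three routing rules above, a hedged sketch of how a receiving module might dispatch each classified difficult sample image; the category strings and collection names are illustrative, not from the patent:

```python
def route_difficult_sample(image, classification,
                           fourth_model_trainset,
                           keypoint_correction_queue,
                           face_frame_correction_queue):
    """Route one difficult sample image according to its classification result.

    Illustrative category names for the three classes in the text:
      "third_ok_fourth_bad"  -> add to the fourth keypoint marking model's training set
      "both_models_bad"      -> queue for manual keypoint correction, then retrain
                                the third and fourth keypoint marking models
      "models_and_frame_bad" -> queue for manual face frame correction, then retrain
                                the face detection frame model
    """
    if classification == "third_ok_fourth_bad":
        fourth_model_trainset.append(image)
    elif classification == "both_models_bad":
        keypoint_correction_queue.append(image)
    elif classification == "models_and_frame_bad":
        face_frame_correction_queue.append(image)
```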
According to the difficult sample discovery method provided by this embodiment of the invention, having labeling personnel classify the third and fourth difficult sample images makes it possible to judge whether the third and fourth keypoint marking models predict the keypoints in the sample image accurately; when a prediction result is inaccurate, the corresponding difficult sample image is used as a training sample image to iteratively retrain the corresponding model, improving the model's accuracy.
Referring to fig. 5, a hardware architecture of a computer device 500 according to an embodiment of the invention is shown. In this embodiment, the computer device 500 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. As shown, the computer device 500 includes, but is not limited to, at least a memory 501, a processor 502, and a network interface 503 that are communicatively connected to each other via a system bus. Wherein:
In this embodiment, the memory 501 includes at least one type of computer-readable storage medium, including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Programmable Read-Only Memory (PROM), magnetic memory, magnetic disk, optical disk, and the like. In some embodiments, the memory 501 may be an internal storage unit of the computer device 500, such as a hard disk or memory of the computer device 500. In other embodiments, the memory 501 may also be an external storage device of the computer device 500, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 500. Of course, the memory 501 may also include both an internal storage unit of the computer device 500 and an external storage device. In this embodiment, the memory 501 is generally used to store the operating system and various types of application software installed in the computer device 500, such as the program code of the difficult sample discovery apparatus 400. Further, the memory 501 may be used to temporarily store various types of data that have been output or are to be output.
The processor 502 may in some embodiments be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip. The processor 502 is generally used to control the overall operation of the computer device 500. In this embodiment, the processor 502 is configured to execute the program code stored in the memory 501 or to process data, for example, to run the difficult sample discovery apparatus 400 and thereby implement the difficult sample discovery method of the above embodiments.
The network interface 503 may comprise a wireless network interface or a wired network interface, the network interface 503 typically being used to establish a communication connection between the computer apparatus 500 and other electronic devices. For example, the network interface 503 is used to connect the computer device 500 to an external terminal through a network, establish a data transmission channel and a communication connection between the computer device 500 and the external terminal, and the like. The network may be an Intranet (Intranet), the Internet (Internet), a global system for mobile communications (Global System of Mobile communication, GSM), wideband code division multiple access (Wideband Code Division Multiple Access, WCDMA), a 4G network, a 5G network, bluetooth (Bluetooth), wi-Fi, or other wireless or wired network.
It should be noted that fig. 5 only shows a computer device 500 having components 501-503, but it should be understood that not all of the illustrated components are required to be implemented and that more or fewer components may be implemented instead.
In this embodiment, the difficult sample discovery apparatus 400 stored in the memory 501 may be further divided into one or more program modules, which are stored in the memory 501 and executed by one or more processors (the processor 502 in this embodiment) to implement the difficult sample discovery method of the present invention.
The present embodiment also provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, or the like, on which a computer program is stored that performs the corresponding functions when executed by a processor. The computer-readable storage medium of this embodiment is used to store the difficult sample discovery apparatus 400, which, when executed by a processor, implements the difficult sample discovery method of the present invention.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the preferred implementation.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention; any equivalent structure or equivalent process transformation made using the contents of this description, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A method for finding a difficult sample, comprising:
obtaining a first sample set comprising a plurality of sample images of unlabeled properties;
identifying sample images with unlabeled attributes based on a preset face attribute model to obtain attributes of the unlabeled sample images, wherein the face attribute model is used for labeling the attributes of the sample images;
selecting sample images meeting preset conditions according to the attributes of the sample images;
Sorting the image quality of the sample images meeting the preset conditions based on a preset quality sorting model, and outputting a quality sorting result, wherein the quality sorting model is a model which is obtained through training and is used for identifying the image quality;
performing keypoint marking on a first sample image sorted at a preset position in the quality sorting result based on a first keypoint marking model to obtain a labeled second sample image, and performing keypoint marking on the first sample image sorted at the preset position in the quality sorting result based on a second keypoint marking model to obtain a labeled third sample image, wherein the first keypoint marking model and the second keypoint marking model are used for performing keypoint positioning on the images;
calculating the unitized pixel deviation of the keypoints in the labeled second sample image and the third sample image; and
if the unitized pixel deviation is greater than or equal to a preset value, taking the first sample image as a difficult sample image.
2. The difficult sample discovery method of claim 1, wherein the sample image is a face image, and wherein the attributes of the sample image include at least one of yaw angle, blurriness, expression, backlight intensity, occlusion, glasses, mask, cap, bangs, and age;
The preset condition is any one of the following conditions:
the yaw angle is greater than a first preset value, the blurriness is greater than a second preset value, the backlight is greater than a third preset value, the expression is a preset expression, sunglasses are worn, a mask is worn, a cap is worn, bangs are present, or the age is greater than a fourth preset value or less than a fifth preset value.
3. The difficult sample discovery method of claim 2, further comprising:
labeling the difficult sample image through a third keypoint marking model to obtain a first difficult sample image containing keypoints, and labeling the difficult sample image through a fourth keypoint marking model to obtain a second difficult sample image containing keypoints;
inputting the first difficult sample image and the second difficult sample image into a face detection frame model to obtain a third difficult sample image and a fourth difficult sample image containing a face frame;
and publishing the third difficult sample image and the fourth difficult sample image to a labeling website so that labeling personnel classify the third difficult sample image and the fourth difficult sample image, wherein the classification categories comprise three categories: the third keypoint marking model's prediction result is accurate but the fourth keypoint marking model's prediction result is inaccurate; both the third and fourth keypoint marking models' prediction results are inaccurate; and the third and fourth keypoint marking models' prediction results and the face frame are all inaccurate.
4. The difficult sample discovery method of claim 3, further comprising:
receiving a classification result, and when the classification result is that the third keypoint marking model's prediction result is accurate and the fourth keypoint marking model's prediction result is inaccurate, taking the difficult sample image corresponding to the classification result as a training sample image for the fourth keypoint marking model.
5. The difficult sample discovery method of claim 3, further comprising:
receiving a classification result, and when the classification result is that both the third and fourth keypoint marking models' predictions are inaccurate, publishing the difficult sample image corresponding to the classification result to a labeling website so that labeling personnel correct the keypoints in the marked difficult sample image;
and receiving the difficult sample image corrected by the labeling personnel, and taking the difficult sample image corrected by the labeling personnel as training sample images of the third key point marking model and the fourth key point marking model.
6. The difficult sample discovery method of claim 3, further comprising:
receiving a classification result, and when the classification result is that the third keypoint marking model, the fourth keypoint marking model, and the face frame are all inaccurate, publishing the difficult sample image corresponding to the classification result to a labeling website so that labeling personnel correct the face detection frame in the marked difficult sample image;
and receiving the difficult sample image corrected by the labeling personnel, and taking the difficult sample image corrected by the labeling personnel as a training sample image of the face detection frame model.
7. The difficult sample discovery method according to any one of claims 1 to 6, wherein calculating the unitized pixel deviation of the keypoints in the labeled second sample image and third sample image is calculating the unitized pixel deviation of the left eye, the right eye, and the mouth center in the labeled second sample image and third sample image, and comprises:
calculating a first unitized pixel deviation of the left eye in the second sample image and the third sample image after labeling;
calculating a second unitized pixel deviation of the right eye in the second sample image and the third sample image after labeling;
Calculating a third unitized pixel deviation of the mouth center in the second sample image and the third sample image after labeling;
and taking an average value of the first unitized pixel deviation, the second unitized pixel deviation and the third unitized pixel deviation as the unitized pixel deviation.
8. A difficult sample discovery apparatus, comprising:
an acquisition module for acquiring a first sample set comprising a plurality of sample images of unlabeled properties;
the identification module is used for identifying each sample image with unlabeled attributes based on a preset face attribute model to obtain the attributes of each unlabeled sample image, wherein the face attribute model is used for labeling the attributes of the sample images;
the selecting module is used for selecting sample images meeting preset conditions according to the attributes of the sample images;
the sorting module is used for sorting the image quality of the sample images meeting the preset conditions based on a preset quality sorting model and outputting a quality sorting result, wherein the quality sorting model is a model which is obtained through training and used for identifying the image quality;
The marking module is used for performing keypoint marking on the first sample image sorted at a preset position in the quality sorting result based on the first keypoint marking model to obtain a labeled second sample image, and performing keypoint marking on the first sample image sorted at the preset position in the quality sorting result based on the second keypoint marking model to obtain a labeled third sample image, wherein the first keypoint marking model and the second keypoint marking model are used for keypoint positioning of the images;
the calculation module is used for calculating unitized pixel deviation of the key points in the second sample image and the third sample image after labeling; a kind of electronic device with high-pressure air-conditioning system
a module used for taking the first sample image as a difficult sample image if the unitized pixel deviation is greater than or equal to a preset value.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the difficult sample discovery method of any one of claims 1 to 7 when the computer program is executed by the processor.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program executable by at least one processor to cause the at least one processor to perform the steps of the difficult sample discovery method according to any one of claims 1-7.
CN202010138382.XA 2020-03-03 2020-03-03 Difficult sample discovery method and device and computer equipment Active CN111401158B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010138382.XA CN111401158B (en) 2020-03-03 2020-03-03 Difficult sample discovery method and device and computer equipment
PCT/CN2020/118113 WO2021174820A1 (en) 2020-03-03 2020-09-27 Discovery method and apparatus for difficult sample, and computer device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010138382.XA CN111401158B (en) 2020-03-03 2020-03-03 Difficult sample discovery method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN111401158A CN111401158A (en) 2020-07-10
CN111401158B true CN111401158B (en) 2023-09-01

Family

ID=71432167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010138382.XA Active CN111401158B (en) 2020-03-03 2020-03-03 Difficult sample discovery method and device and computer equipment

Country Status (2)

Country Link
CN (1) CN111401158B (en)
WO (1) WO2021174820A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401158B (en) * 2020-03-03 2023-09-01 平安科技(深圳)有限公司 Difficult sample discovery method and device and computer equipment
CN116416666A (en) * 2023-04-17 2023-07-11 北京数美时代科技有限公司 Face recognition method, system and storage medium based on distributed distillation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108133220A (en) * 2016-11-30 2018-06-08 北京市商汤科技开发有限公司 Model training, crucial point location and image processing method, system and electronic equipment
CN109558864A (en) * 2019-01-16 2019-04-02 苏州科达科技股份有限公司 Face critical point detection method, apparatus and storage medium
WO2019109526A1 (en) * 2017-12-06 2019-06-13 平安科技(深圳)有限公司 Method and device for age recognition of face image, storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105608450B (en) * 2016-03-01 2018-11-27 天津中科智能识别产业技术研究院有限公司 Heterogeneous face identification method based on depth convolutional neural networks
US10332312B2 (en) * 2016-12-25 2019-06-25 Facebook, Inc. Shape prediction model compression for face alignment
CN109635838B (en) * 2018-11-12 2023-07-11 平安科技(深圳)有限公司 Face sample picture labeling method and device, computer equipment and storage medium
CN110135263A (en) * 2019-04-16 2019-08-16 深圳壹账通智能科技有限公司 Portrait attribute model construction method, device, computer equipment and storage medium
CN110110611A (en) * 2019-04-16 2019-08-09 深圳壹账通智能科技有限公司 Portrait attribute model construction method, device, computer equipment and storage medium
CN111401158B (en) * 2020-03-03 2023-09-01 平安科技(深圳)有限公司 Difficult sample discovery method and device and computer equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108133220A (en) * 2016-11-30 2018-06-08 北京市商汤科技开发有限公司 Model training, crucial point location and image processing method, system and electronic equipment
WO2019109526A1 (en) * 2017-12-06 2019-06-13 平安科技(深圳)有限公司 Method and device for age recognition of face image, storage medium
CN109558864A (en) * 2019-01-16 2019-04-02 苏州科达科技股份有限公司 Face critical point detection method, apparatus and storage medium

Also Published As

Publication number Publication date
WO2021174820A1 (en) 2021-09-10
CN111401158A (en) 2020-07-10

Similar Documents

Publication Publication Date Title
US11087447B2 (en) Systems and methods for quality assurance of image recognition model
CN106557747B (en) The method and device of identification insurance single numbers
CN110738101A (en) Behavior recognition method and device and computer readable storage medium
CN109448007B (en) Image processing method, image processing apparatus, and storage medium
CN104615986B (en) The method that pedestrian detection is carried out to the video image of scene changes using multi-detector
KR101165415B1 (en) Method for recognizing human face and recognizing apparatus
CN111401158B (en) Difficult sample discovery method and device and computer equipment
CN111368682B (en) Method and system for detecting and identifying station caption based on master RCNN
CN111815577A (en) Method, device, equipment and storage medium for processing safety helmet wearing detection model
CN112381092B (en) Tracking method, tracking device and computer readable storage medium
JP2007052575A (en) Metadata applying device and metadata applying method
CN114445879A (en) High-precision face recognition method and face recognition equipment
CN111291773A (en) Feature identification method and device
CN112633221A (en) Face direction detection method and related device
CN111507957A (en) Identity card picture conversion method and device, computer equipment and storage medium
CN112036304A (en) Medical bill layout identification method and device and computer equipment
CN113436735A (en) Body weight index prediction method, device and storage medium based on face structure measurement
CN113780116A (en) Invoice classification method and device, computer equipment and storage medium
CN111104942B (en) Template matching network training method, recognition method and device
CN110427828B (en) Face living body detection method, device and computer readable storage medium
JP6567638B2 (en) Noseprint matching system, noseprint matching method, and noseprint matching program
CN116229502A (en) Image-based tumbling behavior identification method and equipment
CN110751163A (en) Target positioning method and device, computer readable storage medium and electronic equipment
CN114758384A (en) Face detection method, device, equipment and storage medium
CN113592789A (en) Dim light image identification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 40032304; country of ref document: HK)
SE01 Entry into force of request for substantive examination
GR01 Patent grant