CN112541372A

CN112541372A - Difficult sample screening method and device

Info

Publication number: CN112541372A
Application number: CN201910890908.7A
Authority: CN
Inventors: 马贤忠; 董维山; 江浩; 胡皓瑜; 范一磊
Original assignee: Momenta Suzhou Technology Co Ltd
Current assignee: Momenta Suzhou Technology Co Ltd
Priority date: 2019-09-20
Filing date: 2019-09-20
Publication date: 2021-03-23
Anticipated expiration: 2039-09-20
Also published as: CN112541372B; WO2021051887A1

Abstract

The embodiment of the invention discloses a method and a device for screening difficult samples, wherein the method comprises the following steps: detecting each obtained image to be screened by utilizing a pre-established target detection model, determining the image to be screened containing at least one first missed-detection target area image, wherein the target detection model is as follows: the confidence coefficient detection module is used for detecting the area where the target contained in the image is located and determining the confidence coefficient of the target in the detected area where the target is located; extracting image characteristics of each first omission target area image; determining a target label corresponding to each first missed detection target area image based on the image characteristics of each first missed detection target area image and a pre-established corresponding relationship, wherein the corresponding relationship comprises the image characteristics of the marked image and the corresponding relationship between the image characteristics and the corresponding labels of the marked image; and determining the image to be screened of the first missed detection target area image containing at least one corresponding target label as a missed detection label as a difficult sample image so as to realize automatic screening of the difficult sample.

Description

Difficult sample screening method and device

Technical Field

The invention relates to the technical field of intelligent driving, in particular to a difficult sample screening method and device.

Background

Deep learning relies on a large amount of training data, i.e., samples, but when the number of samples reaches a certain scale, the potential of improving the model performance by different newly added sample images is different.

For an Object Detection (also called Object Detection) model, a difficult sample, i.e. a sample containing a missing Detection Object and a false Detection Object, is data which is valuable for improving the performance of the Object Detection model. In order to improve the performance of the target detection model to a certain extent, it is necessary to acquire samples as difficult samples as possible preferentially so as to train and optimize the corresponding target detection model by using the difficult samples.

How to automatically screen out difficult samples from the samples becomes a problem to be solved urgently.

Disclosure of Invention

The invention provides a method and a device for screening difficult samples, which are used for automatically screening the difficult samples. The specific technical scheme is as follows:

in a first aspect, an embodiment of the present invention provides a method for screening a difficult sample, including:

detecting each obtained image to be screened by utilizing a pre-established target detection model, and determining the image to be screened comprising at least one first missed-detection target area image, wherein the target detection model is as follows: the first missed detection target area image is used for detecting an area where a target contained in the image is located and determining the confidence level of the detected area where the target is located, and the first missed detection target area image is as follows: the corresponding confidence coefficient is lower than the regional image with the preset threshold value;

performing image feature extraction on each first missed detection target area image, and determining the image feature of each first missed detection target area image;

determining a target label corresponding to each first missed detection target area image based on the image characteristics of each first missed detection target area image and a pre-established corresponding relationship, wherein the corresponding relationship comprises: the image characteristics of the marked image and the corresponding relation between the corresponding labels of the marked image;

and determining the image to be screened of the first missed detection target area image containing at least one corresponding target label as a missed detection label as a difficult sample image.

Optionally, the step of determining the target label corresponding to each first missed-detection target area image based on the image characteristics of each first missed-detection target area image and the pre-established corresponding relationship includes:

determining an alternative label corresponding to each first missed detection target area image based on the image characteristics of each first missed detection target area image and a pre-established corresponding relation;

and determining the target label corresponding to each first missed detection target area image based on the alternative label corresponding to each first missed detection target area image.

Optionally, the step of determining the alternative label corresponding to each first missed-detection target area image based on the image characteristics of each first missed-detection target area image and the pre-established corresponding relationship includes:

for each first missed detection target area image, determining a similarity value between the first missed detection target area image and each labeled image based on the image characteristics of the first missed detection target area image and the image characteristics of each labeled image;

and determining the alternative label corresponding to each first missed detection target area image based on the similarity value.

Optionally, the step of determining the alternative label corresponding to each first missed-detection target area image based on the similarity value includes:

aiming at each first missed detection target area image, arranging labels corresponding to each labeled image according to the sequence of similarity values between the first missed detection target area image and each labeled image from large to small to obtain a label queue corresponding to the first missed detection target area image;

and for each first undetected target area image, determining a preset number of labels in a label queue corresponding to the first undetected target area image as alternative labels corresponding to the suspected target area image.

Optionally, the step of determining the target label corresponding to each first missed-detection target area image based on the alternative label corresponding to each first missed-detection target area image includes:

counting the number of alternative labels which are missed labels in the alternative labels corresponding to each first missed detection target area image as a first number;

judging whether the first quantity meets a preset statistical condition or not, wherein the meeting of the preset statistical condition comprises the following steps: the number of the alternative labels is larger than a preset number threshold, or the ratio of the number of the alternative labels to the total number of the alternative labels corresponding to the corresponding first missed detection target area image is larger than a preset proportion threshold;

if the first quantity meets the preset statistical condition, determining that the target label corresponding to the first missed-detection target area image is the missed-detection label;

and if the first quantity is judged not to meet the preset statistical condition, determining that the target label corresponding to the first missed-detection target area image is a non-missed-detection label.

Optionally, the step of detecting each obtained image to be screened by using a pre-established target detection model and determining an image to be screened including at least one first missed-detection target area image includes:

detecting each obtained image to be screened by using a pre-established target detection model, determining the image to be screened containing at least one suspected target area, and determining the corresponding confidence of each suspected target area;

determining a suspected target area with the corresponding confidence coefficient lower than the preset threshold value from the suspected target areas as a candidate target area based on the corresponding confidence coefficient of each suspected target area;

if the candidate target area is a rectangular area, determining an area image corresponding to the candidate target area which is the rectangular area as a first missed detection target area image;

if the candidate target area is a non-rectangular area, determining an area image corresponding to the smallest rectangular area containing the candidate target area as a first missed detection target area image so as to determine an image to be screened containing at least one first missed detection target area image.

Optionally, before the step of determining the alternative label corresponding to each first missed-detection target area image based on the image feature of each first missed-detection target area image and the pre-established corresponding relationship, the method further includes:

a process of establishing a correspondence relationship, wherein the process comprises:

acquiring built images and annotation information corresponding to each built image, wherein the annotation information comprises: marking position information of a region where the target is located, wherein the marking position information is contained in the correspondingly established image;

detecting each established image by using the target detection model, and determining each established image comprising at least one second undetected target area image and detection position information corresponding to the at least one second undetected target area image, wherein the at least one second undetected target area image is an area image of which the corresponding confidence coefficient is lower than the preset threshold value;

performing image feature extraction on each second missed detection target area image, and determining the image feature of each second missed detection target area image;

and for each second undetected target area image, determining a label corresponding to the second undetected target area image based on the detection position information corresponding to the second undetected target area image and the labeling position information in the labeling information corresponding to the established image of the second undetected target area image so as to establish and obtain the corresponding relation.

Optionally, the step of determining, for each second undetected target area image, a label corresponding to the second undetected target area image based on the detection position information corresponding to the second undetected target area image and the annotation position information in the annotation information corresponding to the established image in which the second undetected target area image is located, so as to establish the corresponding relationship includes:

for each second undetected target area image, determining the intersection and parallel ratio between the labeling frame and the detection frame corresponding to the second undetected target area image based on the detection position information corresponding to the second undetected target area image and the labeling position information in the labeling information corresponding to the established image in which the second undetected target area image is located;

for each second undetected target area image, comparing the intersection ratio corresponding to the second undetected target area image with a preset intersection ratio threshold;

if the intersection ratio corresponding to the second undetected target area image is not smaller than the preset intersection ratio threshold, determining that the label corresponding to the second undetected target area image is an undetected label;

and if the intersection ratio corresponding to the second undetected target area image is smaller than the preset intersection ratio threshold, determining that the label corresponding to the second undetected target area image is a non-undetected label so as to establish and obtain the corresponding relation.

In a second aspect, an embodiment of the present invention provides a difficult sample screening apparatus, including:

the first determining module is configured to detect each obtained image to be screened by using a pre-established target detection model, and determine an image to be screened including at least one first missed-detection target area image, wherein the target detection model is as follows: the first missed detection target area image is used for detecting an area where a target contained in the image is located and determining the confidence level of the detected area where the target is located, and the first missed detection target area image is as follows: the corresponding confidence coefficient is lower than the regional image with the preset threshold value;

the second determining module is configured to extract image features of each first missed detection target area image and determine the image features of each first missed detection target area image;

a third determining module, configured to determine a target label corresponding to each first missed detection target area image based on an image feature of each first missed detection target area image and a pre-established correspondence relationship, where the correspondence relationship includes: the image characteristics of the marked image and the corresponding relation between the corresponding labels of the marked image;

and the fourth determination module is configured to determine the image to be screened, which contains the first missed detection target area image with at least one corresponding target label as the missed detection label, as the difficult sample image.

Optionally, the third determining module includes:

the first determining unit is configured to determine an alternative label corresponding to each first missed detection target area image based on the image characteristics of each first missed detection target area image and a pre-established corresponding relation;

and the second determining unit is configured to determine the target label corresponding to each first missed detection target area image based on the alternative label corresponding to each first missed detection target area image.

Optionally, the first determining unit includes:

a first determining sub-module, configured to determine, for each first missed detection target area image, a similarity value between the first missed detection target area image and each labeled image based on the image features of the first missed detection target area image and the image features of each labeled image;

and the second determining sub-module is configured to determine the alternative label corresponding to each first missed detection target area image based on the similarity value.

Optionally, the second determining sub-module is specifically configured to, for each first missed detection target area image, arrange the labels corresponding to each labeled image according to a descending order of the similarity value between the first missed detection target area image and each labeled image, so as to obtain a label queue corresponding to the first missed detection target area image;

Optionally, the second determining unit is specifically configured to, for each first missed detection target area image, count the number of candidate tags that are missed detection tags in the candidate tags corresponding to the first missed detection target area image, as a first number;

Optionally, the first determining module is specifically configured to detect each obtained image to be screened by using a pre-established target detection model, determine an image to be screened including at least one suspected target area, and determine a confidence corresponding to each suspected target area;

Optionally, the apparatus further comprises:

a relationship establishing module configured to establish a corresponding relationship before determining the alternative label corresponding to each first missed detection target area image based on the image features of each first missed detection target area image and the pre-established corresponding relationship, wherein the relationship establishing module includes:

an obtaining unit, configured to obtain build images and annotation information corresponding to each build image, where the annotation information includes: marking position information of a region where the target is located, wherein the marking position information is contained in the correspondingly established image;

a third determining unit, configured to detect each created image by using the target detection model, and determine each created image including at least one second missed-detection target area image and detection position information corresponding to the at least one second missed-detection target area image, where the at least one second missed-detection target area image is an area image whose corresponding confidence coefficient is lower than the preset threshold;

a fourth determining unit, configured to perform image feature extraction on each second missed detection target area image, and determine an image feature of each second missed detection target area image;

and a fifth determining unit, configured to determine, for each second undetected target area image, a label corresponding to the second undetected target area image based on the detection position information corresponding to the second undetected target area image and the annotation position information in the annotation information corresponding to the setup image in which the second undetected target area image is located, so as to establish the correspondence.

Optionally, the fifth determining unit is specifically configured to determine, for each second missed-detection target area image, based on the detection position information corresponding to the second missed-detection target area image and the annotation position information in the annotation information corresponding to the established image of the second missed-detection target area image, an intersection-parallel ratio between an annotation frame corresponding to the second missed-detection target area image and the detection frame;

As can be seen from the above, the method and apparatus for screening a difficult sample according to the embodiments of the present invention can detect each obtained image to be screened by using a pre-established target detection model, and determine an image to be screened including at least one first missing target area image, where the target detection model is: the first missed detection target area image is used for detecting the area where the target contained in the image is located and determining the confidence coefficient that the target exists in the detected area where the target is located, and the first missed detection target area image is as follows: the corresponding confidence coefficient is lower than the regional image with the preset threshold value; performing image feature extraction on each first missed detection target area image, and determining the image feature of each first missed detection target area image; determining a target label corresponding to each first missed detection target area image based on the image characteristics of each first missed detection target area image and a pre-established corresponding relationship, wherein the corresponding relationship comprises the image characteristics of the marked image and the corresponding relationship between the image characteristics and the corresponding labels of the marked image; and determining the image to be screened of the first missed detection target area image containing at least one corresponding target label as a missed detection label as a difficult sample image.

By applying the embodiment of the invention, the marked image with the image characteristics similar to the image characteristics of the first undetected target area image can be determined from the marked image based on the pre-established corresponding relationship comprising the image characteristics of the marked image and the corresponding labels thereof and the image characteristics extracted from the first undetected target area image, and further, the target label corresponding to each first undetected target area image is determined based on the label corresponding to the marked image with the image characteristics similar to the image characteristics of the first undetected target area image, wherein the target label can comprise the undetected label representing the undetected target in the first undetected target area image; furthermore, the first missed detection target area image of which the corresponding target label is the missed detection label can be considered to contain the missed detection target, and the image to be screened of the first missed detection target area image of which at least one corresponding target label is the missed detection label is determined to be the difficult sample image. The method can realize that the required samples can be selected in a targeted manner, particularly when the required samples have centralized characteristics and are not related to the complex and changeable scenes of the samples; according to the missed detection target area detected by the target detection model, the content of the image block of the missed detection target area, namely the local image block of the image to be screened, is extracted as the main body of the retrieval, so that the interference of irrelevant information is avoided, the accuracy of retrieval identification is effectively improved, the memory is saved, the speed is increased, and the difficult sample is automatically screened out. Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.

The innovation points of the embodiment of the invention comprise:

1. determining an annotated image with image characteristics similar to those of the first undetected target area image from the annotated image based on a pre-established corresponding relationship comprising image characteristics of the annotated image and corresponding labels thereof and image characteristics extracted from the first undetected target area image, and further determining a target label corresponding to each first undetected target area image based on a label corresponding to the annotated image with image characteristics similar to those of the first undetected target area image, wherein the target label may comprise an undetected label representing that the first undetected target area image contains an undetected target; furthermore, the first missed detection target area image of which the corresponding target label is the missed detection label can be considered to contain the missed detection target, and the image to be screened of the first missed detection target area image of which at least one corresponding target label is the missed detection label is determined to be the difficult sample image. The method can realize that the required samples can be selected in a targeted manner, particularly when the required samples have centralized characteristics and are not related to the complex and changeable scenes of the samples; according to the missed detection target area detected by the target detection model, the content of the image block of the missed detection target area, namely the local image block of the image to be screened, is extracted as the main body of the retrieval, so that the interference of irrelevant information is avoided, the accuracy of retrieval identification is effectively improved, the memory is saved, the speed is increased, and the difficult sample is automatically screened out.

2. When determining a target label corresponding to each first missed detection target area image, firstly determining a similarity value between the first missed detection target area image and each labeled image, further, for each first missed detection target area image, determining labels corresponding to a preset number of labeled images which are most similar to the first missed detection target area image based on the corresponding similarity value, and taking the determined labels as alternative labels corresponding to the first missed detection target area image; and determining the target label corresponding to the first missed detection target area image based on the alternative label corresponding to the first missed detection target area image so as to improve the accuracy of the determined target label to a certain extent.

3. Counting a first number of alternative labels which are missed labels in the alternative labels corresponding to each first missed label target area image, determining the target labels corresponding to the first missed label target area images with the corresponding first number meeting a preset counting condition as the missed labels, and determining the target labels corresponding to the first missed label target area images with the corresponding first number not meeting the preset counting condition as the non-missed labels, so as to improve the accuracy of the determined first missed label target area images containing the missed labels to a certain extent, and further improve the accuracy of automatically determining difficult samples.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is to be understood that the drawings in the following description are merely exemplary of some embodiments of the invention. For a person skilled in the art, without inventive effort, further figures can be obtained from these figures.

FIG. 1 is a schematic flow chart of a method for screening a difficult sample according to an embodiment of the present invention;

fig. 2 is a schematic flow chart of establishing a corresponding relationship according to an embodiment of the present invention;

fig. 3 is a schematic structural diagram of a difficult sample screening apparatus according to an embodiment of the present invention.

Detailed Description

The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.

It is to be noted that the terms "comprises" and "comprising" and any variations thereof in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.

The invention provides a method and a device for screening difficult samples, which are used for automatically screening the difficult samples. The following provides a detailed description of embodiments of the invention.

FIG. 1 is a schematic flow chart of a method for screening a difficult sample according to an embodiment of the present invention. The method may comprise the steps of:

s101: and detecting each obtained image to be screened by utilizing a pre-established target detection model, and determining the image to be screened containing at least one first missed detection target area image.

Wherein, the target detection model is as follows: the first missed detection target area image is used for detecting the area where the target contained in the image is located and determining the confidence coefficient that the target exists in the detected area where the target is located, and the first missed detection target area image is as follows: and the corresponding confidence coefficient is lower than the area image with the preset threshold value. The target detection model is as follows: and training the obtained network model based on the image marked with the target to be detected.

In the embodiment of the present invention, the method may be applied to any type of electronic device with computing capability, and the electronic device may be a server or a terminal device.

In one case, the pre-established target detection model may be a neural network model, such as: the target detection model may be a Convolutional Neural network model, specifically may be a fast Region-Convolutional Neural network model (fast Region-Convolutional Neural network) and a yolo (young Only Look one) model, and the pre-established target detection model may be: in any type of neural network model capable of detecting the position of the target in the image in the related art, the embodiment of the invention does not limit the specific type of the pre-established target detection model. For a training mode of the pre-established target detection model, reference may be made to related technologies, and embodiments of the present invention are not limited in particular.

The target to be detected may be any type of target, including but not limited to lane lines, vehicles, traffic lights, signs and/or pedestrians, etc.

In one implementation, after obtaining one or more frames of images to be screened, the electronic device may detect each obtained image to be screened by using a pre-established target detection model, identify regions in the images to be screened, in which targets may exist, respectively, and determine a confidence corresponding to each region in which targets may exist; and intercepting the area image corresponding to the identified area where the target possibly exists, subsequently, determining the area image with the corresponding confidence coefficient lower than a preset threshold value from the intercepted area image based on the confidence coefficient corresponding to each area image to be used as a first missed detection target area image, and determining the image to be screened containing at least one first missed detection target area image from the obtained image to be screened.

The confidence coefficient can represent the possibility that the corresponding region image has the target to be detected. In one case, the lower the confidence corresponding to the region image, the lower the probability that the representation target detection model predicts that the target to be detected exists in the region image of the image to be screened. Correspondingly, when the confidence corresponding to the region image is low, the region image has the possibility of containing the object to be detected which is missed to be detected. Correspondingly, the intercepted area image corresponding to the area where the target to be detected may exist may include an area image whose corresponding confidence coefficient is within a preset confidence coefficient threshold, where a lower limit value of the preset confidence coefficient threshold is 0, and an upper limit value of the preset confidence coefficient threshold is greater than or equal to the preset threshold.

In one case, after determining the image to be screened including the at least one first missed detection target area image, the electronic device may mark and record a corresponding relationship between the image to be screened and the at least one first missed detection target area image included in the image to be screened, so as to be used in a subsequent process.

S102: and performing image feature extraction on each first missed detection target area image, and determining the image feature of each first missed detection target area image.

In this step, the electronic device may perform image feature extraction on each first missed detection target area image by using any type of preset feature extraction algorithm, and determine an image feature of each first missed detection target area image. The preset feature extraction algorithm may include, but is not limited to, a Scale-invariant feature transform (SIFT) feature extraction algorithm, a Histogram of Oriented Gradients (HOG) feature extraction algorithm, a Harr feature extraction algorithm, a global feature extraction algorithm (GIST), and the like, and may also be a convolutional neural network-based feature extraction algorithm.

S103: and determining a target label corresponding to each first missed detection target area image based on the image characteristics of each first missed detection target area image and the pre-established corresponding relation.

Wherein, the corresponding relation comprises: and the image characteristics of the marked image and the corresponding relationship between the image characteristics and the corresponding labels of the marked image.

The pre-established correspondence may be stored in advance in a storage device local to or connected to the electronic device, where the correspondence includes: the marked image may include an area image in which a corresponding confidence coefficient is lower than a preset threshold value, which is captured from an original image in which the marked image is located based on the pre-established target detection model, and the original image may refer to a subsequently-mentioned established image; the annotated image may also include: in such a case, correspondingly, in order to ensure the accuracy of the difficult sample screening process, the labeled image may include only the target to be detected or only the target not to be detected.

The label corresponding to each labeled image may include: the label-missing tag representing that the labeled image contains the target to be detected that is missed by the pre-established target detection model, or the non-missing tag representing that the labeled image does not contain the target to be detected that is missed by the pre-established target detection model, for example: the label corresponding to the labeled image may be a label representing that the labeled image includes a lane line, that is, a missed-inspection label, that is, the content included in the missed-inspection label is "lane line", or a label representing that the labeled image does not include a lane line, that is, a non-missed-inspection label, that is, the content included in the non-missed-inspection label is "non-lane line".

In one case, the pre-established correspondence may be stored in a pre-set index database, so as to facilitate the comparison and matching of the image features of the first undetected target area image and the image features of the labeled image in the correspondence.

In an implementation manner, the electronic device may match, for each first missed detection target area, an image feature of the first missed detection target area image with an image feature of each labeled image in the corresponding relationship, and determine, as a target label corresponding to the first missed detection target area image, a label corresponding to an image feature that is most matched with the image feature of the first missed detection target area image in the corresponding relationship. Wherein, the matching process may be: based on a preset similarity algorithm, calculating a similarity value between the image features of the first missed detection target region image and the image features of each labeled image in a corresponding relationship, wherein correspondingly, the image feature in the corresponding relationship that is most matched with the image features of the first missed detection target region image may refer to: and the image characteristic with the maximum similarity value between the image characteristics of the first missed detection target area image in the corresponding relation. Wherein, the preset similarity algorithm includes but is not limited to: euclidean distance, cosine distance, minz distance, correlation coefficient, etc.

In another implementation manner, the S103 may include the following steps 01-02:

01: determining an alternative label corresponding to each first missed detection target area image based on the image characteristics of each first missed detection target area image and a pre-established corresponding relation;

02: and determining the target label corresponding to each first missed detection target area image based on the alternative label corresponding to each first missed detection target area image.

In this implementation manner, the electronic device may determine, for each first missed detection target area image, a plurality of labeled images matching the image features of the first missed detection target area image from the correspondence relationship based on the image features of the first missed detection target area image and the image features of the labeled images in the correspondence relationship established in advance, determine, as the alternative labels corresponding to the first missed detection target area image, the labels corresponding to the plurality of labeled images matching the image features of the first missed detection target area image, and further determine, based on the alternative labels corresponding to each first missed detection target area image, the target labels corresponding to each first missed detection target area image. The accuracy of the target label corresponding to each determined first missed detection target area image is improved to a certain extent.

In one implementation, the 01 may include the following steps 011-:

011: for each first missed detection target area image, determining a similarity value between the first missed detection target area image and each labeled image based on the image characteristics of the first missed detection target area image and the image characteristics of each labeled image;

012: and determining the alternative label corresponding to each first missed detection target area image based on the similarity value.

In this implementation manner, for each first missed detection target area image, the electronic device calculates, based on a preset similarity algorithm, an image feature of the first missed detection target area image, and an image feature of each labeled image, a similarity value between the first missed detection target area image and each labeled image, and further determines, based on the similarity value, an alternative label corresponding to each first missed detection target area image, for example: and determining labels corresponding to the preset number of labeled images with the maximum corresponding similarity value as alternative labels corresponding to the first missed-detection target area image.

In one implementation, the 012 can include:

In order to determine the labels corresponding to the preset number of labeled images with the largest corresponding similarity value from the corresponding relationship, the electronic device may arrange the labels corresponding to each labeled image according to the sequence of the similarity values between the first missed-detection target area image and each labeled image from large to small for each first missed-detection target area image, so as to obtain a label queue corresponding to the first missed-detection target area image; and then, determining the labels as the alternative labels corresponding to the suspected target area image according to the labels with the preset number in the label queue corresponding to the first undetected target area image. Alternatively, it may be: aiming at each first missed detection target area image, arranging the labels corresponding to each labeled image according to the sequence of similarity values between the first missed detection target area image and each labeled image from small to large to obtain another label queue corresponding to the first missed detection target area image; furthermore, it is also possible that a preset number of labels are in another label queue corresponding to the first undetected target area image, and the labels are determined as candidate labels corresponding to the suspected target area image.

The preset number is a preset number, or the preset number may be set by the electronic device according to the number of image features of the labeled image included in the preset correspondence, which may be all the preset number.

In one implementation, the step 02 may include the following steps 021-:

021: counting the number of alternative labels which are missed labels in the alternative labels corresponding to each first missed detection target area image as a first number;

022: judging whether the first quantity meets a preset statistical condition or not, wherein the meeting of the preset statistical condition comprises the following steps: the number of the alternative labels is larger than a preset number threshold, or the ratio of the number of the alternative labels to the total number of the alternative labels corresponding to the corresponding first missed detection target area image is larger than a preset proportion threshold;

023: if the first quantity meets the preset statistical condition, determining that the target label corresponding to the first missed-detection target area image is a missed-detection label;

024: and if the first quantity does not meet the preset statistical condition, determining that the target label corresponding to the first missed-detection target area image is a non-missed-detection label.

The alternative labels corresponding to each first missed detection target area image may include missed detection labels and/or non-missed detection labels. In this embodiment, the electronic device may count, as a first number, the number of candidate tags that are missed tags in the candidate tags corresponding to each first missed detection target area image; and judging whether the first quantity meets a preset statistical condition, namely judging whether the first quantity is greater than a preset quantity threshold value or judging whether the ratio of the first quantity to the total number of the alternative labels corresponding to the corresponding first missed detection target area image is greater than a preset proportion threshold value.

If the first quantity is judged to be larger than a preset quantity threshold value, or the ratio of the first quantity to the total number of the alternative labels corresponding to the corresponding first missed-detection target area image is judged to be larger than a preset proportion threshold value, the first quantity is determined to meet a preset statistical condition, namely the proportion of the labels representing the missed-detection targets contained in the first missed-detection target area image in the alternative labels corresponding to the first missed-detection target area image is larger, and correspondingly, the target labels corresponding to the first missed-detection target area image can be determined to be the missed-detection labels; on the contrary, if the first quantity is judged not to be greater than the preset quantity threshold value, or the ratio of the first quantity to the total number of the alternative labels corresponding to the corresponding first missed-detection target area image is judged not to be greater than the preset proportion threshold value, it is determined that the first quantity does not meet the preset statistical condition, that is, the proportion of the labels representing that the first missed-detection target area image contains the missed objects to be detected in the alternative labels corresponding to the first missed-detection target area image is smaller, and correspondingly, it can be determined that the target labels corresponding to the first missed-detection target area image are the non-missed labels.

In another implementation manner, it is considered that the greater the similarity value between the image features of the labeled image and the image features of the first missed-detection target region image, the more similar the first missed-detection target region image and the labeled image are characterized, and for the object to be detected which is missed to be detected, the features between the objects to be detected are also very similar. In view of this, when determining the target label corresponding to each first missed detection target area image based on the alternative label corresponding to each first missed detection target area image, the electronic device may set a weight value, where the greater the similarity between the image feature of the labeled image and the image feature of the first missed detection target area image, the greater the weight value corresponding to the label corresponding to the labeled image.

Subsequently, for each first missed detection target area image, comparing the sum of the products of the numerical value corresponding to each candidate label corresponding to the first missed detection target area image and the weight value corresponding to the candidate label with a preset label threshold value, namely a first sum, determining that the target label corresponding to the first missed detection target area image with the first sum being greater than the preset label threshold value is a missed detection label, and determining that the target label corresponding to the first missed detection target area image with the first sum being not greater than the preset label threshold value is a non-missed detection label. Among them, it can be: the value corresponding to the alternative label which is the missed label is 1, and the value corresponding to the alternative label which is the non-missed label is 0.

S104: and determining the image to be screened of the first missed detection target area image containing at least one corresponding target label as a missed detection label as a difficult sample image.

When the image to be screened contains at least one first missing detection target area image, the image to be screened can be considered to contain the target to be detected which is missed by the pre-established target detection model, and the electronic device can determine the image to be screened as a difficult sample.

In one implementation, after the difficult sample is determined, the difficult sample may be stored and labeled again, and a corresponding relationship between each first missed inspection target area image and the corresponding image to be screened is stored. And further, continuing to train the pre-established target detection model by using the difficult sample and the labeling information thereof, namely updating the parameters of the pre-established target detection model by using the difficult sample and the labeling information thereof so as to improve the detection precision of the pre-established target detection model.

By applying the embodiment of the invention, the marked image with the image characteristics similar to the image characteristics of the first undetected target area image can be determined from the marked image based on the pre-established corresponding relationship comprising the image characteristics of the marked image and the corresponding labels thereof and the image characteristics extracted from the first undetected target area image, and further, the target label corresponding to each first undetected target area image is determined based on the label corresponding to the marked image with the image characteristics similar to the image characteristics of the first undetected target area image, wherein the target label can comprise the undetected label representing the undetected target in the first undetected target area image; furthermore, the first missed detection target area image of which the corresponding target label is the missed detection label can be considered to contain the missed detection target, and the image to be screened of the first missed detection target area image of which at least one corresponding target label is the missed detection label is determined to be the difficult sample image. The method can realize that the required samples can be selected in a targeted manner, particularly when the required samples have centralized characteristics and are not related to the complex and changeable scenes of the samples; according to the missed detection target area detected by the target detection model, the content of the image block of the missed detection target area, namely the local image block of the image to be screened, is extracted as the main body of the retrieval, so that the interference of irrelevant information is avoided, the accuracy of retrieval identification is effectively improved, the memory is saved, the speed is increased, and the difficult sample is automatically screened out.

In another embodiment of the present invention, the S104 may include the following steps 11 to 14:

11: detecting each obtained image to be screened by using a pre-established target detection model, determining the image to be screened containing at least one suspected target area, and determining the corresponding confidence of each suspected target area;

12: determining a suspected target area with the corresponding confidence coefficient lower than a preset threshold value from the suspected target areas as a candidate target area based on the corresponding confidence coefficient of each suspected target area;

13: if the candidate target area is a rectangular area, determining an area image corresponding to the candidate target area which is the rectangular area as a first missed detection target area image;

14: if the candidate target area is a non-rectangular area, determining an area image corresponding to the smallest rectangular area containing the candidate target area as a first missed detection target area image so as to determine an image to be screened containing at least one first missed detection target area image.

In this embodiment, the electronic device may detect each image to be screened by using a pre-established target detection model, determine an image to be screened that includes at least one suspected target area, and determine a confidence corresponding to each suspected target area. The image block represented by each suspected target area may be referred to as an area image. The suspected target area determined by the electronic equipment is an area with the corresponding confidence coefficient within a preset confidence coefficient threshold value. The upper limit value of the preset confidence threshold may not be less than the preset threshold, and the lower limit value may be 0.

The electronic equipment determines a suspected target area with the corresponding confidence coefficient lower than a preset threshold value from the suspected target areas as a candidate target area. Judging whether each candidate target area is rectangular, and if the candidate target area is rectangular, determining an area image corresponding to the candidate target area which is the rectangular area as a first missed-detection target area image; if the candidate target area is a non-rectangular area, determining an area image corresponding to the smallest rectangular area containing the candidate target area as a first missed detection target area image so as to determine an image to be screened containing at least one first missed detection target area image.

In another embodiment of the present invention, before the S103, the method may further include:

the process of establishing the corresponding relationship, as shown in fig. 2, may include:

s201: and acquiring the built images and the corresponding annotation information of each built image.

Wherein, the labeling information includes: and marking position information of the region where the target is located, wherein the marking position information is contained in the correspondingly established image.

S202: and detecting each established image by using the target detection model, and determining the detection position information corresponding to each established image containing at least one second undetected target area image and at least one second undetected target area image.

Wherein, at least one second undetected target area image is an area image with the corresponding confidence coefficient lower than a preset threshold value

S203: and performing image feature extraction on each second missed detection target area image, and determining the image feature of each second missed detection target area image.

S204: and for each second undetected target area image, determining a label corresponding to the second undetected target area image based on the detection position information corresponding to the second undetected target area image and the labeling position information in the labeling information corresponding to the established image of the second undetected target area image so as to establish and obtain the corresponding relation.

In this embodiment, the electronic device may further include a process of establishing the corresponding relationship. Correspondingly, the electronic equipment can obtain a plurality of images for establishing the corresponding relationship, the image is called as an established image in the embodiment of the invention, the established image containing the target to be detected can be marked with the area where the target to be detected is located, and the marking information corresponding to the established image containing the target to be detected contains the position information of the target to be detected in the corresponding established image, and can be called as marking position information. Inputting the established images and the corresponding labeling information into a pre-established target detection model, detecting each established image by using the pre-established target detection model, and determining the detection position information corresponding to each established image comprising at least one second undetected target area image and at least one second undetected target area image; and each second undetected target area image is an area image of which the corresponding confidence coefficient is lower than a preset threshold value.

After the electronic equipment obtains at least one second missed detection target area image, image feature extraction is carried out on the at least one second missed detection target area image by using a preset feature extraction algorithm to obtain the image feature of the at least one second missed detection target area image; and for each second undetected target area image, determining a label corresponding to the second undetected target area image based on the detection position information corresponding to the second undetected target area image and the labeling position information in the labeling information corresponding to the established image of the second undetected target area image.

In one case, the S204 may be: determining a detection frame corresponding to the detection position information corresponding to the second undetected target area image, determining whether the overlapping area between the detection frame corresponding to the labeling position information in the labeling information corresponding to the image where the second undetected target area image is located exceeds a preset area proportion, if so, determining that the second undetected target area image has the target to be detected which is undetected by the pre-established target detection model, and determining that a label corresponding to the second undetected target area image is an undetected label; otherwise, if the area ratio does not exceed the preset area ratio, it can be considered that the target to be detected which is subjected to the missing detection by the pre-established target detection model does not exist in the second missing detection target area image, and the label corresponding to the second missing detection target area image is determined to be a non-missing detection label.

In another case, the step S204 may include the following steps 021-:

021: for each second undetected target area image, determining the intersection and parallel ratio between the labeling frame and the detection frame corresponding to the second undetected target area image based on the detection position information corresponding to the second undetected target area image and the labeling position information in the labeling information corresponding to the established image in which the second undetected target area image is located;

022: for each second undetected target area image, comparing the intersection ratio corresponding to the second undetected target area image with a preset intersection ratio threshold;

023: if the intersection ratio corresponding to the second undetected target area image is not smaller than the preset intersection ratio threshold, determining that the label corresponding to the second undetected target area image is an undetected label;

024: and if the intersection ratio corresponding to the second undetected target area image is smaller than a preset intersection ratio threshold value, determining that the label corresponding to the second undetected target area image is a non-undetected label so as to establish a corresponding relation.

In this implementation manner, the electronic device may determine, for each second undetected target area image, based on the detection position information corresponding to the second undetected target area image and the annotation position information in the annotation information corresponding to the established image in which the second undetected target area image is located, an intersection and a union ratio between an annotation frame corresponding to the second undetected target area image and a detection frame, that is, an intersection and a union ratio, where the annotation frame corresponds to the annotation position information of the second undetected target area image and the detection frame corresponds to the detection position information corresponding to the second undetected target area image. Judging whether the cross-over ratio corresponding to the second missed detection target area image is not less than a preset cross-over ratio threshold value or not, if so, considering that a target to be detected which is missed to be detected by a preset target detection model exists in the second missed detection target area image, and determining that a label corresponding to the second missed detection target area image is a missed detection label; otherwise, if the image feature of the second missed detection target area image is judged to be smaller than the label feature of the labeled image, the label feature of the labeled image is determined to be a label corresponding to the label feature of the second missed detection target area image, and the label feature of the labeled image is determined to be a label corresponding to the label feature of the labeled image. The labeled image comprises the second missed-detection target area image.

Corresponding to the above method embodiment, an embodiment of the present invention provides a difficult sample screening apparatus, as shown in fig. 3, which may include:

a first determining module 310, configured to detect each obtained image to be screened by using a pre-established target detection model, and determine an image to be screened that includes at least one first missed-detection target area image, where the target detection model is: the first missed detection target area image is used for detecting an area where a target contained in the image is located and determining the confidence level of the detected area where the target is located, and the first missed detection target area image is as follows: the corresponding confidence coefficient is lower than the regional image with the preset threshold value;

a second determining module 320, configured to perform image feature extraction on each first missed detection target area image, and determine an image feature of each first missed detection target area image;

a third determining module 330, configured to determine a target label corresponding to each first missed detection target area image based on the image features of each first missed detection target area image and a pre-established correspondence relationship, where the correspondence relationship includes: the image characteristics of the marked image and the corresponding relation between the corresponding labels of the marked image;

the fourth determining module 340 is configured to determine an image to be screened, which includes at least one first missed-detection target area image with a corresponding target label as a missed-detection label, as a difficult sample image.

In another embodiment of the present invention, the third determining module 330 includes:

In another embodiment of the present invention, the first determining unit includes:

In another embodiment of the present invention, the second determining sub-module is specifically configured to, for each first missed detection target area image, arrange the labels corresponding to each labeled image according to a descending order of the similarity value between the first missed detection target area image and each labeled image, so as to obtain a label queue corresponding to the first missed detection target area image;

In another embodiment of the present invention, the second determining unit is specifically configured to, for each first missed detection target area image, count the number of candidate tags that are missed detection tags in the candidate tags corresponding to the first missed detection target area image, as a first number;

Optionally, the first determining module 310 is specifically configured to detect each obtained image to be screened by using a pre-established target detection model, determine an image to be screened that includes at least one suspected target area, and determine a confidence corresponding to each suspected target area;

In another embodiment of the present invention, the apparatus further comprises:

In another embodiment of the present invention, the fifth determining unit is specifically configured to determine, for each second undetected target area image, an intersection ratio between an annotation frame and a detection frame corresponding to the second undetected target area image based on the detection position information corresponding to the second undetected target area image and the annotation position information in the annotation information corresponding to the established image of the second undetected target area image;

The device and system embodiments correspond to the method embodiments, and have the same technical effects as the method embodiments, and specific descriptions refer to the method embodiments. The device embodiment is obtained based on the method embodiment, and for specific description, reference may be made to the method embodiment section, which is not described herein again.

Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.

Those of ordinary skill in the art will understand that: modules in the devices in the embodiments may be distributed in the devices in the embodiments according to the description of the embodiments, or may be located in one or more devices different from the embodiments with corresponding changes. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A method for screening a difficult sample, comprising:

2. The method as claimed in claim 1, wherein the step of determining the target label corresponding to each first missed detection target area image based on the image characteristics of each first missed detection target area image and the pre-established correspondence relationship comprises:

3. The method according to claim 2, wherein the step of determining the candidate label corresponding to each first missed-detection target area image based on the image characteristics of each first missed-detection target area image and the pre-established correspondence relationship comprises:

4. The method of claim 3, wherein the step of determining the alternative label corresponding to each first missed-detection target area image based on the similarity value comprises:

5. The method of claim 2, wherein the step of determining the target label corresponding to each first missed target area image based on the alternative label corresponding to each first missed target area image comprises:

6. The method according to any one of claims 1 to 5, wherein the step of determining the image to be screened including at least one first missing-detection target area image by detecting each obtained image to be screened by using a pre-established target detection model comprises:

7. The method according to any one of claims 1 to 6, wherein before the step of determining the alternative label corresponding to each first missed detection target area image based on the image features of each first missed detection target area image and the pre-established correspondence relationship, the method further comprises:

8. The method according to claim 7, wherein the step of determining, for each second undetected target area image, a label corresponding to the second undetected target area image based on the detection position information corresponding to the second undetected target area image and the labeling position information in the labeling information corresponding to the setup image in which the second undetected target area image is located, so as to establish the corresponding relationship includes:

9. A difficult sample screening device, the device comprising:

10. The apparatus of claim 9, wherein the third determining module comprises: