CN117789275A - Model optimization method, device, electronic equipment and storage medium - Google Patents
- Publication number
- CN117789275A (application number CN202311843499.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- face
- images
- difficult
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The application provides a model optimization method, a model optimization device, electronic equipment and a storage medium. The method includes: acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for recognizing face images and the image data set contains a plurality of sample face images; determining difficult-case images among the plurality of sample face images; for each difficult-case image, segmenting the difficult-case image based on the face parts it contains to obtain a plurality of face part images; determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed contains at least one face part; performing augmentation processing on the region to be processed to obtain a corresponding target face image; and optimizing the model to be optimized using all the target face images to obtain a corresponding target model. The recognition accuracy of the model is thereby improved.
Description
Technical Field
The present disclosure relates to the field of model training technologies, and in particular, to a model optimization method and apparatus, an electronic device, and a storage medium.
Background
With continuous technological progress, face recognition has developed rapidly and is applied in more and more fields. In practical applications, face recognition is generally implemented with a face image recognition model. However, because most of the sample data used to train current face image recognition models consists of clear images, the model tends to over-fit to local regions (e.g., eyes, mouth, nose) of simple samples during training. Therefore, when the model is applied to a low-quality picture (such as one with low definition, noise, or distortion), false detections easily occur, both because of the poor picture quality and because the model pays excessive attention to local regions.
Disclosure of Invention
An object of the embodiments of the present application is to provide a model optimization method and apparatus, an electronic device, and a storage medium, so as to solve the problem that current face image recognition models are prone to false detection. The specific technical scheme is as follows:
in a first aspect, the present application provides a model optimization method, including:
acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying a face image, and the image data set contains a plurality of sample face images;
determining difficult-case images in the plurality of sample face images;
for each difficult-case image, dividing the difficult-case image based on face parts contained in the difficult-case image to obtain a plurality of face part images;
determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed contains at least one face part;
performing augmentation processing on the region to be processed to obtain a corresponding target face image;
and optimizing the model to be optimized by using all the target face images to obtain a corresponding target model.
In one possible implementation manner, the determining a region to be processed on the difficult-case image based on the plurality of face part images includes:
for each face part image, extracting preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features;
performing stitching processing on all the target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image;
determining a fitting score corresponding to each face part image based on the global feature;
sorting all the face part images in descending order of fitting score;
and determining a preset number of top-ranked face part images as first target part images, and determining the region corresponding to the first target part images in the difficult-case image as the region to be processed.
In one possible implementation manner, the performing pooling processing on the preliminary image features to obtain target image features includes:
performing global maximum pooling on the preliminary image features in the spatial dimension to obtain a first image feature;
performing global average pooling on the preliminary image features in the spatial dimension to obtain a second image feature;
performing fusion processing on the first image feature and the second image feature in the channel dimension to obtain a corresponding fusion feature;
and performing full-connection processing on the fusion feature to obtain the target image feature.
In one possible implementation manner, the determining the fitting score corresponding to each face part image based on the global feature includes:
performing full connection processing on the global features to obtain features to be calculated;
calculating the feature to be calculated by using a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated;
and for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score corresponding to the face part image.
In one possible implementation manner, the determining a region to be processed on the difficult-case image based on the plurality of face part images includes:
displaying the plurality of face part images through a visualization component;
receiving first selection information input by a user based on the plurality of face part images;
determining the face part image corresponding to the first selection information as a second target part image;
and determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
In one possible implementation manner, the determining the difficult image in the plurality of sample face images includes:
acquiring an image label of each sample face image, and acquiring a recognition result of the model to be optimized for each sample face image;
and for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to the sample face image is inconsistent with the image label corresponding to the sample face image.
In one possible implementation manner, the determining the difficult image in the plurality of sample face images includes:
displaying a plurality of sample face images through a visualization component;
receiving second selection information input by a user based on a plurality of sample face images;
and determining the sample face image corresponding to the second selection information as a difficult image.
In a second aspect, the present application provides a model optimization apparatus, including:
the acquisition module is used for acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying face images, and the image data set contains a plurality of sample face images;
the first determining module is used for determining difficult-case images in the plurality of sample face images;
the segmentation module is used for performing, for each difficult-case image, segmentation processing on the difficult-case image based on the face parts contained in the difficult-case image to obtain a plurality of face part images;
the second determining module is used for determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed contains at least one face part;
the processing module is used for performing augmentation processing on the region to be processed to obtain a corresponding target face image;
and the optimization module is used for optimizing the model to be optimized by utilizing all the target face images to obtain a corresponding target model.
In one possible embodiment, the second determining module is further configured to:
for each face part image, extracting preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features;
performing stitching processing on all the target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image;
determining a fitting score corresponding to each face part image based on the global feature;
sorting all the face part images in descending order of fitting score;
and determining a preset number of top-ranked face part images as first target part images, and determining the region corresponding to the first target part images in the difficult-case image as the region to be processed.
In a possible embodiment, the second determining module is further configured to:
performing global maximum pooling on the preliminary image features in the spatial dimension to obtain a first image feature;
performing global average pooling on the preliminary image features in the spatial dimension to obtain a second image feature;
performing fusion processing on the first image feature and the second image feature in the channel dimension to obtain a corresponding fusion feature;
and performing full-connection processing on the fusion feature to obtain the target image feature.
In a possible embodiment, the second determining module is further configured to:
performing full connection processing on the global features to obtain features to be calculated;
calculating the feature to be calculated by using a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated;
and for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score corresponding to the face part image.
In a possible embodiment, the second determining module is further configured to:
displaying a plurality of face part images through a visualization component;
receiving first selection information input by a user based on the plurality of face part images;
determining the face part image corresponding to the first selection information as a second target part image;
and determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
In one possible embodiment, the first determining module is further configured to:
acquiring an image label of each sample face image, and acquiring a recognition result of the model to be optimized for each sample face image;
and for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to the sample face image is inconsistent with the image label corresponding to the sample face image.
In one possible embodiment, the first determining module is further configured to:
displaying a plurality of sample face images through a visualization component;
receiving second selection information input by a user based on a plurality of sample face images;
and determining the sample face image corresponding to the second selection information as a difficult image.
In a third aspect, an electronic device is provided, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of the first aspects when executing the program stored in the memory.
In a fourth aspect, a computer-readable storage medium is provided, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of the first aspects.
In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform any of the model optimization methods described above.
The beneficial effects of the embodiment of the application are that:
in the embodiments of the present application, a model to be optimized and its corresponding image data set are first acquired, and difficult-case images are then determined among the plurality of sample face images. For each difficult-case image, the image is segmented based on the face parts it contains to obtain a plurality of face part images, and a region to be processed is determined on the difficult-case image based on those face part images. The region to be processed is then augmented to obtain a corresponding target face image, and finally the model to be optimized is optimized using all the target face images to obtain a corresponding target model. With this scheme, the region that the model to be optimized focuses on excessively when processing a difficult-case image can be identified; by augmenting that region and adjusting and optimizing the model with the augmented images, the newly obtained target model can resist over-fitting to local regions of different face images, which improves the recognition accuracy of the model.
Of course, not all of the above-described advantages need be achieved simultaneously in practicing any one of the products or methods of the present application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements, and in which the figures of the drawings are not to be taken in a limiting sense, unless otherwise indicated.
FIG. 1 is a flowchart of a model optimization method according to an embodiment of the present application;
FIG. 2 is a flow chart of another model optimization method provided in an embodiment of the present application;
FIG. 3 is a flowchart of another model optimization method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a model optimization device according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present application based on the embodiments herein.
The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. In order to simplify the present disclosure, components and arrangements of specific examples are described below. They are, of course, merely examples and are not intended to limit the invention. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
Fig. 1 is a schematic flow chart of a model optimization method according to an embodiment of the present application. The method can be applied to one or more electronic devices such as smart phones, notebook computers, desktop computers, portable computers, servers and the like. The main execution body of the method may be hardware or software. When the execution body is hardware, the execution body may be one or more of the electronic devices. For example, a single electronic device may perform the method, or a plurality of electronic devices may cooperate with one another to perform the method. When the execution subject is software, the method may be implemented as a plurality of software or software modules, or may be implemented as a single software or software module. The present invention is not particularly limited herein.
As shown in fig. 1, the method specifically includes:
s101, acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying face images, and the image data set contains a plurality of sample face images.
The model to be optimized refers to a pre-trained model for identifying the face image.
An image dataset refers to a collection comprising a plurality of sample face images.
In the embodiment of the application, the face recognition model uploaded or designated by the user can be used as the model to be optimized, and the image set uploaded or designated by the user can be used as the image data set.
S102, determining difficult-case images in the plurality of sample face images.
In practical applications, the model to be optimized produces accurate detection results for some images; these may be called good-case (goodcase) images. Correspondingly, the model to be optimized may not detect certain images accurately enough; these may be called bad-case (badcase) images, i.e., the difficult-case images.
As one possible implementation manner of the embodiment of the present application, determining the difficult-case image in the plurality of sample face images may include the following steps:
obtaining an image label of each sample face image, obtaining the recognition result of the model to be optimized for each sample face image, and, for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to the sample face image is inconsistent with the image label corresponding to the sample face image.
With this implementation, the sample face images misrecognized by the model to be optimized (i.e., those whose recognition result does not match the corresponding label) can be taken directly as difficult-case images. These difficult-case images can then be processed in a targeted manner, and the model to be optimized can be optimized based on the processed images, thereby improving the recognition effect of the model.
As another possible implementation manner of the embodiments of the present application, determining the difficult-case images in the plurality of sample face images may include the following steps: displaying the plurality of sample face images through a visualization component, receiving second selection information input by a user based on the plurality of sample face images, and determining the sample face images corresponding to the second selection information as difficult-case images. With this implementation, the user can flexibly specify the difficult-case images according to actual requirements.
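The first, label-mismatch criterion above can be sketched in a few lines. The function name and the array representation of predictions and labels are illustrative, not from the patent:

```python
import numpy as np

def find_hard_examples(predictions, labels):
    """Return the indices of sample face images whose recognition
    result disagrees with the image label -- the difficult-case
    (badcase) images under the first implementation above."""
    predictions = np.asarray(predictions)
    labels = np.asarray(labels)
    return np.flatnonzero(predictions != labels).tolist()
```

For example, with predictions `[1, 0, 1, 1]` and labels `[1, 1, 1, 0]`, samples 1 and 3 would be kept as difficult-case images.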
S103, for each difficult-case image, segmenting the difficult-case image based on the face parts it contains to obtain a plurality of face part images.
A face part image is an image that contains only the corresponding face part. For example, if difficult-case image 1 contains a person's eyes, nose, and mouth, the partial image containing the eyes may be taken as the face part image corresponding to the eyes, the partial image containing the nose as the face part image corresponding to the nose, and the partial image containing the mouth as the face part image corresponding to the mouth. Three face part images are thus cut out of difficult-case image 1.
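The cutting step can be sketched as follows. The patent does not specify how part bounding boxes are obtained, so here they are assumed to come from some landmark detector and are passed in explicitly:

```python
import numpy as np

def crop_face_parts(image, part_boxes):
    """Cut one sub-image per face part out of an H x W x C image
    array. `part_boxes` maps a part name ('eyes', 'nose', 'mouth')
    to a (top, left, bottom, right) box -- an illustrative interface."""
    return {name: image[t:b, l:r]
            for name, (t, l, b, r) in part_boxes.items()}
```

Each returned sub-image plays the role of one face part image in step S103.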
S104, determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed contains at least one face part.
The region to be processed refers to the region that the model to be optimized pays excessive attention to when recognizing the difficult-case image; it may be the region where a single face part is located on the difficult-case image, or the region where several face parts are located.
How to determine the region to be processed on the difficult-case image based on the plurality of face part images is explained in detail later and is not repeated here.
And S105, performing augmentation processing on the region to be processed to obtain a corresponding target face image.
And S106, optimizing the model to be optimized by utilizing all the target face images to obtain a corresponding target model.
S105 and S106 are collectively described below:
In the embodiments of the present application, for each difficult-case image, the region to be processed in the difficult-case image may be blocked, deformed, or blurred to realize the augmentation processing and obtain the corresponding target face image. The model to be optimized is then retrained with all the target face images, so as to adjust and optimize the model to be optimized and obtain the corresponding target model.
In this way, the obtained target model pays more attention to places other than the region to be processed, so that it can resist over-fitting to local regions of different face images, which improves the recognition accuracy of the model.
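A minimal sketch of the blocking- and blurring-style augmentations mentioned above; the exact operations and parameters are not fixed by the description, so zero-fill "blocking" and mean-fill "blurring" here are assumptions:

```python
import numpy as np

def augment_region(image, box, mode="block"):
    """Augment the region to be processed in a copy of the image:
    'block' occludes the region with zeros, 'blur' replaces it with
    its mean value (a crude stand-in for real blurring)."""
    out = image.astype(float).copy()
    t, l, b, r = box
    if mode == "block":
        out[t:b, l:r] = 0.0          # occlude the over-attended region
    elif mode == "blur":
        out[t:b, l:r] = out[t:b, l:r].mean()
    return out
```

Applying such an operation to each region to be processed yields the target face images used for retraining.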
In the embodiments of the present application, a model to be optimized and its corresponding image data set are first acquired, and difficult-case images are then determined among the plurality of sample face images. For each difficult-case image, the image is segmented based on the face parts it contains to obtain a plurality of face part images, and a region to be processed is determined on the difficult-case image based on those face part images. The region to be processed is then augmented to obtain a corresponding target face image, and finally the model to be optimized is optimized using all the target face images to obtain a corresponding target model. With this scheme, the region that the model to be optimized focuses on excessively when processing a difficult-case image can be identified; by augmenting that region and adjusting and optimizing the model with the augmented images, the newly obtained target model can resist over-fitting to local regions of different face images, which improves the recognition accuracy of the model.
Referring to fig. 2, a flowchart of another model optimization method is provided in an embodiment of the present application. The flow shown in fig. 2 describes, on the basis of the flow shown in fig. 1 above, how to determine a region to be processed on the difficult-case image based on the plurality of face part images. As shown in fig. 2, the process may include the following steps:
S201, for each face part image, extracting preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features.
In the embodiments of the present application, feature extraction may first be performed on each face part image with a feature extraction model to obtain the corresponding local features (i.e., the preliminary image features), whose dimension is generally 4x4x128. In practice, the feature extraction model may be any convolutional neural network (e.g., a ResNeXt-101 model) or an attention-based sequence model. Each preliminary image feature is then pooled to obtain a target image feature.
Specifically, the performing pooling processing on the preliminary image features to obtain target image features may include the following steps: performing global maximum pooling on the preliminary image features in the spatial dimension to obtain a first image feature, performing global average pooling on the preliminary image features in the spatial dimension to obtain a second image feature, performing fusion processing on the first image feature and the second image feature in the channel dimension to obtain a corresponding fusion feature, and performing full-connection processing on the fusion feature to obtain the target image feature.
In this scheme, the preliminary image features are first pooled in parallel with global maximum pooling and global average pooling over the spatial dimensions, preserving the channel dimension, to obtain the pooled features (i.e., the first image feature and the second image feature), each generally of dimension 1x1x128. The features from the two pooling branches are then concatenated along the channel dimension with a concat operation to obtain the fusion feature, whose dimension is generally 1x1x(2x128); a full connection then maps the fusion feature back to 1x1x128.
Through this scheme, the preliminary image features are processed with two pooling modes to obtain the target image features, which increases feature richness and improves the feature characterization capability.
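The two-branch pooling can be sketched as below. Only the shapes come from the description (4x4x128 in, 1x1x256 fused, 1x1x128 out); the random full-connection weights are placeholders:

```python
import numpy as np

def pool_part_feature(feat, fc_weights=None):
    """Pool a (4, 4, 128) preliminary feature into a 128-dim target
    feature: parallel global max / average pooling over the spatial
    dims, channel-wise concatenation, then a full connection."""
    fmax = feat.max(axis=(0, 1))               # first image feature, (128,)
    favg = feat.mean(axis=(0, 1))              # second image feature, (128,)
    fused = np.concatenate([fmax, favg])       # fusion feature, (256,)
    if fc_weights is None:                     # placeholder FC weights
        fc_weights = np.random.default_rng(0).normal(size=(256, 128)) / 16.0
    return fused @ fc_weights                  # target feature, (128,)
```

Stacking the N target features of one difficult-case image then gives the Nx128 global feature described in S202.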
S202, performing stitching processing on all target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image.
In the embodiments of the present application, for each difficult-case image, the face feature of the entire face (i.e., the global feature) is obtained by stitching all the target image features corresponding to the difficult-case image; the dimension of the global feature is generally Nx128, where N is the number of target image features.
S203, determining fitting scores corresponding to the face part images based on the global features.
The fitting score characterizes the degree of attention the model to be optimized pays to the region where a face part image is located when recognizing the difficult-case image; the higher the fitting score of a face part image, the more attention the model pays to that region, and the greater the possibility of over-fitting.
In this embodiment, the specific implementation of determining the fitting score corresponding to each face part image based on the global feature may include the following steps: performing full-connection processing on the global feature to obtain a feature to be calculated, calculating the feature to be calculated with a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated, and, for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score of the face part image.
In this embodiment, the global feature is first passed through a fully connected layer to obtain the feature to be calculated, so that the feature dimension meets the input requirement of softmax (the normalized exponential function). Softmax is then applied to the feature to be calculated to obtain the weight score of each face part image within the whole difficult-case image, and that weight score is determined as the fitting score of the corresponding face part image.
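The score computation can be sketched as follows. The fully connected layer's weights are placeholders, since the description fixes only the Nx128 input and the softmax over the N parts:

```python
import numpy as np

def fitting_scores(global_feat, fc_weights=None):
    """Map an (N, 128) global feature to one fitting score per face
    part image: a full connection to N logits, then softmax so the
    scores are comparable and sum to 1."""
    if fc_weights is None:                     # placeholder FC weights
        fc_weights = np.random.default_rng(0).normal(size=(global_feat.shape[1],))
    logits = global_feat @ fc_weights          # one logit per part image
    e = np.exp(logits - logits.max())          # numerically stable softmax
    return e / e.sum()
```

Because of the softmax, identical part features receive identical scores, and all scores sum to 1.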
S204, sorting all face part images in descending order of fitting score.
S205, determining the preset number of top-ranked face part images as first target part images, and determining the regions corresponding to the first target part images in the difficult-case image as the region to be processed.
S204 and S205 are collectively described below:
the preset number can be set by the user according to actual requirements; in practical applications, it may be set to 1 or 2.
In the embodiment of the present application, all face part images are first sorted in descending order of fitting score, so that the earlier a face part image is ranked, the more attention the model to be optimized pays to its region. On this basis, the preset number of top-ranked face part images are determined as first target part images, and the regions corresponding to the first target part images in the difficult-case image are determined as the region to be processed. That is, the regions of the preset number of face part images that receive the most attention from the model to be optimized are determined as the region to be processed.
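Steps S204 and S205 amount to a top-k selection over the fitting scores. A small sketch, with made-up part names, bounding boxes, and scores:

```python
# Hypothetical fitting scores and (part name, bounding box) pairs; the
# boxes are (x1, y1, x2, y2) coordinates in the difficult-case image.
fitting_scores = [0.05, 0.40, 0.10, 0.35, 0.10]
parts = [
    ("left_eye",  (10, 10, 30, 20)),
    ("nose",      (25, 30, 45, 55)),
    ("mouth",     (20, 60, 50, 75)),
    ("right_eye", (40, 10, 60, 20)),
    ("chin",      (15, 75, 55, 95)),
]

preset_number = 2  # the description suggests 1 or 2 in practice

# S204: sort part indices in descending order of fitting score.
order = sorted(range(len(parts)), key=lambda i: fitting_scores[i], reverse=True)

# S205: the top-ranked parts are the first target part images, and their
# boxes form the region to be processed.
first_target_parts = [parts[i] for i in order[:preset_number]]
region_to_process = [box for _, box in first_target_parts]
```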
Through the flow shown in fig. 2, the regions of the preset number of face part images receiving the most attention from the model to be optimized are determined as the region to be processed, so that after the corresponding model optimization is performed subsequently, the newly obtained target model can resist overfitting to local regions of different face images, improving the recognition accuracy of the model.
Referring to fig. 3, a flowchart of another embodiment of a model optimization method is provided in the embodiment of the present application. The flow shown in fig. 3 describes, on the basis of the flow shown in fig. 1 above, how to determine a region to be processed on the difficult-case image based on a plurality of the face part images. As shown in fig. 3, the process may include the following steps:
S301, displaying a plurality of face part images through a visualization component;
S302, receiving first selection information input by a user based on the plurality of face part images;
S303, determining the face part image corresponding to the first selection information as a second target part image;
S304, determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
S301 to S304 are collectively described below:
In this embodiment, for each difficult-case image, a plurality of face part images corresponding to that difficult-case image may be displayed to a user through a visualization component, so that the user may make a selection among them according to actual experience or actual test data, that is, input corresponding first selection information. On this basis, the face part image corresponding to the first selection information is determined as a second target part image, and the region where the second target part image is located in the difficult-case image is determined as the region to be processed.
Through the flow shown in fig. 3, the region to be processed can be set flexibly according to user requirements, so that after the corresponding model optimization is performed subsequently, the newly obtained target model can resist overfitting to local regions of different face images, improving the recognition accuracy of the model.
Based on the same technical concept, the embodiment of the application further provides a model optimization device, as shown in fig. 4, which includes:
the obtaining module 401 is configured to obtain a model to be optimized and an image dataset corresponding to the model to be optimized, where the model to be optimized is used to identify a face image, and the image dataset includes a plurality of sample face images;
A first determining module 402, configured to determine a difficult case image from a plurality of the sample face images;
a segmentation module 403, configured to segment, for each difficult-case image, the difficult-case image based on a face part included in the difficult-case image, to obtain a plurality of face part images;
a second determining module 404, configured to determine a region to be processed on the difficult-case image based on a plurality of the face part images, where the region to be processed includes at least one face part;
a processing module 405, configured to perform augmentation processing on the area to be processed to obtain a corresponding target face image;
and the optimizing module 406 is configured to optimize the model to be optimized by using all the target face images to obtain a corresponding target model.
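The six modules above can be read as one pipeline. The sketch below wires them together with trivial stand-in implementations; every helper function, name, and data value is a hypothetical illustration, not the application's actual method:

```python
# Stand-in "model": a mapping from image id to a predicted identity.
def predict(model, image):
    return model.get(image["id"], "unknown")

def segment_face_parts(image):
    # Stand-in segmentation module: assume every face yields these parts.
    return ["left_eye", "right_eye", "nose", "mouth"]

def select_region(parts, preset_number):
    # Stand-in second determining module: take the first few parts.
    return parts[:preset_number]

def augment(image, region):
    # Stand-in processing module: record which region was augmented.
    return {**image, "augmented_region": region}

def build_target_face_images(model, dataset, preset_number=1):
    # First determining module: difficult-case images are those the
    # model misrecognizes.
    difficult = [img for img in dataset if predict(model, img) != img["label"]]
    # Segmentation -> region selection -> augmentation per difficult image.
    return [augment(img, select_region(segment_face_parts(img), preset_number))
            for img in difficult]

model = {"a": "alice", "b": "bob"}
dataset = [{"id": "a", "label": "alice"},
           {"id": "b", "label": "carol"}]
target_face_images = build_target_face_images(model, dataset)
# The optimization module would then fine-tune the model on these images.
```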
In one possible embodiment, the second determining module is further configured to:
extracting, for each face part image, preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features;
performing stitching processing on all target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image;
determining a fitting score corresponding to each face part image based on the global feature;
sorting all face part images in descending order of fitting score;
and determining the preset number of top-ranked face part images as first target part images, and determining the regions corresponding to the first target part images in the difficult-case image as the region to be processed.
In a possible embodiment, the second determining module is further configured to:
performing global maximum pooling processing on the preliminary image features in the spatial dimension to obtain first image features;
performing global average pooling processing on the preliminary image features in the spatial dimension to obtain second image features;
performing fusion processing on the first image features and the second image features in the channel dimension to obtain corresponding fusion features;
and performing fully connected processing on the fusion features to obtain the target image features.
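A minimal sketch of this pooling-and-fusion step, assuming a C×H×W preliminary feature map and a 128-dimensional target feature; the sizes and the random fully connected weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical preliminary image features for one face part image:
# C channels over an H x W spatial grid.
C, H, W = 64, 7, 7
preliminary = rng.standard_normal((C, H, W))

# Global max pooling and global average pooling over the spatial
# dimensions, each yielding a C-dimensional vector.
first_feature = preliminary.max(axis=(1, 2))    # first image feature
second_feature = preliminary.mean(axis=(1, 2))  # second image feature

# Fusion in the channel dimension: concatenate the two C-d vectors.
fused = np.concatenate([first_feature, second_feature])  # shape (2C,)

# Fully connected processing to obtain the target image feature; the
# weight matrix is a random stand-in for learned parameters.
W_fc = rng.standard_normal((2 * C, 128))
target_feature = fused @ W_fc  # shape (128,)
```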
In a possible embodiment, the second determining module is further configured to:
performing fully connected processing on the global feature to obtain a feature to be calculated;
calculating the feature to be calculated with a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated;
and, for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score corresponding to that face part image.
In a possible embodiment, the second determining module is further configured to:
displaying a plurality of face part images through a visualization component;
receiving first selection information input by a user based on the plurality of face part images;
determining the face part image corresponding to the first selection information as a second target part image;
and determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
In one possible embodiment, the first determining module is further configured to:
acquiring an image label of each sample face image, and acquiring a recognition result of the model to be optimized for each sample face image;
and, for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to it is inconsistent with its image label.
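This label-mismatch criterion for difficult-case images is straightforward to express; the image names, labels, and predictions below are made up:

```python
# For each sample face image: its image label and the recognition result
# of the model to be optimized.
image_labels = {"img_0": "alice", "img_1": "bob",   "img_2": "carol",
                "img_3": "dave",  "img_4": "erin"}
predictions  = {"img_0": "alice", "img_1": "carol", "img_2": "carol",
                "img_3": "erin",  "img_4": "erin"}

# A sample face image is a difficult-case image when its recognition
# result is inconsistent with its image label.
difficult_images = [name for name, label in image_labels.items()
                    if predictions[name] != label]
```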
In one possible embodiment, the first determining module is further configured to:
displaying a plurality of sample face images through a visualization component;
receiving second selection information input by a user based on the plurality of sample face images;
and determining the sample face image corresponding to the second selection information as a difficult-case image.
In the embodiment of the application, a model to be optimized and an image dataset corresponding to the model to be optimized are first acquired, and difficult-case images are then determined among the plurality of sample face images. For each difficult-case image, segmentation processing is performed on the difficult-case image based on the face parts it contains to obtain a plurality of face part images, and a region to be processed is determined on the difficult-case image based on the plurality of face part images. The region to be processed is then subjected to augmentation processing to obtain a corresponding target face image, and finally the model to be optimized is optimized using all the target face images to obtain a corresponding target model. With this scheme, the region to be processed, on which the model to be optimized focuses excessively when processing difficult-case images, can be determined; by performing augmentation processing on that region and adjusting and optimizing the model with the augmented images, the newly obtained target model can resist overfitting to local regions of different face images, improving the recognition accuracy of the model.
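The application leaves the concrete augmentation open; occluding (zeroing) the over-attended region is one plausible instantiation, shown here purely as an assumption with a made-up image and bounding box:

```python
import numpy as np

rng = np.random.default_rng(2)

# A made-up 96 x 96 RGB face image and a hypothetical bounding box for
# the region to be processed (x1, y1, x2, y2).
face = rng.integers(1, 256, size=(96, 96, 3), dtype=np.uint8)
x1, y1, x2, y2 = 20, 60, 50, 75

# One possible augmentation: occlude the region the model over-attends
# to, forcing subsequent fine-tuning to rely on other face parts.
target_face_image = face.copy()
target_face_image[y1:y2, x1:x2] = 0
```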
Based on the same technical concept, the embodiment of the present application further provides an electronic device, as shown in fig. 5, including a processor 111, a communication interface 112, a memory 113, and a communication bus 114, where the processor 111, the communication interface 112, and the memory 113 communicate with each other through the communication bus 114;
a memory 113 for storing a computer program;
the processor 111 is configured to execute a program stored in the memory 113, and implement the following steps:
acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying a face image, and the image data set contains a plurality of sample face images;
determining a difficult-case image among the plurality of sample face images;
for each difficult-case image, dividing the difficult-case image based on face parts contained in the difficult-case image to obtain a plurality of face part images;
determining a region to be processed on the difficult-case image based on a plurality of the face part images, wherein the region to be processed includes at least one face part;
performing augmentation processing on the region to be processed to obtain a corresponding target face image;
and optimizing the model to be optimized using all the target face images to obtain a corresponding target model.
The communication bus mentioned for the above electronic device may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include random access memory (Random Access Memory, RAM) or non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), and the like; it may also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment provided herein, there is also provided a computer readable storage medium having stored therein a computer program which when executed by a processor implements the steps of any of the model optimization methods described above.
In yet another embodiment provided herein, there is also provided a computer program product containing instructions that, when run on a computer, cause the computer to perform any of the model optimization methods of the above embodiments.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
From the above description of embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a general-purpose hardware platform, or by hardware. Based on this understanding, the foregoing technical solution, or the part of it contributing to the related art, may be embodied in the form of a software product stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments or in some parts thereof.
It is to be understood that the terminology used herein is for the purpose of describing particular example embodiments only, and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises," "comprising," "includes," "including," and "having" are inclusive and therefore specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order described or illustrated, unless an order of performance is explicitly stated. It should also be appreciated that additional or alternative steps may be used.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A method of model optimization, the method comprising:
acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying a face image, and the image data set contains a plurality of sample face images;
determining a difficult-case image among the plurality of said sample face images;
for each difficult-case image, dividing the difficult-case image based on face parts contained in the difficult-case image to obtain a plurality of face part images;
determining a region to be processed on the difficult-case image based on a plurality of the face part images, wherein the region to be processed includes at least one face part;
performing augmentation processing on the region to be processed to obtain a corresponding target face image;
and optimizing the model to be optimized by using all the target face images to obtain a corresponding target model.
2. The method of claim 1, wherein the determining a region to be processed on the difficult-case image based on a plurality of the face part images comprises:
extracting, for each face part image, preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features;
performing stitching processing on all target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image;
determining a fitting score corresponding to each face part image based on the global feature;
sorting all face part images in descending order of fitting score;
and determining the preset number of top-ranked face part images as first target part images, and determining the regions corresponding to the first target part images in the difficult-case image as the region to be processed.
3. The method according to claim 2, wherein the performing pooling processing on the preliminary image features to obtain target image features comprises:
performing global maximum pooling processing on the preliminary image features in the spatial dimension to obtain first image features;
performing global average pooling processing on the preliminary image features in the spatial dimension to obtain second image features;
performing fusion processing on the first image features and the second image features in the channel dimension to obtain corresponding fusion features;
and performing fully connected processing on the fusion features to obtain the target image features.
4. The method of claim 2, wherein the determining a fitting score corresponding to each face part image based on the global feature comprises:
performing fully connected processing on the global feature to obtain a feature to be calculated;
calculating the feature to be calculated with a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated;
and, for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score corresponding to that face part image.
5. The method of claim 1, wherein the determining a region to be processed on the difficult-case image based on a plurality of the face part images comprises:
displaying a plurality of face part images through a visualization component;
receiving first selection information input by a user based on the plurality of face part images;
determining the face part image corresponding to the first selection information as a second target part image;
and determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
6. The method of claim 1, wherein the determining a difficult-case image among the plurality of said sample face images comprises:
acquiring an image label of each sample face image, and acquiring a recognition result of the model to be optimized for each sample face image;
and, for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to it is inconsistent with its image label.
7. The method of claim 1, wherein the determining a difficult-case image among the plurality of said sample face images comprises:
displaying a plurality of sample face images through a visualization component;
receiving second selection information input by a user based on the plurality of sample face images;
and determining the sample face image corresponding to the second selection information as a difficult-case image.
8. A model optimization apparatus, the apparatus comprising:
the acquisition module is used for acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying face images, and the image data set contains a plurality of sample face images;
the first determining module is used for determining difficult-case images among the plurality of sample face images;
the segmentation module is used for performing, for each difficult-case image, segmentation processing on the difficult-case image based on the face parts it contains, to obtain a plurality of face part images;
the second determining module is used for determining a region to be processed on the difficult-case image based on a plurality of the face part images, where the region to be processed includes at least one face part;
the processing module is used for performing augmentation processing on the region to be processed to obtain a corresponding target face image;
and the optimization module is used for optimizing the model to be optimized by utilizing all the target face images to obtain a corresponding target model.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the model optimization method of any one of claims 1-7 when executing a program stored on a memory.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the model optimization method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311843499.8A | 2023-12-28 | 2023-12-28 | Model optimization method, device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117789275A | 2024-03-29 |
Family ID: 90398074 (status: pending)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||