CN117789275A - Model optimization method, device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117789275A
Authority
CN
China
Prior art keywords: image, face, images, difficult, model
Prior art date
Legal status: Pending
Application number
CN202311843499.8A
Other languages
Chinese (zh)
Inventor
王发发
Current Assignee
Beijing IQIYI Science and Technology Co Ltd
Original Assignee
Beijing IQIYI Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing IQIYI Science and Technology Co Ltd filed Critical Beijing IQIYI Science and Technology Co Ltd
Priority to CN202311843499.8A
Publication of CN117789275A

Landscapes

  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a model optimization method, a model optimization device, an electronic device, and a storage medium. A model to be optimized and an image data set corresponding to the model to be optimized are acquired, wherein the model to be optimized is used for recognizing face images and the image data set contains a plurality of sample face images; difficult-case images are determined among the plurality of sample face images; each difficult-case image is segmented based on the face parts it contains to obtain a plurality of face part images; a region to be processed is determined on the difficult-case image based on the plurality of face part images, wherein the region to be processed contains at least one face part; augmentation processing is performed on the region to be processed to obtain a corresponding target face image; and the model to be optimized is optimized using all the target face images to obtain a corresponding target model. The recognition accuracy of the model is thereby improved.

Description

Model optimization method, device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of model training technologies, and in particular, to a model optimization method and apparatus, an electronic device, and a storage medium.
Background
With the continuous progress of technology, face recognition has developed rapidly and is applied in more and more fields. In practical applications, the face recognition process generally relies on a face image recognition model. However, for current face image recognition models, most of the sample data used in training consists of sharp, clear images, and during training the model tends to overfit to local regions of simple samples (e.g., the eyes, mouth, or nose). Consequently, when the model is applied to a low-quality picture (e.g., one with low definition, noise, or distortion), false detections easily occur, both because of the poor picture quality and because the model pays excessive attention to local regions.
Disclosure of Invention
An object of the embodiments of the present application is to provide a model optimization method, apparatus, electronic device, and storage medium, so as to solve the problem that current face image recognition models are prone to false detections. The specific technical scheme is as follows:
in a first aspect, the present application provides a model optimization method, including:
acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying a face image, and the image data set contains a plurality of sample face images;
determining difficult-case images among the plurality of sample face images;
for each difficult-case image, segmenting the difficult-case image based on the face parts contained in the difficult-case image to obtain a plurality of face part images;
determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed contains at least one face part;
performing augmentation processing on the region to be processed to obtain a corresponding target face image;
and optimizing the model to be optimized by using all the target face images to obtain a corresponding target model.
In one possible implementation manner, the determining a region to be processed on the difficult-case image based on the plurality of face part images includes:
for each face part image, extracting preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features;
performing stitching processing on all target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image;
determining a fitting score corresponding to each face part image based on the global feature;
sorting all the face part images in descending order of fitting score;
and determining a preset number of top-ranked face part images as first target part images, and determining the regions corresponding to the first target part images in the difficult-case image as the region to be processed.
In one possible implementation manner, the performing pooling processing on the preliminary image features to obtain target image features includes:
performing global max pooling on the preliminary image features in the spatial dimension to obtain a first image feature;
performing global average pooling on the preliminary image features in the spatial dimension to obtain a second image feature;
performing fusion processing on the first image feature and the second image feature in the channel dimension to obtain a corresponding fused feature;
and performing full connection processing on the fused feature to obtain the target image feature.
In one possible implementation manner, the determining a fitting score corresponding to each face part image based on the global feature includes:
performing full connection processing on the global feature to obtain a feature to be calculated;
calculating the feature to be calculated using a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated;
and for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score corresponding to the face part image.
In one possible implementation manner, the determining a region to be processed on the difficult-case image based on the plurality of face part images includes:
displaying the plurality of face part images through a visualization component;
receiving first selection information input by a user based on the plurality of face part images;
determining the face part image corresponding to the first selection information as a second target part image;
and determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
In one possible implementation manner, the determining difficult-case images among the plurality of sample face images includes:
acquiring an image label of each sample face image, and acquiring a recognition result of the model to be optimized for each sample face image;
and for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to the sample face image is inconsistent with the image label corresponding to the sample face image.
In one possible implementation manner, the determining difficult-case images among the plurality of sample face images includes:
displaying the plurality of sample face images through a visualization component;
receiving second selection information input by a user based on the plurality of sample face images;
and determining the sample face image corresponding to the second selection information as a difficult-case image.
In a second aspect, the present application provides a model optimization apparatus, including:
the acquisition module is used for acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying face images, and the image data set contains a plurality of sample face images;
the first determining module is used for determining difficult-case images in the plurality of sample face images;
the segmentation module is used for segmenting, for each difficult-case image, the difficult-case image based on the face parts contained in the difficult-case image to obtain a plurality of face part images;
the second determining module is used for determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed comprises at least one face part;
the processing module is used for performing augmentation processing on the region to be processed to obtain a corresponding target face image;
and the optimization module is used for optimizing the model to be optimized by utilizing all the target face images to obtain a corresponding target model.
In one possible embodiment, the second determining module is further configured to:
extracting, for each face part image, preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features;
performing stitching processing on all target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image;
determining a fitting score corresponding to each face part image based on the global feature;
sorting all the face part images in descending order of fitting score;
and determining a preset number of top-ranked face part images as first target part images, and determining the regions corresponding to the first target part images in the difficult-case image as the region to be processed.
In a possible embodiment, the second determining module is further configured to:
performing global max pooling on the preliminary image features in the spatial dimension to obtain a first image feature;
performing global average pooling on the preliminary image features in the spatial dimension to obtain a second image feature;
performing fusion processing on the first image feature and the second image feature in the channel dimension to obtain a corresponding fused feature;
and performing full connection processing on the fused feature to obtain the target image feature.
In a possible embodiment, the second determining module is further configured to:
performing full connection processing on the global feature to obtain a feature to be calculated;
calculating the feature to be calculated using a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated;
and for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score corresponding to the face part image.
In a possible embodiment, the second determining module is further configured to:
displaying the plurality of face part images through a visualization component;
receiving first selection information input by a user based on the plurality of face part images;
determining the face part image corresponding to the first selection information as a second target part image;
and determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
In one possible embodiment, the first determining module is further configured to:
acquiring an image label of each sample face image, and acquiring a recognition result of the model to be optimized for each sample face image;
and for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to the sample face image is inconsistent with the image label corresponding to the sample face image.
In one possible embodiment, the first determining module is further configured to:
displaying the plurality of sample face images through a visualization component;
receiving second selection information input by a user based on the plurality of sample face images;
and determining the sample face image corresponding to the second selection information as a difficult-case image.
In a third aspect, an electronic device is provided, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with each other through the communication bus;
A memory for storing a computer program;
a processor for implementing the method steps of any implementation of the first aspect when executing the program stored in the memory.
In a fourth aspect, a computer-readable storage medium is provided, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method steps of any implementation of the first aspect.
In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform any of the model optimization methods described above.
The beneficial effects of the embodiment of the application are that:
in the embodiment of the application, a model to be optimized and an image data set corresponding to the model to be optimized are first acquired; difficult-case images are then determined among the plurality of sample face images; for each difficult-case image, the difficult-case image is segmented based on the face parts it contains to obtain a plurality of face part images, and a region to be processed is determined on the difficult-case image based on the plurality of face part images; the region to be processed is then subjected to augmentation processing to obtain a corresponding target face image; finally, the model to be optimized is optimized using all the target face images to obtain a corresponding target model. With this scheme, the region to be processed that the model to be optimized pays excessive attention to when processing difficult-case images can be identified; by performing augmentation processing on the region to be processed in each difficult-case image and tuning the model to be optimized with the augmented images, the newly obtained target model can resist the overfitting effect on local regions of different face images, thereby improving the recognition accuracy of the model.
Of course, not all of the above-described advantages need be achieved simultaneously in practicing any one of the products or methods of the present application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements, and in which the figures of the drawings are not to be taken in a limiting sense, unless otherwise indicated.
FIG. 1 is a flowchart of a model optimization method according to an embodiment of the present application;
FIG. 2 is a flow chart of another model optimization method provided in an embodiment of the present application;
FIG. 3 is a flowchart of another model optimization method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a model optimization device according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present application based on the embodiments herein.
The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. In order to simplify the present disclosure, components and arrangements of specific examples are described below. They are, of course, merely examples and are not intended to limit the invention. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
Fig. 1 is a schematic flowchart of a model optimization method according to an embodiment of the present application. The method can be applied to one or more electronic devices such as smart phones, notebook computers, desktop computers, portable computers, and servers. The execution subject of the method may be hardware or software. When the execution subject is hardware, it may be one or more of the above electronic devices; for example, a single electronic device may perform the method, or a plurality of electronic devices may cooperate with one another to perform it. When the execution subject is software, the method may be implemented as a plurality of software programs or software modules, or as a single software program or software module, which is not specifically limited herein.
As shown in fig. 1, the method specifically includes:
s101, acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying face images, and the image data set contains a plurality of sample face images.
The model to be optimized refers to a pre-trained model for identifying the face image.
An image dataset refers to a collection comprising a plurality of sample face images.
In the embodiment of the application, the face recognition model uploaded or designated by the user can be used as the model to be optimized, and the image set uploaded or designated by the user can be used as the image data set.
S102, determining difficult-case images among the plurality of sample face images.
In practical applications, the model to be optimized produces accurate detection results for some images, and these images may be referred to as good-case images. Correspondingly, the detection results of the model to be optimized for certain other images may not be accurate enough, and these images may be referred to as bad-case images.
As one possible implementation manner of the embodiment of the present application, determining the difficult-case images among the plurality of sample face images may include the following steps:
obtaining an image label of each sample face image, obtaining a recognition result of the model to be optimized for each sample face image, and, for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to the sample face image is inconsistent with the image label corresponding to the sample face image.
Through this embodiment, the sample face images that the model to be optimized recognizes incorrectly (namely, those whose recognition results do not match their corresponding labels) can be directly taken as difficult-case images. These difficult-case images can then be processed in a targeted manner, and the model to be optimized is optimized based on the processed images, thereby improving the recognition performance of the model.
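As an illustration of this label-mismatch mining step, the following is a minimal sketch assuming a PyTorch-style classifier; the model, data loader, and device names are hypothetical placeholders rather than part of the patent:

```python
import torch

def mine_hard_cases(model, data_loader, device="cpu"):
    """Collect sample face images whose recognition result differs from their image label."""
    model.eval()
    hard_cases = []
    with torch.no_grad():
        for images, labels in data_loader:      # batches of (sample face images, labels)
            logits = model(images.to(device))
            preds = logits.argmax(dim=1).cpu()
            for img, pred, label in zip(images, preds, labels):
                if pred.item() != label.item():  # recognition result inconsistent with label
                    hard_cases.append(img)       # keep as a difficult-case image
    return hard_cases
```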
As another possible implementation manner of the embodiment of the present application, determining the difficult-case images among the plurality of sample face images may include the following steps: displaying the plurality of sample face images through a visualization component, receiving second selection information input by a user based on the plurality of sample face images, and determining the sample face images corresponding to the second selection information as difficult-case images. With this implementation, the user can flexibly designate difficult-case images according to actual requirements.
S103, for each difficult-case image, segmenting the difficult-case image based on the face parts contained in the difficult-case image to obtain a plurality of face part images.
A face part image is an image that contains only the corresponding face part. For example, if difficult-case image 1 contains the eyes, nose, and mouth of a person, the partial image containing the eyes may be taken as the face part image corresponding to the eyes, the partial image containing the nose as the face part image corresponding to the nose, and the partial image containing the mouth as the face part image corresponding to the mouth. Three face part images are thus cut out of difficult-case image 1.
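To make the segmentation concrete, here is a hedged sketch that crops fixed-size part images around landmark coordinates; the landmark detector, the part names, and the crop half-size are assumptions for illustration only:

```python
import numpy as np

def crop_face_parts(image: np.ndarray, landmarks: dict, half: int = 32) -> dict:
    """Cut one face part image per face part, centered on that part's landmark.

    `landmarks` maps part names ("eyes", "nose", "mouth") to (x, y) pixel
    coordinates produced by any face landmark detector.
    """
    h, w = image.shape[:2]
    parts = {}
    for name, (x, y) in landmarks.items():
        top, left = max(0, y - half), max(0, x - half)
        parts[name] = image[top:min(h, y + half), left:min(w, x + half)]
    return parts

# e.g. crop_face_parts(img, {"eyes": (120, 90), "nose": (120, 130), "mouth": (120, 170)})
```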
S104, determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed contains at least one face part.
The region to be processed refers to a region that receives excessive attention while the model to be optimized is recognizing the difficult-case image; it may be the region of a single face part on the difficult-case image, or the region of several face parts.
How the region to be processed is determined on the difficult-case image based on the plurality of face part images is explained in detail later and is not elaborated here.
S105, performing augmentation processing on the region to be processed to obtain a corresponding target face image.
S106, optimizing the model to be optimized using all the target face images to obtain a corresponding target model.
S105 and S106 are collectively described below:
in the embodiment of the application, for each difficult-case image, occluding, deforming, or blurring the region to be processed in the difficult-case image can be performed to realize the augmentation processing of the region, so as to obtain the corresponding target face image. The model to be optimized is then retrained using all the target face images, so as to adjust and optimize the model to be optimized and obtain the corresponding target model.
Therefore, the obtained target model pays more attention to regions other than the region to be processed, so that it can resist the overfitting effect on local regions of different face images, and the recognition accuracy of the model is improved.
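To ground S105, the following is a minimal sketch of the occlusion/blur style of augmentation described above; representing the region to be processed as a pixel box and the OpenCV blur parameters are assumptions for illustration:

```python
import numpy as np
import cv2

def augment_region(image: np.ndarray, box: tuple, mode: str = "occlude") -> np.ndarray:
    """Return a target face image whose region to be processed is occluded or blurred.

    `box` is (x0, y0, x1, y1) in pixel coordinates.
    """
    x0, y0, x1, y1 = box
    out = image.copy()
    if mode == "occlude":
        out[y0:y1, x0:x1] = 0  # block the region with a solid patch
    elif mode == "blur":
        out[y0:y1, x0:x1] = cv2.GaussianBlur(out[y0:y1, x0:x1], (15, 15), 0)
    return out
```

The augmented target face images are then fed into an ordinary fine-tuning loop to adjust the model to be optimized.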
In the embodiment of the application, a model to be optimized and an image data set corresponding to the model to be optimized are first acquired; difficult-case images are then determined among the plurality of sample face images; for each difficult-case image, the difficult-case image is segmented based on the face parts it contains to obtain a plurality of face part images, and a region to be processed is determined on the difficult-case image based on the plurality of face part images; the region to be processed is then subjected to augmentation processing to obtain a corresponding target face image; finally, the model to be optimized is optimized using all the target face images to obtain a corresponding target model. With this scheme, the region to be processed that the model to be optimized pays excessive attention to when processing difficult-case images can be identified; by performing augmentation processing on the region to be processed in each difficult-case image and tuning the model to be optimized with the augmented images, the newly obtained target model can resist the overfitting effect on local regions of different face images, thereby improving the recognition accuracy of the model.
Referring to fig. 2, a flowchart of another model optimization method is provided in the embodiment of the present application. The flow shown in fig. 2 describes, on the basis of the flow shown in fig. 1, how to determine a region to be processed on the difficult-case image based on the plurality of face part images. As shown in fig. 2, the process may include the following steps:
S201, for each face part image, extracting preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features.
In this embodiment of the present application, feature extraction may first be performed on each face part image using a feature extraction model to obtain corresponding local features (i.e., preliminary image features), whose dimensions are generally 4x4x128. In practice, the feature extraction model may be any convolutional neural network (e.g., a ResNeXt-101 model) or an attention-based sequence model. Each preliminary image feature is then pooled to obtain a target image feature.
Specifically, performing pooling processing on the preliminary image features to obtain target image features may include the following steps: performing global max pooling on the preliminary image features in the spatial dimension to obtain a first image feature, performing global average pooling on the preliminary image features in the spatial dimension to obtain a second image feature, fusing the first image feature and the second image feature in the channel dimension to obtain a corresponding fused feature, and performing full connection processing on the fused feature to obtain the target image feature.
In this scheme, the preliminary image features are first pooled in parallel with global max pooling and global average pooling over the spatial dimension, retaining the channel dimension, to obtain the pooled features (namely the first image feature and the second image feature), whose dimensions are generally 1x1x128. The features obtained by the two pooling modes are then concatenated in the channel dimension (a concat operation) to obtain the fused feature, whose dimension is generally 1x1x(2x128); finally, a fully connected layer maps the fused feature back to 1x1x128.
Through this scheme, the preliminary image features are processed by two pooling modes to obtain the target image features, which increases feature richness and improves the feature representation capability.
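The following is a minimal PyTorch-style sketch of this dual pooling and fusion under the dimensions stated above (4x4x128 in, 128 out); the module name and batch handling are hypothetical:

```python
import torch
import torch.nn as nn

class DualPoolFusion(nn.Module):
    """Global max + average pooling over the spatial dims, channel-dim concat, then FC."""

    def __init__(self, channels: int = 128):
        super().__init__()
        self.fc = nn.Linear(2 * channels, channels)  # 1x1x(2x128) -> 1x1x128

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (N, C, H, W) preliminary image features, e.g. (N, 128, 4, 4)
        first = feat.amax(dim=(2, 3))              # global max pooling -> first image feature (N, C)
        second = feat.mean(dim=(2, 3))             # global average pooling -> second image feature (N, C)
        fused = torch.cat([first, second], dim=1)  # fusion in the channel dimension (N, 2C)
        return self.fc(fused)                      # target image feature (N, C)
```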
S202, performing stitching processing on all target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image.
In this embodiment of the present application, for each difficult-case image, the face feature of the entire face (i.e., the global feature) is obtained by stitching all the target image features corresponding to the difficult-case image; the dimension of the global feature is generally Nx128, where N is the number of target image features.
S203, determining the fitting score corresponding to each face part image based on the global feature.
The fitting score characterizes how much attention the model to be optimized pays to the region of a face part image when recognizing the difficult-case image: the higher the fitting score corresponding to a face part image, the more attention the model to be optimized pays to the region of that face part image, and the higher the likelihood of overfitting there.
In this embodiment, determining the fitting score corresponding to each face part image based on the global feature may include the following steps: performing full connection processing on the global feature to obtain a feature to be calculated, calculating the feature to be calculated using a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated, and, for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score corresponding to the face part image.
In this embodiment, the global feature is first passed through a fully connected layer to obtain the feature to be calculated, so that the feature dimension meets the input requirement of softmax (the normalized exponential function); softmax is then applied to the feature to be calculated to obtain the weight score of each face part image within the whole difficult-case image, and that weight score is determined as the fitting score of the corresponding face part image.
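Continuing the sketch, the following illustrates S202-S203: the N target image features are stacked into an Nx128 global feature, projected by a fully connected layer, and normalized with softmax. Projecting each part to a single scalar before softmax is an assumption about the FC layer's output shape:

```python
import torch
import torch.nn as nn

class FittingScorer(nn.Module):
    """Stitch per-part target features into a global feature and softmax it into fitting scores."""

    def __init__(self, channels: int = 128):
        super().__init__()
        self.fc = nn.Linear(channels, 1)  # assumed: one scalar per face part image

    def forward(self, part_feats: list) -> torch.Tensor:
        global_feat = torch.stack(part_feats)       # stitching -> global feature (N, 128)
        to_calc = self.fc(global_feat).squeeze(-1)  # feature to be calculated (N,)
        return torch.softmax(to_calc, dim=0)        # fitting score per face part image
```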
S204, sorting all the face part images in descending order of fitting score.
S205, determining a preset number of top-ranked face part images as first target part images, and determining the regions corresponding to the first target part images in the difficult-case image as the region to be processed.
S204 and S205 are collectively described below:
the preset number can be set by a user according to actual requirements, and in actual application, the preset number can be set to be 1 or 2.
In the embodiment of the present application, first, all face part images are ranked according to the order of the fitting score from high to low, so that the attention degree of the model to be optimized to the region where the face part images are located is higher for the face part images with the earlier ranking. On the basis, the front-ordered face part bitmap images with the preset number are determined to be first target part images, and the corresponding areas of the first target part images in the difficult-to-be-processed images are determined to be areas to be processed. Namely, the region where the preset number of face image points of the model to be optimized with higher attention is located is determined as the region to be processed.
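A small sketch of S204-S205, selecting the top-scoring parts and mapping them back to pixel regions; the box bookkeeping is a hypothetical illustration:

```python
import torch

def select_region_to_process(scores: torch.Tensor, part_boxes: list, preset_number: int = 1) -> list:
    """Pick the boxes of the `preset_number` face part images with the highest fitting scores."""
    top = torch.topk(scores, k=preset_number).indices.tolist()  # ranked high to low
    return [part_boxes[i] for i in top]  # region(s) to be processed, in pixel coordinates
```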
Through the flow shown in fig. 2, the regions of the preset number of face part images that the model to be optimized attends to most are determined as the region to be processed, so that after the corresponding model optimization is performed, the newly obtained target model can resist the overfitting effect on local regions of different face images, and the recognition accuracy of the model is improved.
Referring to fig. 3, a flowchart of another model optimization method is provided in the embodiment of the present application. The flow shown in fig. 3 describes, on the basis of the flow shown in fig. 1, another way to determine a region to be processed on the difficult-case image based on the plurality of face part images. As shown in fig. 3, the process may include the following steps:
S301, displaying the plurality of face part images through a visualization component;
S302, receiving first selection information input by a user based on the plurality of face part images;
S303, determining the face part image corresponding to the first selection information as a second target part image;
S304, determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
S301 to S304 are collectively described below:
In this embodiment, for each difficult-case image, the plurality of face part images corresponding to the difficult-case image may be displayed to a user through a visualization component. The user can then make a selection among the displayed face part images according to practical experience or actual test data, i.e., input the corresponding first selection information. On this basis, the face part image corresponding to the first selection information is determined as a second target part image, and the region of the second target part image in the difficult-case image is determined as the region to be processed.
Through the flow shown in fig. 3, the region to be processed can be set flexibly according to user requirements, so that after the corresponding model optimization is performed, the newly obtained target model can resist the overfitting effect on local regions of different face images, and the recognition accuracy of the model is improved.
Based on the same technical concept, the embodiment of the application further provides a model optimization device, as shown in fig. 4, which includes:
the obtaining module 401 is configured to obtain a model to be optimized and an image dataset corresponding to the model to be optimized, where the model to be optimized is used to identify a face image, and the image dataset includes a plurality of sample face images;
a first determining module 402, configured to determine difficult-case images among the plurality of sample face images;
a segmentation module 403, configured to segment, for each difficult-case image, the difficult-case image based on the face parts contained in the difficult-case image to obtain a plurality of face part images;
a second determining module 404, configured to determine a region to be processed on the difficult-case image based on the plurality of face part images, where the region to be processed includes at least one face part;
a processing module 405, configured to perform augmentation processing on the region to be processed to obtain a corresponding target face image;
and the optimizing module 406 is configured to optimize the model to be optimized by using all the target face images to obtain a corresponding target model.
In one possible embodiment, the second determining module is further configured to:
extracting, for each face part image, preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features;
performing stitching processing on all target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image;
determining a fitting score corresponding to each face part image based on the global feature;
sorting all the face part images in descending order of fitting score;
and determining a preset number of top-ranked face part images as first target part images, and determining the regions corresponding to the first target part images in the difficult-case image as the region to be processed.
In a possible embodiment, the second determining module is further configured to:
performing global max pooling on the preliminary image features in the spatial dimension to obtain a first image feature;
performing global average pooling on the preliminary image features in the spatial dimension to obtain a second image feature;
performing fusion processing on the first image feature and the second image feature in the channel dimension to obtain a corresponding fused feature;
and performing full connection processing on the fused feature to obtain the target image feature.
In a possible embodiment, the second determining module is further configured to:
performing full connection processing on the global feature to obtain a feature to be calculated;
calculating the feature to be calculated using a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated;
and for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score corresponding to the face part image.
In a possible embodiment, the second determining module is further configured to:
displaying the plurality of face part images through a visualization component;
receiving first selection information input by a user based on the plurality of face part images;
determining the face part image corresponding to the first selection information as a second target part image;
and determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
In one possible embodiment, the first determining module is further configured to:
acquiring an image label of each sample face image, and acquiring a recognition result of the model to be optimized for each sample face image;
and for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to the sample face image is inconsistent with the image label corresponding to the sample face image.
In one possible embodiment, the first determining module is further configured to:
displaying the plurality of sample face images through a visualization component;
receiving second selection information input by a user based on the plurality of sample face images;
and determining the sample face image corresponding to the second selection information as a difficult-case image.
In the embodiment of the application, a model to be optimized and an image data set corresponding to the model to be optimized are first acquired; difficult-case images are then determined among the plurality of sample face images; for each difficult-case image, the difficult-case image is segmented based on the face parts it contains to obtain a plurality of face part images, and a region to be processed is determined on the difficult-case image based on the plurality of face part images; the region to be processed is then subjected to augmentation processing to obtain a corresponding target face image; finally, the model to be optimized is optimized using all the target face images to obtain a corresponding target model. With this scheme, the region to be processed that the model to be optimized pays excessive attention to when processing difficult-case images can be identified; by performing augmentation processing on the region to be processed in each difficult-case image and tuning the model to be optimized with the augmented images, the newly obtained target model can resist the overfitting effect on local regions of different face images, thereby improving the recognition accuracy of the model.
Based on the same technical concept, the embodiment of the present application further provides an electronic device, as shown in fig. 5, including a processor 111, a communication interface 112, a memory 113, and a communication bus 114, where the processor 111, the communication interface 112, and the memory 113 communicate with each other through the communication bus 114,
a memory 113 for storing a computer program;
the processor 111 is configured to execute a program stored in the memory 113, and implement the following steps:
acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying a face image, and the image data set contains a plurality of sample face images;
determining difficult-case images among the plurality of sample face images;
for each difficult-case image, segmenting the difficult-case image based on the face parts contained in the difficult-case image to obtain a plurality of face part images;
determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed contains at least one face part;
performing augmentation processing on the region to be processed to obtain a corresponding target face image;
And optimizing the model to be optimized by using all the target face images to obtain a corresponding target model.
The communication bus of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be classified into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one bold line is shown in the figure, but this does not mean there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include Random Access Memory (RAM) or Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment provided herein, there is also provided a computer readable storage medium having stored therein a computer program which when executed by a processor implements the steps of any of the model optimization methods described above.
In yet another embodiment provided herein, there is also provided a computer program product containing instructions that, when run on a computer, cause the computer to perform any of the model optimization methods of the above embodiments.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
From the above description of embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by software plus a general-purpose hardware platform, or by hardware. Based on this understanding, the essence of the foregoing technical solution, or the part contributing to the related art, may be embodied in the form of a software product stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the respective embodiments or in some parts of the embodiments.
It is to be understood that the terminology used herein is for the purpose of describing particular example embodiments only, and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises," "comprising," "includes," "including," and "having" are inclusive and therefore specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order described or illustrated, unless an order of performance is explicitly stated. It should also be appreciated that additional or alternative steps may be used.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of model optimization, the method comprising:
acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying a face image, and the image data set contains a plurality of sample face images;
determining difficult-case images among the plurality of sample face images;
for each difficult-case image, segmenting the difficult-case image based on the face parts contained in the difficult-case image to obtain a plurality of face part images;
determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed contains at least one face part;
performing augmentation processing on the region to be processed to obtain a corresponding target face image;
and optimizing the model to be optimized by using all the target face images to obtain a corresponding target model.
2. The method of claim 1, wherein the determining a region to be processed on the difficult-case image based on the plurality of face part images comprises:
extracting, for each face part image, preliminary image features of the face part image, and performing pooling processing on the preliminary image features to obtain target image features;
performing stitching processing on all target image features corresponding to the difficult-case image to obtain a global feature corresponding to the difficult-case image;
determining a fitting score corresponding to each face part image based on the global feature;
sorting all the face part images in descending order of fitting score;
and determining a preset number of top-ranked face part images as first target part images, and determining the regions corresponding to the first target part images in the difficult-case image as the region to be processed.
3. The method according to claim 2, wherein the performing pooling processing on the preliminary image features to obtain target image features comprises:
performing global max pooling on the preliminary image features in the spatial dimension to obtain a first image feature;
performing global average pooling on the preliminary image features in the spatial dimension to obtain a second image feature;
performing fusion processing on the first image feature and the second image feature in the channel dimension to obtain a corresponding fused feature;
and performing full connection processing on the fused feature to obtain the target image feature.
4. The method of claim 2, wherein the determining a fitting score corresponding to each face part image based on the global feature comprises:
performing full connection processing on the global feature to obtain a feature to be calculated;
calculating the feature to be calculated using a normalized exponential function to obtain a weight score corresponding to each preliminary image feature in the feature to be calculated;
and for each face part image, determining the weight score of the preliminary image feature corresponding to the face part image as the fitting score corresponding to the face part image.
5. The method of claim 1, wherein the determining a region to be processed on the difficult-case image based on the plurality of face part images comprises:
displaying the plurality of face part images through a visualization component;
receiving first selection information input by a user based on the plurality of face part images;
determining the face part image corresponding to the first selection information as a second target part image;
and determining the region corresponding to the second target part image in the difficult-case image as the region to be processed.
6. The method of claim 1, wherein the determining difficult-case images among the plurality of sample face images comprises:
acquiring an image label of each sample face image, and acquiring a recognition result of the model to be optimized for each sample face image;
and for each sample face image, determining the sample face image to be a difficult-case image when the recognition result corresponding to the sample face image is inconsistent with the image label corresponding to the sample face image.
7. The method of claim 1, wherein the determining difficult-case images among the plurality of sample face images comprises:
displaying the plurality of sample face images through a visualization component;
receiving second selection information input by a user based on the plurality of sample face images;
and determining the sample face image corresponding to the second selection information as a difficult-case image.
8. A model optimization apparatus, the apparatus comprising:
the acquisition module is used for acquiring a model to be optimized and an image data set corresponding to the model to be optimized, wherein the model to be optimized is used for identifying face images, and the image data set contains a plurality of sample face images;
the first determining module is used for determining difficult-case images in the plurality of sample face images;
the segmentation module is used for segmenting, for each difficult-case image, the difficult-case image based on the face parts contained in the difficult-case image to obtain a plurality of face part images;
the second determining module is used for determining a region to be processed on the difficult-case image based on the plurality of face part images, wherein the region to be processed comprises at least one face part;
the processing module is used for performing augmentation processing on the region to be processed to obtain a corresponding target face image;
and the optimization module is used for optimizing the model to be optimized by utilizing all the target face images to obtain a corresponding target model.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the model optimization method of any one of claims 1-7 when executing a program stored on a memory.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the model optimization method according to any one of claims 1-7.
CN202311843499.8A 2023-12-28 2023-12-28 Model optimization method, device, electronic equipment and storage medium Pending CN117789275A (en)

Priority Applications (1)

Application Number: CN202311843499.8A | Priority Date: 2023-12-28 | Filing Date: 2023-12-28 | Title: Model optimization method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number: CN202311843499.8A | Priority Date: 2023-12-28 | Filing Date: 2023-12-28 | Title: Model optimization method, device, electronic equipment and storage medium

Publications (1)

Publication Number: CN117789275A | Publication Date: 2024-03-29

Family

ID=90398074

Family Applications (1)

Application Number: CN202311843499.8A | Title: Model optimization method, device, electronic equipment and storage medium | Priority Date: 2023-12-28 | Filing Date: 2023-12-28

Country Status (1)

Country Link
CN (1) CN117789275A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination