CN118134765A - Image processing method, apparatus and storage medium - Google Patents


Info

Publication number
CN118134765A
CN118134765A (application CN202410534365.6A)
Authority
CN
China
Prior art keywords
image
super
algorithm
target
sub
Prior art date
Legal status
Granted
Application number
CN202410534365.6A
Other languages
Chinese (zh)
Other versions
CN118134765B (en)
Inventor
孟祥飞
徐悦然
康波
庞晓磊
刘腾萧
赵欣婷
李长松
吴琪
Current Assignee
National Supercomputer Center In Tianjin
Original Assignee
National Supercomputer Center In Tianjin
Priority date
Filing date
Publication date
Application filed by National Supercomputer Center in Tianjin
Priority to CN202410534365.6A
Publication of CN118134765A
Application granted
Publication of CN118134765B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Image Processing (AREA)

Abstract

The invention relates to the field of image super-resolution and discloses an image processing method, apparatus, and storage medium. The method comprises: segmenting an image to be processed to determine a first partial image and a second partial image, and dividing the first partial image into a plurality of first sub-images based on a preset size; performing super-resolution processing on each first sub-image with each first super-division algorithm in a first super-division algorithm library to obtain the first super-division images corresponding to each algorithm, and determining a first target image for each first sub-image from its first super-division images; performing super-resolution processing on the second partial image based on a second super-division algorithm library to determine a second target image corresponding to the second partial image; and splicing the first target images and the second target image to obtain the target super-division image corresponding to the image to be processed. This technical scheme effectively improves the super-resolution effect of the image.

Description

Image processing method, apparatus and storage medium
Technical Field
The present invention relates to the field of image super-resolution, and in particular, to an image processing method, apparatus, and storage medium.
Background
With the popularization of high-definition displays, users place increasingly high demands on image quality. However, factors such as the shooting equipment, transmission bandwidth, and delivery strategy often leave the terminal displaying images at low resolution, with a poor display effect. The technique of recovering a high-resolution output from a low-resolution input to solve this display-quality problem is called image super-resolution, also known as image super-division.
Currently, image super-resolution algorithms fall mainly into interpolation-based methods, reconstruction-based methods, methods based on machine learning and sample statistics, and methods based on deep neural networks. Interpolation-based and reconstruction-based algorithms perform poorly, while methods based on machine learning and sample statistics or on deep neural networks perform well but are difficult to deploy because of computing-power and hardware constraints.
In view of this, the present invention has been made.
Disclosure of Invention
To solve the above technical problems, the invention provides an image processing method, apparatus, and storage medium that improve the image super-resolution effect.
The embodiment of the invention provides an image processing method, which comprises the following steps:
dividing an image to be processed, determining a first partial image and a second partial image, and dividing the first partial image into a plurality of first sub-images based on a preset size;
Performing super resolution processing on each first sub-image based on each first super-division algorithm in a first super-division algorithm library to obtain first super-division images corresponding to each first super-division algorithm, and determining a first target image corresponding to each first sub-image according to each first super-division image;
performing super-resolution processing on the second partial image based on a second super-resolution algorithm library, and determining a second target image corresponding to the second partial image;
and splicing the first target images and the second target image to obtain the target super-division image corresponding to the image to be processed.
An embodiment of the invention provides an electronic device, comprising:
A processor and a memory;
the processor is configured to execute the steps of the image processing method according to any of the embodiments by calling a program or instructions stored in the memory.
An embodiment of the present invention provides a computer-readable storage medium storing a program or instructions that cause a computer to execute the steps of the image processing method described in any of the embodiments.
The embodiment of the invention has the following technical effects:
By segmenting the image to be processed into a first partial image and a second partial image and dividing the first partial image into a plurality of first sub-images based on a preset size, the portion of main interest is refined. Performing super-resolution processing on each first sub-image with each first super-division algorithm in the first super-division algorithm library, obtaining the first super-division images corresponding to each algorithm, and determining the first target image for each first sub-image from these first super-division images selects the most suitable super-division algorithm for each refined first sub-image and improves the universality of the super-division algorithms. Performing super-resolution processing on the second partial image based on the second super-division algorithm library and determining the corresponding second target image selects a suitable super-division algorithm for the non-focused portion. Finally, splicing the first target images and the second target image yields the target super-division image corresponding to the image to be processed, effectively improving the super-resolution effect of the image.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of an image processing method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the invention, are within the scope of the invention.
Embodiment one:
The image processing method provided by the embodiment of the invention is mainly suitable for the situation of super-resolution processing of the image comprising the first part and the second part. The image processing method provided by the embodiment of the invention can be executed by the electronic equipment.
Fig. 1 is a flowchart of an image processing method according to an embodiment of the present invention. Referring to fig. 1, the image processing method specifically includes:
s110, dividing the image to be processed, determining a first partial image and a second partial image, and dividing the first partial image into a plurality of first sub-images based on a preset size.
The image to be processed is an image which needs to be processed in super resolution, and the image to be processed comprises a first part and a second part. It should be noted that the first portion is a portion of main interest in the image to be processed, and the second portion is a remaining portion of the image to be processed other than the first portion. For example, the first portion may be a character portion and the second portion may be a background portion. The first partial image and the second partial image are respectively a partial image corresponding to the first portion and a partial image corresponding to the second portion. The preset size is a size preset for performing a segmentation process on the first partial image, for example, 64×64 or the like. The first sub-image is each sub-image obtained by image segmentation of the first partial image.
Specifically, the image to be processed is segmented according to the super-resolution object requirement, the image of the main attention part is determined to be a first part image, and the rest part image is determined to be a second part image. Further, the first partial image is segmented according to a preset size, and a plurality of first sub-images are obtained.
For example, an image segmentation method is used to divide the image to be processed into two parts, a person part and a background part, that is, the first partial image and the second partial image. If the image to be processed is a digital-human image, the segmentation may adopt a UNet-based method; because the positions of the person and the background in a digital-human picture are relatively fixed, the ReLU activations in UNet can be replaced with PReLU activation functions to avoid the phenomenon of gradient explosion. The first partial image is segmented into first sub-images of 64×64 (the preset size) so that the color within each segmented block is relatively uniform. If a first sub-image lies in a boundary area and is smaller than 64×64, it is padded using the RGB mean value of its existing pixels.
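Purely as an illustration (not part of the patent), the tiling step just described, including the RGB-mean padding of boundary tiles, might look like the following sketch; the function name and the fixed 64×64 tile size are assumptions:

```python
# Hypothetical sketch: split a "first partial image" into fixed-size
# tiles, padding boundary tiles with the mean RGB value of the pixels
# that are present, as the example above describes.
import numpy as np

TILE = 64  # the patent's example preset size

def split_into_tiles(img: np.ndarray, tile: int = TILE) -> list:
    """Split an H x W x 3 image into tile x tile blocks, padding edges."""
    h, w, _ = img.shape
    tiles = []
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            block = img[y:y + tile, x:x + tile]
            bh, bw, _ = block.shape
            if bh < tile or bw < tile:
                # Boundary tile: fill the missing area with the mean RGB
                # of the existing pixels.
                mean_rgb = block.reshape(-1, 3).mean(axis=0)
                padded = np.tile(mean_rgb, (tile, tile, 1)).astype(img.dtype)
                padded[:bh, :bw] = block
                block = padded
            tiles.append(block)
    return tiles
```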
S120, performing super-resolution processing on each first sub-image based on each first super-division algorithm in the first super-division algorithm library to obtain first super-division images corresponding to each first super-division algorithm, and determining a first target image corresponding to each first sub-image according to each first super-division image.
The first super-division algorithm library stores a plurality of algorithms, i.e., a plurality of first super-division algorithms, for performing super-resolution processing on the image of the first portion, that is, the focused portion. A first super-division image is an image obtained by performing super-resolution processing on a first sub-image with one of the first super-division algorithms. It will be appreciated that each first sub-image corresponds to a plurality of first super-division images, their number matching the number of first super-division algorithms. The first target image is the suitably super-resolved result image corresponding to the first sub-image, for example, the one of that first sub-image's first super-division images with the best super-division effect.
Specifically, the same processing can be performed on each first sub-image, and each first super-division algorithm in the first super-division algorithm library is used for performing super-resolution processing on the first sub-image respectively to obtain a first super-division image corresponding to each first super-division algorithm. Further, the first super-resolution images are subjected to effect analysis, and one of the first super-resolution images having the best super-resolution effect is determined as the first target image corresponding to the first sub-image.
For example, the effect analysis of the first super-division images may proceed as follows: for each first super-division image, the RGB value difference between each of its pixels and the corresponding pixel of the first sub-image is computed, and the variance of these per-pixel differences is calculated; the first super-division image with the smallest variance is then determined as the first target image corresponding to the first sub-image.
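The variance-based selection above can be sketched as follows. The patent does not say how the differing resolutions are matched, so this sketch assumes each super-division result has been resized back to the sub-image's size before comparison; names are illustrative:

```python
# Hypothetical sketch of variance-based effect analysis: pick the
# candidate whose per-pixel RGB difference against the original
# sub-image has the smallest variance. Candidates are assumed to be
# the same shape as the sub-image.
import numpy as np

def pick_by_variance(sub_image: np.ndarray, candidates: list) -> int:
    """Return the index of the candidate with the smallest
    difference variance (the patent's selection criterion)."""
    variances = []
    for cand in candidates:
        diff = cand.astype(np.float64) - sub_image.astype(np.float64)
        variances.append(diff.var())
    return int(np.argmin(variances))
```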
On the basis of the above example, the super-resolution processing may be performed on each first sub-image based on each first super-division algorithm in the first super-division algorithm library in the following manner, so as to obtain a first super-division image corresponding to each first super-division algorithm:
for each first sub-image, determining a center image of the first sub-image according to the first sub-image;
And respectively carrying out super-resolution processing on the center image based on each first super-division algorithm in the first super-division algorithm library, and determining a first super-division image corresponding to each first super-division algorithm according to each processing result corresponding to the center image.
Wherein the center image is a partial image of the center region of the first sub-image, for example: the size of the first sub-image is 64×64, and the central image of the first sub-image is a partial image with a central size of 32×32 of the first sub-image, where the size of the central image may be determined according to the requirement, and is not specifically limited herein. The processing result is a result of super-resolution processing of the center image using various first super-resolution algorithms.
Specifically, determining a middle pixel point of the first sub-image, and expanding according to a preset center size by taking the middle pixel point as a center to obtain a center image of the first sub-image. The same processing can be carried out on the central image of each first sub-image, super-resolution processing is carried out on the central image by using each first super-division algorithm in the first super-division algorithm library, processing results corresponding to each first super-division algorithm are obtained, and each processing result is determined to be a first super-division image corresponding to each first super-division algorithm.
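A minimal sketch of the center-image extraction just described: take a crop of a preset center size around the middle pixel of the sub-image. The function name is illustrative; even dimensions are assumed, and the 32×32 default follows the patent's example:

```python
# Hypothetical sketch: crop the central region of a sub-image,
# expanding from the middle pixel by the preset center size.
import numpy as np

def center_crop(sub_image: np.ndarray, size: int = 32) -> np.ndarray:
    h, w = sub_image.shape[:2]
    cy, cx = h // 2, w // 2  # middle pixel
    half = size // 2
    return sub_image[cy - half:cy + half, cx - half:cx + half]
```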
On the basis of the above example, the first target image corresponding to each first sub-image may be determined according to each first super-division image by:
for each first super-division algorithm, determining at least two similarity values according to a first super-division image and a center image corresponding to the first super-division algorithm;
determining a first evaluation value corresponding to a first super-division algorithm according to at least two similarity values and algorithm weights corresponding to the first super-division algorithm;
And determining a first target algorithm according to the first evaluation value corresponding to each first super-resolution algorithm, and performing super-resolution processing on the first sub-image according to the first target algorithm to obtain a first target image.
The similarity value is a numerical value, calculated by a similarity algorithm, that describes the similarity between the first super-division image and the center image. The algorithm weight is a preset weight value describing the importance and effectiveness of the different super-division algorithms. The first evaluation value integrates the similarity values and the algorithm weight, and evaluates the effect of the corresponding first super-division algorithm on the center image. The first target algorithm is the first super-division algorithm corresponding to the optimal first evaluation value and is used to obtain the first target image.
Specifically, for each first super-division algorithm, similarity calculation is performed between its first super-division image and the center image using several pre-selected similarity algorithms (such as Euclidean distance, Manhattan distance, cosine similarity, Pearson correlation coefficient, and Jaccard coefficient) to obtain a similarity value per similarity algorithm. The similarity values are normalized and combined, for example by averaging, and the combined value is multiplied by the algorithm weight corresponding to the first super-division algorithm to obtain its first evaluation value. The first evaluation values of the first super-division algorithms are compared, the optimal one is selected, and the first super-division algorithm corresponding to it is determined as the first target algorithm. Super-resolution processing is then performed on the first sub-image with the first target algorithm to obtain the first target image.
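The steps above (several similarity measurements, normalization, averaging, weighting) can be sketched as follows. The patent leaves the normalization scope ambiguous, so this sketch min-max normalizes each metric across the candidate algorithms; the metric choice (three distance-style measures, smaller is better) and all names are assumptions:

```python
# Hypothetical sketch of computing one "first evaluation value" per
# candidate super-division algorithm from multiple similarity metrics.
import numpy as np

def _distances(img: np.ndarray, ref: np.ndarray) -> np.ndarray:
    a = img.astype(np.float64).ravel()
    b = ref.astype(np.float64).ravel()
    cos = 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return np.array([np.linalg.norm(a - b),   # Euclidean distance
                     np.abs(a - b).sum(),     # Manhattan distance
                     cos])                    # cosine distance

def evaluation_values(results: list, center: np.ndarray,
                      weights: list) -> list:
    """One evaluation value per candidate algorithm (smaller = better)."""
    d = np.array([_distances(r, center) for r in results])  # (n_alg, n_metric)
    lo, hi = d.min(axis=0), d.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    normed = (d - lo) / span  # min-max normalize each metric across algorithms
    return [float(m * w) for m, w in zip(normed.mean(axis=1), weights)]
```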
On the basis of the above example, the first target algorithm may be determined according to the first evaluation value corresponding to each first super-division algorithm, and super-resolution processing may be performed on the first sub-image according to the first target algorithm to obtain the first target image, in the following manner:
Judging whether a first evaluation value corresponding to each first super-division algorithm meets a preset evaluation threshold or not according to each first super-division algorithm, and if so, determining the first evaluation value as a filtering evaluation value;
If at least two filtering evaluation values exist, a first super-resolution algorithm corresponding to each filtering evaluation value is used as a first to-be-selected algorithm, and super-resolution processing is respectively carried out on the first sub-image based on each first to-be-selected algorithm, so that a first to-be-selected image corresponding to each first to-be-selected algorithm is obtained;
for each first image to be selected, determining a first similarity according to the first image to be selected and the first sub-image;
And determining a first target algorithm according to the first similarity corresponding to each first to-be-selected algorithm, and determining the first to-be-selected image corresponding to the first target algorithm as the first target image.
The preset evaluation threshold is a preset value for measuring whether each first evaluation value reaches the initial effectiveness requirement. A filtering evaluation value is a first evaluation value that satisfies the preset evaluation threshold. A first to-be-selected algorithm is the first super-division algorithm corresponding to a filtering evaluation value. A first to-be-selected image is the image obtained by performing super-resolution processing on the whole first sub-image with a first to-be-selected algorithm. The first similarity is a value measuring the similarity between a first to-be-selected image and the first sub-image, and may be determined by one or more similarity algorithms, which are not described herein. The first target algorithm is the first to-be-selected algorithm whose first similarity is optimal.
Specifically, for each first super-division algorithm, it is determined whether its first evaluation value satisfies the preset evaluation threshold; if so, that first evaluation value is determined as a filtering evaluation value. Thus, the number of filtering evaluation values may be zero, one, or at least two. If there are at least two filtering evaluation values, the first super-division algorithm corresponding to each filtering evaluation value is taken as a first to-be-selected algorithm. Super-resolution processing is then performed on the first sub-image with each first to-be-selected algorithm to obtain the corresponding first to-be-selected images. The similarity between each first to-be-selected image and the first sub-image is determined by a similarity algorithm and combined with the corresponding algorithm weight to obtain the first similarity. The first to-be-selected algorithm with the optimal first similarity is determined as the first target algorithm, and its first to-be-selected image is determined as the first target image.
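The threshold filtering just described can be sketched as follows: keep only the algorithms whose evaluation value meets the threshold and, when two or more survive, pick the one whose full-tile result is most similar to the original sub-image. The names and the similarity measure (mean absolute difference, smaller is better) are assumptions, not from the patent:

```python
# Hypothetical sketch of threshold filtering plus tie-breaking by
# full-tile similarity. Returns None when nothing passes, so the
# caller can fall back to re-splitting the sub-image.
import numpy as np

def select_candidate(eval_values, candidate_images, sub_image, threshold):
    """Return the index of the selected algorithm, or None if none passes."""
    passing = [i for i, v in enumerate(eval_values) if v <= threshold]
    if not passing:
        return None                  # caller falls back to re-splitting
    if len(passing) == 1:
        return passing[0]
    ref = sub_image.astype(np.float64)

    def mad(i):  # mean absolute difference to the original tile
        return np.abs(candidate_images[i].astype(np.float64) - ref).mean()

    return min(passing, key=mad)
```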
On the basis of the above example, after the first evaluation value is determined as the filter evaluation value, there may be no filter evaluation value or only one filter evaluation value, which are handled as follows:
if the filtering evaluation value does not exist, judging whether the image size of the first sub-image is larger than the minimum size;
If yes, dividing the first sub-image based on the preset number, updating the first sub-images, and returning to execute, for each first sub-image, the operation of performing super-resolution processing on the first sub-image based on each first super-division algorithm in the first super-division algorithm library to obtain the first super-division images corresponding to each first super-division algorithm;
If not, determining the minimum value in each first evaluation value as a first target value, and determining a first super-division algorithm corresponding to the first target value as a first target algorithm;
And performing super-resolution processing on the first sub-image according to a first target algorithm to obtain a first target image.
The minimum size is a predetermined minimum size of the divided image. The preset number is a preset value for segmenting the first sub-image, which is understood to mean that one first sub-image is segmented into a preset number of images, and these images are taken as new first sub-images. The first target value is a minimum value among the first evaluation values.
Specifically, if no filtering evaluation value exists, the current first super-division algorithms cannot satisfactorily super-resolve the first sub-image, so it is determined whether the image size of the first sub-image is greater than the minimum size, that is, whether the first sub-image can be further segmented. If yes, the first sub-image is segmented into a preset number of images that serve as new first sub-images, and execution returns to the operation of performing super-resolution processing on each first sub-image based on each first super-division algorithm in the first super-division algorithm library to obtain the corresponding first super-division images, so that a suitable first super-division algorithm can be found for the smaller first sub-images. If not, the first sub-image is too small to continue segmenting; therefore, according to the evaluation conditions, the minimum value among the first evaluation values is determined as the first target value, and the first super-division algorithm corresponding to the first target value is determined as the first target algorithm. Super-resolution processing is performed on the first sub-image with the first target algorithm, and the resulting image is the first target image.
If a filtering evaluation value exists, determining a first super-score algorithm corresponding to the filtering evaluation value as a first target algorithm;
And performing super-resolution processing on the first sub-image based on a first target algorithm to obtain a first target image.
Specifically, if there is exactly one filtering evaluation value, no further selection is needed, and the first super-division algorithm corresponding to that filtering evaluation value is determined as the first target algorithm. Super-resolution processing is then performed on the first sub-image with the first target algorithm to obtain the first target image.
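The selection-with-fallback flow of the cases above (threshold filtering, recursive re-segmentation down to a minimum size, then falling back to the smallest evaluation value) can be sketched as follows; `evaluate` stands in for the patent's scoring step, and the quartering split, minimum size, and names are illustrative assumptions:

```python
# Hypothetical control-flow sketch: if no evaluation value passes the
# threshold, split the tile and retry, stopping at a minimum size where
# the algorithm with the smallest evaluation value wins.
import numpy as np

MIN_SIZE = 16  # assumed minimum tile size

def split4(tile: np.ndarray) -> list:
    """Quarter a tile (a stand-in for the preset-number split)."""
    h, w = tile.shape[:2]
    return [tile[:h // 2, :w // 2], tile[:h // 2, w // 2:],
            tile[h // 2:, :w // 2], tile[h // 2:, w // 2:]]

def choose_algorithm(tile: np.ndarray, evaluate, threshold: float) -> list:
    """evaluate(tile) -> one evaluation value per algorithm.
    Returns a list of (tile, chosen_algorithm_index) pairs."""
    values = evaluate(tile)
    passing = [v for v in values if v <= threshold]
    if passing or tile.shape[0] <= MIN_SIZE:
        # Something passed, or we cannot split further: take the
        # algorithm with the smallest evaluation value.
        return [(tile, int(np.argmin(values)))]
    out = []
    for sub in split4(tile):
        out.extend(choose_algorithm(sub, evaluate, threshold))
    return out
```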
S130, performing super-resolution processing on the second partial image based on the second super-resolution algorithm library, and determining a second target image corresponding to the second partial image.
The second super-resolution algorithm library stores a plurality of algorithms for performing super-resolution processing on the image of the second portion, that is, the remaining portion except the portion of interest, that is, a plurality of second super-resolution algorithms. The second target image is a result image of a suitable super resolution corresponding to the second partial image.
Specifically, each second super-resolution algorithm in the second super-resolution algorithm library is used for performing super-resolution processing on the second partial image respectively, so that a super-resolution result image corresponding to each second super-resolution algorithm is obtained. Further, effect analysis is performed on these super-resolution result images, and one of the super-resolution result images having the best super-resolution effect is determined as a second target image corresponding to the second partial image.
On the basis of the above example, the super-resolution processing may be performed on the second partial image based on the second super-resolution algorithm library to determine a second target image corresponding to the second partial image in the following manner:
Dividing the second partial image into a plurality of second sub-images based on a preset size;
For each second sub-image, respectively carrying out super-resolution processing on the second sub-images based on each second super-division algorithm in a second super-division algorithm library to obtain second super-division images corresponding to each second super-division algorithm;
Determining a sub-target image corresponding to the second sub-image according to each second super-resolution image;
And splicing the sub-target images to obtain a second target image corresponding to the second partial image.
The second sub-images are the sub-images obtained by dividing the second partial image. A second super-division image is an image obtained by performing super-resolution processing on a second sub-image with one of the second super-division algorithms. It will be appreciated that each second sub-image corresponds to a plurality of second super-division images, their number matching the number of second super-division algorithms. The sub-target image is the suitably super-resolved result image corresponding to the second sub-image, for example, the one of that second sub-image's second super-division images with the best super-division effect.
Specifically, the second partial image is segmented according to a preset size, and a plurality of second sub-images are obtained. And carrying out the same processing on each second sub-image, and respectively carrying out super-resolution processing on the second sub-images by using each second super-division algorithm in a second super-division algorithm library to obtain second super-division images corresponding to each second super-division algorithm. Further, the second super-resolution images are subjected to effect analysis, and one of the super-resolution images having the best super-resolution effect is determined as a sub-target image corresponding to the second sub-image. And splicing the sub-target images according to the positions to obtain a second target image corresponding to the second partial image.
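The final stitching step above can be sketched as placing each processed tile back at its grid position. This assumes the tiles were produced row-major at equal sizes, mirroring the earlier split; the function name is illustrative:

```python
# Hypothetical sketch: reassemble row-major tiles of equal size into
# one image by concatenating each row of tiles, then stacking the rows.
import numpy as np

def stitch_tiles(tiles: list, rows: int, cols: int) -> np.ndarray:
    bands = [np.concatenate(tiles[r * cols:(r + 1) * cols], axis=1)
             for r in range(rows)]
    return np.concatenate(bands, axis=0)
```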
It should be noted that the super-resolution algorithms in the first super-resolution algorithm library and the second super-resolution algorithm library may be completely different or may be partially the same, and the super-resolution processing for the second partial image is similar to the super-resolution processing for the first partial image, which is not described herein.
Illustratively, after the background picture (the second partial image) and the person picture (the first partial image) are segmented, super-division algorithm selection is performed for each using a background-picture super-division algorithm library (the second super-division algorithm library) and a person-picture super-division algorithm library (the first super-division algorithm library), independently constructed from prior-art methods, and the most suitable super-division algorithm is selected. For example, each super-division algorithm library contains five open-source super-division algorithms; both libraries share two methods, CodeFormer and GFPGAN, while the remaining three methods differ between the background-picture and person-picture libraries.
According to the characteristics of each super-division algorithm (the first super-division algorithms and the second super-division algorithms), different weights (algorithm weights) are assigned, and the algorithm weights within each library take different values (for example, 1 to 5, where a smaller value is better); for background pictures and person pictures alike, a super-division algorithm with a better effect is given a lower weight. Taking the person picture as an example, when selecting a super-division algorithm, the selection from the first super-division algorithm library is performed only on the central area (the center image, for example of size 32×32) of each small picture (first sub-image), which improves selection efficiency. For each first super-division algorithm, the super-division result of the central area (the first super-division image) and the original central area (the center image) are fed into a similarity comparison module, which computes similarity comparison results (similarity values) using several similarity measures (for example, five: Euclidean distance, Manhattan distance, cosine similarity, Pearson correlation coefficient, and Jaccard coefficient), where a smaller result indicates a better effect. Each similarity comparison result is linearly normalized, and the normalized results are averaged. The average obtained for each first super-division algorithm is multiplied by its preset algorithm weight to obtain that algorithm's selection result (first evaluation value). If all the selection results exceed a preset threshold (the preset evaluation threshold, e.g. 2), the small picture is segmented further: for example, the second segmentation splits the 64×64 small picture into 16 pieces (the preset number) of size 16×16, with central areas of size 8×8, and the algorithm-selection and similarity-comparison procedure above is repeated until the result of some first super-division algorithm falls below the threshold or no further segmentation is possible. The first super-division algorithm with the smallest result is then used to super-divide the small pictures, which are stitched back into the person picture after super-division. The background picture is processed in the same way, and finally the person and the background are stitched back into the complete picture (the target super-resolution image). Through continued segmentation and iteration, the optimal super-division algorithm can be found, and performing algorithm selection and similarity comparison only on the central area of each small picture greatly improves operating efficiency.
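The weighted selection step above can be sketched as follows. This is a minimal sketch, not the patent's implementation: it uses only three of the five named measures (Euclidean distance, Manhattan distance, and cosine distance), interprets "linear normalization then averaging" as min-max normalization across the measures, and `superdivide_fn` stands in for a real super-division model — all assumptions.

```python
import numpy as np

# Illustrative similarity measures (smaller means the super-divided centre
# is closer to the original centre image).
def euclidean(a, b):
    return float(np.linalg.norm(a - b))

def manhattan(a, b):
    return float(np.abs(a - b).sum())

def cosine_dist(a, b):
    a, b = a.ravel(), b.ravel()
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

METRICS = [euclidean, manhattan, cosine_dist]

def evaluate(center, sr_center, weight):
    """First evaluation value: min-max-normalized mean of the similarity
    scores, multiplied by the algorithm weight (smaller is better)."""
    scores = np.array([m(center.astype(float), sr_center.astype(float))
                       for m in METRICS])
    span = scores.max() - scores.min()
    norm = (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)
    return float(norm.mean()) * weight

def select_algorithm(center, algorithms, threshold=2.0):
    """algorithms: list of (name, weight, superdivide_fn). Returns the name
    with the lowest evaluation value, or None if every value exceeds the
    threshold (the caller would then split the sub-image further and retry)."""
    evals = {name: evaluate(center, fn(center), w) for name, w, fn in algorithms}
    best = min(evals, key=evals.get)
    return best if evals[best] <= threshold else None
```

With a perfect "algorithm" (identity) and a degraded one, the identity variant wins; a very strict threshold forces the None branch that triggers further segmentation.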
And S140, splicing each first target image and the second target image to obtain a target super-resolution image corresponding to the image to be processed.
The target super-resolution image is a super-resolution result corresponding to the image to be processed.
Specifically, the first target images and the second target image are stitched together at their corresponding positions; for images in an edge area, the filled padding is removed before stitching, and the stitched result is taken as the target super-resolution image corresponding to the image to be processed.
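The positional reassembly can be sketched as follows; row-major tile order, uniform tile size, and padding confined to the outer border are all assumptions of this sketch, not requirements stated by the patent.

```python
import numpy as np

def stitch_tiles(tiles, grid, pad=0):
    """Reassemble super-divided tiles into the full picture.

    tiles: list of equally sized 2-D arrays in row-major order.
    grid:  (rows, cols) layout of the tiles.
    pad:   pixels of fill to crop from the outer border of the result
           (the removed "filling part" for edge-area tiles).
    """
    rows, cols = grid
    row_imgs = [np.hstack(tiles[r * cols:(r + 1) * cols]) for r in range(rows)]
    img = np.vstack(row_imgs)
    if pad:
        img = img[pad:-pad, pad:-pad]
    return img
```

For example, four 4×4 tiles on a 2×2 grid stitch into an 8×8 picture, and `pad=1` trims the outer filled ring to 6×6.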
Alternatively, if seams are visible after stitching, they may be eliminated by median filtering, wavelet transform, or similar methods, which are not described further here.
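A toy version of the median-filtering cleanup — a 1-D median filter applied only near the seam column. This is an illustration of the idea; a practical pipeline would more likely apply `cv2.medianBlur` to the seam region instead.

```python
import numpy as np

def smooth_seam(img, seam_col, half_width=2, k=3):
    """Median-filter a band of columns around a vertical seam.

    Each pixel in the band is replaced by the median of the 2*k+1
    horizontally neighbouring pixels from the ORIGINAL image, so a thin
    bright seam line is removed while broad regions are preserved."""
    out = img.astype(float).copy()
    h, w = img.shape[:2]
    for col in range(max(0, seam_col - half_width),
                     min(w, seam_col + half_width + 1)):
        lo, hi = max(0, col - k), min(w, col + k + 1)
        for row in range(h):
            out[row, col] = np.median(img[row, lo:hi])
    return out
```

A one-pixel-wide bright seam disappears entirely, which is exactly the impulse-like artifact a median filter handles well.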
Through iterative selection and continued segmentation, the best super-division effect can be obtained for each picture, so that stitching the pictures yields a better overall super-division result, while the continued segmentation greatly improves operating efficiency.
After the first target images and the second target image are stitched to obtain the target super-division image corresponding to the image to be processed, consecutive target super-division images can be combined into an image sequence, and the facial expressions of the persons in the sequence can be enriched according to the audio corresponding to the sequence, making them more vivid:
constructing an initial image sequence according to each target superdivision image;
Performing facial recognition and feature extraction on the initial image sequence, determining facial features, performing semantic recognition and emotion analysis on target audio corresponding to the initial image sequence, and determining an expression label;
Generating expression features according to the facial features and the expression labels, and rendering the initial image sequence according to the expression features and the facial features to obtain a target image sequence;
And generating a target video according to the target image sequence and the target audio.
Wherein the initial image sequence is the continuous image sequence composed of the target super-division images. The facial features are features describing the faces of the persons in the initial image sequence. The target audio is the audio corresponding to the initial image sequence, i.e., the audio associated with the person portion of the sequence. The expression label is the expression category corresponding to the emotion obtained by analyzing the target audio. The expression features are the facial-feature changes adapted to different expressions. The target image sequence is the image sequence after expression rendering has been applied to the person portion. The target video is the video synthesized from the target image sequence and the target audio, in which the facial expressions are more vivid.
Specifically, the target super-division images are determined and an initial image sequence is constructed in order. Facial recognition and feature extraction are performed on the initial image sequence to determine the facial features, and semantic recognition and emotion analysis are performed on the target audio corresponding to the sequence to determine the expression label; the analysis may combine semantics, voice intonation, and the facial features, and is not limited here. Further, the facial features and the expression label are fused to obtain the expression features, and the expression features and facial features are used to render the initial image sequence, obtaining the target image sequence. The target image sequence and the target audio are then synthesized to obtain the target video.
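The expression-synthesis pipeline can be outlined as follows. Every name here is an illustrative stand-in, not an API from the patent: the keyword-based "emotion analysis", the feature dictionaries, and the annotate-only "renderer" would all be replaced by real models (a face detector, a speech-emotion classifier, a face renderer) at the marked points.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    # Stand-in for extracted facial features of one frame.
    face_features: dict

def analyze_audio(audio_segment):
    # Stand-in for semantic recognition + emotion analysis -> expression label.
    return "happy" if "!" in audio_segment else "neutral"

def make_expression_features(face_features, label):
    # Fuse the facial features with the expression label (here: annotate only).
    return {**face_features, "expression": label}

def render_sequence(frames, audio_segments):
    """Per frame: derive an expression label from the aligned audio segment,
    fuse it with the facial features, and hand both to the renderer."""
    target = []
    for frame, seg in zip(frames, audio_segments):
        label = analyze_audio(seg)
        feats = make_expression_features(frame.face_features, label)
        target.append(feats)  # a real face-synthesis module would redraw here
    return target
```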
In this manner, multiple super-division methods are applied to the image and the best result is selected through similarity comparison, so that high-definition video synthesis can be performed and the fidelity of the super-divided video is improved. Through expression synthesis, high-definition videos with varied expressions are synthesized for digital humans, greatly improving their liveliness.
For example, image frames containing a person, together with background video, can be designed in which various musical scales, words, and the like are read aloud; after processing by the image processing method, the target super-division images corresponding to the scales, words, and so on in the target audio can be determined from the target audio, so that the synthesized initial image sequence is consistent with the target audio in mouth shape.
For example, an original video in which the person shows no vivid expression change and has low resolution can be decomposed into multiple images to be processed and the target audio for super-division. The resulting target super-division images are input to a video synthesis module: first, the size and image data of each target super-division image are obtained and stored into an array in order. If every target super-division image has the same size, the read is considered correct and the images can be treated as content of the same video. A video synthesis class is then initialized, with parameters chosen according to the operating system and video format and the frame rate set equal to that of the original video, and each frame of target super-division image is written into the video synthesis class to generate a clear initial image sequence. Finally, the target audio corresponding to the initial image sequence is synthesized with the sequence, ensuring audio-video synchronization, to obtain the complete high-definition video.
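The size-consistency check before synthesis might look like this; the OpenCV `VideoWriter` calls in the trailing comment show the usual way to write the frames and are an assumption of this sketch, not the patent's exact code.

```python
import numpy as np

def build_frame_array(frames):
    """Collect frames in order and verify they all share one size — the
    precondition described above for the frames belonging to one video.

    Returns (height, width) on success; raises ValueError otherwise."""
    sizes = {f.shape[:2] for f in frames}
    if len(sizes) != 1:
        raise ValueError(f"inconsistent frame sizes: {sizes}")
    return sizes.pop()

# Writing the video itself would typically use OpenCV (assumed available):
#   h, w = build_frame_array(frames)
#   writer = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*"mp4v"),
#                            original_fps, (w, h))   # keep the original frame rate
#   for f in frames:
#       writer.write(f)
#   writer.release()
```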
Furthermore, the high-definition video can serve as input to the expression synthesis module: first the pictures (the initial image sequence) and the target audio are separated, semantic recognition is performed on the target audio, and its emotional tendency is inferred to generate the expression label. The faces in the initial image sequence are then recognized and their features extracted; the extracted facial features and the expression label are input to an expression-feature generation module to obtain the expression features, the expression features and facial features are input together to a face synthesis module, and face rendering is performed on the initial image sequence, finally yielding the expression synthesis result, i.e., the target image sequence. The target image sequence and the target audio are synthesized to obtain the target video, a complete high-definition video with expression changes.
For example, when the original video is decomposed into multiple images to be processed and the target audio for super-division, whether the decomposition is correct can be judged as follows. The original video sequence is first read with the OpenCV library to obtain its total frame count. Because the original video sometimes contains frames that are damaged or that OpenCV cannot decode, such frames are skipped when the total frame count is read directly from the OpenCV property, so the reported count can differ from the actual count. To avoid this problem, a traversal step is added: a first frame count is obtained through the OpenCV property, a second frame count is obtained by traversing the original video frame by frame, and the two counts are reconciled to determine the actual frame count, so that every frame of the original video is accounted for. The video duration is then computed from the actual frame count and the video frame rate, and the image corresponding to each second of video, i.e., an image to be processed, is obtained according to that duration.
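The reconciliation and duration computation reduce to a couple of small helpers. Treating the traversed count as authoritative on mismatch is one reading of the passage (since traversal counts only decodable frames); the commented OpenCV calls show how the two counts would be obtained in practice.

```python
def actual_frame_count(reported, traversed):
    """Reconcile the property-reported count with the traversal count.
    The reported count can include damaged frames that are skipped on
    read, so on disagreement the frame-by-frame traversal is trusted."""
    if reported != traversed:
        return traversed
    return reported

def video_duration_seconds(frame_count, fps):
    """Duration from the actual frame count and the frame rate."""
    return frame_count / fps

# Obtaining the two counts with OpenCV (assumed usage):
#   cap = cv2.VideoCapture(path)
#   reported = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))   # first frame count
#   traversed = 0                                       # second frame count
#   while cap.read()[0]:
#       traversed += 1
#   cap.release()
```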
The invention has the following technical effects. The image to be processed is segmented to determine a first partial image and a second partial image, and the first partial image is divided into a plurality of first sub-images based on a preset size, refining the part of interest. Each first sub-image then undergoes super-resolution processing by each first super-division algorithm in a first super-division algorithm library to obtain the corresponding first super-division images, and the first target image for each first sub-image is determined from them, so that the most suitable super-division algorithm is selected for each refined first sub-image, improving the universality of super-division. The second partial image undergoes super-resolution processing based on a second super-division algorithm library to determine a second target image, selecting a suitable super-division algorithm for the non-focal part. Finally, the first target images and the second target image are stitched to obtain the target super-resolution image corresponding to the image to be processed, combining per-region algorithm selection into an overall high-quality super-division result.
Embodiment two:
Fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 2, the electronic device 200 includes one or more processors 201 and memory 202.
The processor 201 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 200 to perform desired functions.
Memory 202 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile and/or non-volatile memory. Volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. Non-volatile memory may include, for example, Read-Only Memory (ROM), hard disks, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 201 may execute the program instructions to implement the image processing method of any embodiment of the invention described above and/or other desired functions. Various content such as initial parameters, thresholds, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 200 may further include an input device 203 and an output device 204, which are interconnected by a bus system and/or another form of connection mechanism (not shown). The input device 203 may include, for example, a keyboard, a mouse, and the like. The output device 204 may output various information to the outside, including warning prompt information and the like, and may include, for example, a display, speakers, a printer, and a connected communication network and remote output apparatus.
Of course, for simplicity, only some of the components of the electronic device 200 that are relevant to the present invention are shown in fig. 2; components such as buses and input/output interfaces are omitted. In addition, the electronic device 200 may include any other suitable components depending on the particular application.
In addition to the methods and apparatus described above, embodiments of the invention may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps of the image processing method provided by any of the embodiments of the invention.
Embodiment III:
The computer program product may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, as well as conventional procedural programming languages such as the "C" programming language or similar. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present invention may also be a computer-readable storage medium, on which computer program instructions are stored, which, when being executed by a processor, cause the processor to perform the steps of the image processing method provided by any of the embodiments of the present invention.
A computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, Random Access Memory (RAM), Read-Only Memory (ROM), Erasable Programmable Read-Only Memory (EPROM or flash memory), optical fiber, portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application. As used in this specification, the terms "a," "an," and "the" are intended to cover the plural forms as well, unless the context clearly dictates otherwise. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements includes not only those elements but may include other elements not expressly listed or inherent to such process, method, or apparatus. Without further limitation, an element preceded by "comprising a …" does not exclude the presence of other like elements in a process, method, or apparatus that includes the element.
It should also be noted that the positional or positional relationship indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the positional or positional relationship shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or element in question must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention. Unless specifically stated or limited otherwise, the terms "mounted," "connected," and the like are to be construed broadly and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the essence of the corresponding technical solutions from the technical solutions of the embodiments of the present invention.

Claims (10)

1. An image processing method, comprising:
dividing an image to be processed, determining a first partial image and a second partial image, and dividing the first partial image into a plurality of first sub-images based on a preset size;
Performing super resolution processing on each first sub-image based on each first super-division algorithm in a first super-division algorithm library to obtain first super-division images corresponding to each first super-division algorithm, and determining a first target image corresponding to each first sub-image according to each first super-division image;
performing super-resolution processing on the second partial image based on a second super-resolution algorithm library, and determining a second target image corresponding to the second partial image;
and splicing the first target images and the second target images to obtain target superresolution images corresponding to the images to be processed.
2. The method according to claim 1, wherein the performing super-resolution processing on each of the first sub-images based on each of the first super-division algorithms in the first super-division algorithm library to obtain a first super-division image corresponding to each of the first super-division algorithms includes:
For each first sub-image, determining a center image of the first sub-image according to the first sub-image;
And respectively carrying out super-resolution processing on the center image based on each first super-resolution algorithm in a first super-resolution algorithm library, and determining each first super-resolution image corresponding to each first super-resolution algorithm according to each processing result corresponding to the center image.
3. The method of claim 2, wherein determining a first target image corresponding to each of the first sub-images from each of the first super-resolution images comprises:
For each first super-division algorithm, determining at least two similarity values according to a first super-division image corresponding to the first super-division algorithm and the center image;
Determining a first evaluation value corresponding to the first super-division algorithm according to the at least two similarity values and the algorithm weight corresponding to the first super-division algorithm;
And determining a first target algorithm according to first evaluation values corresponding to the first super-resolution algorithms, and performing super-resolution processing on the first sub-image according to the first target algorithm to obtain a first target image.
4. The method of claim 3, wherein determining a first target algorithm according to the first evaluation value corresponding to each first super-resolution algorithm, and performing super-resolution processing on the first sub-image according to the first target algorithm, to obtain a first target image, includes:
judging, for each first super-division algorithm, whether the first evaluation value corresponding to the first super-division algorithm satisfies a preset evaluation threshold, and if so, determining the first evaluation value as a filtering evaluation value;
if at least two filtering evaluation values exist, taking the first super-division algorithm corresponding to each filtering evaluation value as a first standby algorithm, and performing super-resolution processing on the first sub-image based on each first standby algorithm to obtain a first standby image corresponding to each first standby algorithm;
for each first standby image, determining a first similarity according to the first standby image and the first sub-image;
and determining a first target algorithm according to the first similarity corresponding to each first standby algorithm, and determining the first standby image corresponding to the first target algorithm as the first target image.
5. The method according to claim 4, wherein after determining, for each of the first super-division algorithms, whether a first evaluation value corresponding to the first super-division algorithm satisfies a preset evaluation threshold, if so, determining the first evaluation value as a filtered evaluation value, further comprises:
if the filtering evaluation value does not exist, judging whether the image size of the first sub-image is larger than the minimum size;
if yes, segmenting the first sub-image based on a preset number, updating each first sub-image, and returning to perform the operation of performing super-resolution processing on each first sub-image based on each first super-division algorithm in the first super-division algorithm library to obtain the first super-division image corresponding to each first super-division algorithm;
If not, determining the minimum value in each first evaluation value as a first target value, and determining a first super-division algorithm corresponding to the first target value as a first target algorithm;
And performing super-resolution processing on the first sub-image according to the first target algorithm to obtain a first target image.
6. The method according to claim 4, wherein after determining, for each of the first super-division algorithms, whether a first evaluation value corresponding to the first super-division algorithm satisfies a preset evaluation threshold, if so, determining the first evaluation value as a filtered evaluation value, further comprises:
if exactly one filtering evaluation value exists, determining the first super-division algorithm corresponding to the filtering evaluation value as the first target algorithm;
and performing super-resolution processing on the first sub-image based on the first target algorithm to obtain a first target image.
7. The method according to claim 1, wherein the performing super resolution processing on the second partial image based on the second super-resolution algorithm library to determine a second target image corresponding to the second partial image includes:
dividing the second partial image into a plurality of second sub-images based on a preset size;
for each second sub-image, respectively carrying out super-resolution processing on the second sub-image based on each second super-division algorithm in a second super-division algorithm library to obtain a second super-division image corresponding to each second super-division algorithm;
Determining a sub-target image corresponding to the second sub-image according to each second super-resolution image;
and splicing the sub-target images to obtain a second target image corresponding to the second partial image.
8. The method according to any one of claims 1 to 7, further comprising, after the stitching each of the first target image and the second target image to obtain a target super-resolution image corresponding to the image to be processed:
constructing an initial image sequence according to each target superdivision image;
Performing facial recognition and feature extraction on the initial image sequence, determining facial features, performing semantic recognition and emotion analysis on target audio corresponding to the initial image sequence, and determining an expression label;
Generating expression features according to the facial features and the expression labels, and rendering the initial image sequence according to the expression features and the facial features to obtain a target image sequence;
and generating a target video according to the target image sequence and the target audio.
9. An electronic device, the electronic device comprising:
A processor and a memory;
the processor is configured to execute the steps of the image processing method according to any one of claims 1 to 8 by calling a program or instructions stored in the memory.
10. A computer-readable storage medium storing a program or instructions that cause a computer to execute the steps of the image processing method according to any one of claims 1 to 8.
CN202410534365.6A 2024-04-30 2024-04-30 Image processing method, apparatus and storage medium Active CN118134765B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410534365.6A CN118134765B (en) 2024-04-30 2024-04-30 Image processing method, apparatus and storage medium

Publications (2)

Publication Number Publication Date
CN118134765A true CN118134765A (en) 2024-06-04
CN118134765B CN118134765B (en) 2024-07-12

Family

ID=91235961

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108665509A (en) * 2018-05-10 2018-10-16 广东工业大学 A kind of ultra-resolution ratio reconstructing method, device, equipment and readable storage medium storing program for executing
WO2021083059A1 (en) * 2019-10-29 2021-05-06 Oppo广东移动通信有限公司 Image super-resolution reconstruction method, image super-resolution reconstruction apparatus, and electronic device
CN113240687A (en) * 2021-05-17 2021-08-10 Oppo广东移动通信有限公司 Image processing method, image processing device, electronic equipment and readable storage medium
CN113596576A (en) * 2021-07-21 2021-11-02 杭州网易智企科技有限公司 Video super-resolution method and device
CN116739901A (en) * 2023-06-21 2023-09-12 南京炫佳网络科技有限公司 Video super-processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant