WO2024078308A1 - Image optimization method, apparatus, electronic device, medium, and program product - Google Patents

Image optimization method, apparatus, electronic device, medium, and program product

Info

Publication number
WO2024078308A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
optimized
generation network
network
Prior art date
Application number
PCT/CN2023/120931
Other languages
English (en)
French (fr)
Inventor
林楚铭
王烟波
罗栋豪
邰颖
张志忠
谢源
汪铖杰
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP23861677.5A (EP4386657A1)
Priority to US18/421,016 (US20240161245A1)
Publication of WO2024078308A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Definitions

  • the present application relates to the field of image processing technology, and in particular to image optimization.
  • the embodiments of the present application provide an image optimization method, device, electronic device, medium and program product, which can improve the image optimization effect.
  • An embodiment of the present application provides an image optimization method, including: acquiring an image generation network, an image to be optimized, and a plurality of preset image features; selecting a target feature from the plurality of preset image features, wherein the target feature and the image to be optimized satisfy a preset similarity condition; inputting the target feature and the initial offset parameter into the image generation network, adjusting the initial offset parameter according to a difference determined between an output of the image generation network and the image to be optimized, and obtaining a target offset parameter; and inputting the target feature and the target offset parameter into the image generation network to generate an optimized image.
  • the embodiment of the present application also provides an image optimization device, including: an acquisition unit, used to acquire an image generation network, an image to be optimized, and a plurality of preset image features; a determination unit, used to select a target feature from the plurality of preset image features, wherein the target feature and the image to be optimized satisfy a preset similarity condition; an adjustment unit, used to input the target feature and the initial offset parameter into the image generation network, and adjust the initial offset parameter according to the difference determined between the output of the image generation network and the image to be optimized to obtain a target offset parameter; and a generation unit, used to input the target feature and the target offset parameter into the image generation network to generate an optimized image.
  • An embodiment of the present application also provides an electronic device, including a processor and a memory, wherein the memory stores multiple instructions; the processor loads instructions from the memory to execute the steps in any one of the image optimization methods provided in the embodiments of the present application.
  • An embodiment of the present application also provides a computer-readable storage medium, which stores a plurality of instructions, and the instructions are suitable for a processor to load to execute the steps in any one of the image optimization methods provided in the embodiments of the present application.
  • An embodiment of the present application also provides a computer program product, including a computer program, which, when executed by a processor, implements the steps in any one of the image optimization methods provided in the embodiments of the present application.
  • The embodiment of the present application can obtain an image generation network, an image to be optimized, and a plurality of preset image features; select a target feature from the plurality of preset image features, where the target feature and the image to be optimized meet a preset similarity condition; input the target feature and the initial offset parameter into the image generation network, adjust the initial offset parameter according to the difference between the output of the image generation network and the image to be optimized, and obtain a target offset parameter; and input the target feature and the target offset parameter into the image generation network to generate an optimized image.
  • a target feature corresponding to the image to be optimized is selected from a plurality of preset image features.
  • the target feature can be used as a starting point and combined with a target offset parameter to determine the feature used to generate the optimized image, so as to generate the optimized image.
  • the correlation between the features can be reduced, and the control ability of the visual features in the image can be improved to improve the optimization effect of the image; by adjusting the initial offset parameter, the input vector used to generate the optimized image is brought closer to the adjustment target, the authenticity of the optimized image is increased, and the optimization effect of the image is improved.
  • the target feature and the image to be optimized meet the preset similarity condition, which can reduce the distance between the target feature and the feature of the optimized image, reduce the difficulty of adjusting the initial offset parameter, and improve the image optimization efficiency.
  • FIG. 1a is a schematic diagram of a scene of an image optimization method provided in an embodiment of the present application.
  • FIG. 1b is a schematic flow chart of an image optimization method provided in an embodiment of the present application.
  • FIG. 1c is a schematic diagram of inversion search performed using different methods.
  • FIG. 1d is a schematic diagram of a process for adjusting initial offset parameters provided in an embodiment of the present application.
  • FIG. 1e is a schematic diagram of a process for adjusting network parameters of an image generation network provided in an embodiment of the present application.
  • FIG. 2a is a schematic diagram of the structure of a StyleGAN-XL network provided in an embodiment of the present application.
  • FIG. 2b is a schematic flow chart of an image optimization method provided by another embodiment of the present application.
  • FIG. 2c is a schematic diagram of an iterative training process provided by an embodiment of the present application.
  • FIG. 2d is a schematic diagram of optimized images generated by different optimization methods provided in an embodiment of the present application.
  • FIG. 2e is a schematic diagram of comparison results of different optimization methods provided in embodiments of the present application on different repair tasks and different indicators.
  • FIG. 3 is a schematic diagram of the structure of an image optimization device provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the structure of a computer device provided in an embodiment of the present application.
  • Embodiments of the present application provide an image optimization method, device, electronic device, medium, and program product.
  • the image optimization device can be integrated into an electronic device, which can be a terminal, a server, or other devices.
  • the terminal can be a mobile phone, a tablet computer, a smart Bluetooth device, a notebook computer, or a personal computer (PC), etc.
  • the server can be a single server or a server cluster composed of multiple servers.
  • the image optimization device may also be integrated into multiple electronic devices.
  • the image optimization device may be integrated into multiple servers, and the image optimization method of the present application may be implemented by multiple servers.
  • the server may also be implemented in the form of a terminal.
  • the image optimization method can be implemented by an image optimization device, which can be integrated in a server.
  • the server can obtain an image generation network, an image to be optimized, and a plurality of preset image features; select a target feature from the plurality of preset image features, and the target feature and the image to be optimized meet a preset similarity condition; adjust the initial offset parameter according to the image generation network, the target feature, and the image to be optimized to obtain the target offset parameter; input the target feature and the target offset parameter into the image generation network to generate an optimized image.
  • the image generation process is to transform an input vector (input feature) into a high-quality image.
  • Image inversion is to infer (search) the corresponding input vector through an input image (not necessarily a high-quality image). This process is called inversion.
  • An image generation network can refer to a neural network that can be used to generate images.
  • the image generation network can decode the input vector to reconstruct a high-quality image corresponding to the input vector.
  • the input vector of a low-quality image can include random noise or conditional vectors. Therefore, a low-quality image can be used to infer the input vector under the image generation network through inversion technology, and then the image generation network can process the input vector to generate the corresponding high-quality image to achieve applications such as image restoration.
  • the image optimization method of the embodiment of the present application can obtain the input vector of the image generation network (i.e., the feature vector obtained by combining the target feature and the offset parameter) through inversion search of the image to be optimized and multiple preset image features, so as to generate the optimized image according to the input vector.
  • the image optimization method of the embodiment of the present application can be applied to the field of artificial intelligence based on computer vision and other technologies, and can be specifically applied to the fields of image super-resolution, image restoration, image enhancement, image editing, etc.
  • the embodiment of the present application can use the target feature of the image to be optimized as the starting point of the inversion search, obtain the target search result by adjusting the initial offset parameter search, and use the target search result as the input vector of the image generation network. Then, the input vector is input into the image generation network, and the image generation network generates a high-quality optimized image.
  • Artificial Intelligence is a technology that uses digital computers to simulate human perception of the environment, acquire knowledge, and use knowledge. This technology can enable machines to have functions similar to human perception, reasoning, and decision-making.
  • the basic technologies of artificial intelligence generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, mechatronics, and other technologies.
  • Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, as well as machine learning/deep learning, autonomous driving, smart transportation, and other major directions.
  • Computer Vision (CV) is a technology that uses computers in place of human eyes to identify and measure targets and to further process images.
  • Computer vision technology usually includes image generation, image recognition, image semantic understanding, image retrieval, virtual reality, augmented reality, simultaneous positioning and map construction, automatic driving, smart transportation and other technologies, as well as common biometric recognition technologies such as face recognition and fingerprint recognition.
  • image generation technologies such as image coloring and image stroke extraction.
  • Artificial intelligence technology has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, driverless cars, autonomous driving, drones, robots, smart medical care, smart customer service, Internet of Vehicles, and smart transportation. As the technology develops, artificial intelligence is expected to be applied in more fields and to play an increasingly important role.
  • an image optimization method involving artificial intelligence is provided, as shown in FIG1b , and the specific process of the image optimization method can be as follows:
  • the image to be optimized may refer to an image of low quality or an image that needs to be improved in quality.
  • the image to be optimized may suffer from noise, color loss, detail loss, low resolution, and other issues that result in low image quality.
  • the present application does not limit the type of image to be optimized.
  • the image to be optimized may include but is not limited to face images, animal images, building images, landscape images, etc.
  • the image generation network may refer to a neural network that can be used to generate images.
  • the image generation network may be one or more of a convolutional neural network (CNN), a variational autoencoder (VAE), a generative adversarial network (GAN), etc.
  • the image generation network may also be the generative network of a generative adversarial network.
  • A generative adversarial network (GAN) is a network framework composed of a generative network and a discriminative network. By inputting a Gaussian random vector into the generative network, a high-quality image can be generated. The generative network of a generative adversarial network can therefore be used as the image generation network in the embodiment of the present application to generate a corresponding image based on image features, etc.
  • the generative network of a generative adversarial network can include multiple convolutional layers.
  • the input vector, such as the w vector, can be converted through a mapping network into affine transformations and, together with random noise, input into each convolutional layer in the generative network.
  • the affine transformation can be used to control the style of the generated image
  • the random noise can be used to enrich the details of the generated image.
  • Each convolutional layer can adjust the style of the image according to the input affine transformation and adjust the details of the image through the input random noise.
  • the original image may be first degraded to obtain the image to be optimized, thereby reducing the feature dimension of the image while retaining effective information.
  • the method for obtaining the image to be optimized includes:
  • the image degradation process may refer to a process for reducing the image quality.
  • the image degradation process may include methods such as downsampling.
  • the downsampling process may include, but is not limited to, traversing the pixels with a for loop and keeping those in alternate rows and columns, or copying alternate rows and columns of the image matrix. Downsampling reduces the image feature dimension while retaining valid information, which helps avoid overfitting and reduces the amount of computation in the image optimization process.
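  • As a minimal sketch of the alternate-row-and-column downsampling described above, the following Python snippet (assuming NumPy and an image stored as a height x width x channels array) shows a slicing version and an explicit for-loop version. The function names and the `step` parameter are illustrative assumptions and do not appear in the application.

```python
import numpy as np

def downsample_alternate(image: np.ndarray, step: int = 2) -> np.ndarray:
    """Keep every `step`-th row and column by slicing (copying alternate rows/columns)."""
    return image[::step, ::step].copy()

def downsample_loop(image: np.ndarray, step: int = 2) -> np.ndarray:
    """Same result, written as an explicit loop over alternate rows and columns."""
    rows = range(0, image.shape[0], step)
    cols = range(0, image.shape[1], step)
    out = np.empty((len(rows), len(cols)) + image.shape[2:], dtype=image.dtype)
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            out[i, j] = image[r, c]
    return out

# Hypothetical 6x8 RGB "original image" used only for illustration.
original = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)
degraded = downsample_alternate(original)  # plays the role of the image to be optimized
```

  • Both functions discard three quarters of the pixels when `step` is 2; the slicing form is usually preferable in practice because it avoids the Python-level loop.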
  • the optimized image, i.e., the high-resolution image, is generated by the method of the embodiment of the present application.
  • the optimized image has more accurate filling details, the color is closer to the real situation, and the texture details are richer.
  • the image feature can refer to the feature of the random variable.
  • the preset image feature is a feature that is irrelevant to the image to be optimized.
  • the variable may take various different values, which are uncertain and random, but the probability that these values fall within a certain range is certain. This kind of variable is called a random variable. Random variables can be discrete or continuous.
  • the feature vector of a random variable is randomly distributed in the feature space according to a certain statistical trajectory, and the feature vector is a point in the feature space, and the statistical trajectory can be determined by a probability distribution function.
  • the characteristics of the random variable can be obtained by a probability distribution function such as 0-1 distribution, binomial distribution, Poisson distribution, geometric distribution, uniform distribution, exponential distribution or Gaussian distribution, and used as the preset image features in the embodiments of the present application.
  • a feature vector of the random variable can be generated by a random number generator, that is, a plurality of preset image features are generated.
  • the random number generator can follow a given statistical distribution; for example, it can be a random number generator that obeys a Gaussian distribution.
  • the feature space can be a combination of n-dimensional features, and the feature vector of the random variable is a point in the feature space, and all the feature vectors with different values constitute the n-dimensional space. Therefore, the feature vector of the random variable can also be transformed from one feature space to another feature space, so that the transformed features can be used as the preset image features in the embodiments of the present application.
  • the original features (feature vectors of random variables) obtained by the probability distribution function can be transformed into a preset space to improve the expressiveness of the features.
  • the method for obtaining multiple preset image features includes:
  • a plurality of original features are mapped into a preset feature space to obtain a plurality of preset image features.
  • the distribution feature may refer to the distribution mode of the random variable.
  • the distribution feature type may be 0-1 distribution, binomial distribution, Poisson distribution, geometric distribution, uniform distribution, exponential distribution or Gaussian distribution, etc.
  • the feature vector of the random variable in the initial feature space may be used as the original feature, and the initial feature space may refer to the feature space formed by the feature vector of the random variable.
  • the preset feature space may refer to a feature space set according to an actual application scenario.
  • the preset feature space may be a W space or the like.
  • multiple z vectors (i.e., multiple original features) can be generated by a random number generator that obeys a Gaussian distribution, and the z vectors are transformed from the Z space to the W space to obtain multiple w vectors (i.e., multiple preset image features).
  • the preset feature space is W space, which is a subset of the image feature space, and the relationship between the vectors therein is more linear.
  • the Z space can be obtained by Gaussian distribution sampling, and the Z space is a Gaussian distribution space.
  • the Z space can be transformed to obtain the W space.
  • the w vector of the W space can be passed back to the image generation network to obtain multiple control vectors, so that different elements of the control vector can control different visual features to control the style of the generated image.
  • the style of the image generated by the z vector is usually relatively fixed, but the style of the generated image can be changed by adjusting the w vector.
  • the image can be adjusted from style A to style B.
  • the style of the image generated by the w vector can be gradually changed to be similar to the style of the image to be optimized, thereby improving the optimization effect of the image.
  • the mapping can be performed through a mapping network, which can include multiple fully connected layers. After the multiple original features are input into the mapping network, multiple preset image features can be obtained after being processed by the multiple fully connected layers.
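  • M z vectors $\{z_i\}_{i=1}^{M}$ can be sampled in the Z space, $z_i \sim \mathcal{N}$, where $\mathcal{N}$ is a Gaussian distribution; the M z vectors are passed through the mapping network to obtain M w vectors $\{w_i\}_{i=1}^{M}$.
  • the M w vectors constitute the W space, where $\mathrm{Mapping}(\cdot)$ represents the processing of the mapping network.
  • the processing of the mapping network can be characterized as $\{w_i\}_{i=1}^{M} = \mathrm{Mapping}(\{z_i\}_{i=1}^{M}, c)$, where $c$ is the specified category and $\{w_i\}_{i=1}^{M}$ are the M w vectors.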
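  • The sampling and mapping steps can be sketched as follows. This is only an illustration under stated assumptions: the mapping network here is a stack of untrained fully connected layers with random weights and a leaky ReLU activation, the dimensions are arbitrary, and the category input c is omitted. The names `make_mapping_network` and `mapping` are hypothetical.

```python
import numpy as np

def make_mapping_network(dim: int, num_layers: int, rng: np.random.Generator):
    """Random (untrained) weights for a stack of fully connected layers; illustration only."""
    return [(rng.standard_normal((dim, dim)) / np.sqrt(dim), np.zeros(dim))
            for _ in range(num_layers)]

def mapping(z: np.ndarray, layers, negative_slope: float = 0.2) -> np.ndarray:
    """Map z vectors (rows of `z`) from the Z space to w vectors in the W space."""
    w = z
    for weight, bias in layers:
        w = w @ weight + bias                       # fully connected layer
        w = np.where(w > 0, w, negative_slope * w)  # leaky ReLU (an assumed activation)
    return w

rng = np.random.default_rng(0)
M, dim = 1000, 16                               # assumed sizes
z_vectors = rng.standard_normal((M, dim))       # M original features from a Gaussian generator
layers = make_mapping_network(dim, num_layers=4, rng=rng)
w_vectors = mapping(z_vectors, layers)          # M preset image features
```

  • In a real system the mapping network would be the trained one that belongs to the image generation network, so that the resulting w vectors are valid inputs for it.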
  • the implicit feature i.e., the target feature
  • the target feature and the image to be optimized meet the preset similarity condition, which can shorten the distance between the search starting point of the inversion search and the target search result, reduce the search difficulty, and improve the search efficiency.
  • the preset similarity condition may refer to a similarity condition set according to an actual application scenario.
  • the preset similarity condition may refer to the similarity with the image to be optimized being greater than a similarity threshold, or the similarity with the image to be optimized satisfying a preset ranking such as the highest similarity.
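  • The two forms of the preset similarity condition mentioned above (exceeding a threshold, or satisfying a preset ranking such as the highest similarity) can be sketched as below. The similarity scores, the threshold value and the function names are assumptions chosen for illustration.

```python
import numpy as np

def select_by_threshold(similarities: np.ndarray, threshold: float) -> np.ndarray:
    """Indices of candidate features whose similarity exceeds the threshold."""
    return np.flatnonzero(similarities > threshold)

def select_top_ranked(similarities: np.ndarray, top_k: int = 1) -> np.ndarray:
    """Indices of the `top_k` most similar candidate features, best first."""
    return np.argsort(similarities)[::-1][:top_k]

# Hypothetical similarity of each candidate feature to the image to be optimized.
similarities = np.array([0.12, 0.87, 0.55, 0.91])
above_threshold = select_by_threshold(similarities, threshold=0.8)
best = select_top_ranked(similarities, top_k=1)
```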
  • the target feature can be determined by judging whether each of the preset image features and the image to be optimized meet the preset similarity condition.
  • the target feature can also be determined by first screening a subset of image features from the plurality of preset image features through clustering or other methods, and then judging whether the features in this subset and the image to be optimized meet the preset similarity condition.
  • a plurality of preset image features may be classified by clustering processing, and a target feature may be determined from the classified central features, so as to reduce the number of features to be judged whether they meet the similarity condition and improve the efficiency of determining the target feature.
  • the target feature is selected from the plurality of preset image features, including:
  • Clustering is performed on a plurality of preset image features to obtain a plurality of feature clusters, wherein the feature clusters include a central feature;
  • clustering processing may refer to the process of dividing all preset image features into multiple classes composed of similar preset image features.
  • a feature cluster may be a class obtained by clustering, and a central feature may refer to the center of the class obtained by clustering, i.e., the centroid.
  • Clustering processing methods may include K-Means algorithm, DBSCAN algorithm, BIRCH algorithm, etc. The embodiments of the present application do not limit the parameters used in clustering processing, such as clustering radius, etc., and can be set according to the actual application scenario.
  • the K-Means algorithm can be used to cluster the M w vectors to obtain N feature clusters and the N centroids of the N classes, and the centroid with the highest similarity to the image to be optimized among the N centroids can be determined as the target feature.
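  • A minimal K-Means sketch for clustering the w vectors is given below. It is a plain Lloyd-style implementation written for illustration; the initialization, iteration count and the sizes M = 500 and N = 4 are assumptions, and a library implementation could be used instead.

```python
import numpy as np

def kmeans(vectors: np.ndarray, n_clusters: int, n_iters: int = 50, seed: int = 0):
    """Cluster row vectors; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with distinct randomly chosen input vectors.
    centroids = vectors[rng.choice(len(vectors), size=n_clusters, replace=False)].copy()
    for _ in range(n_iters):
        dists = np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for k in range(n_clusters):
            members = vectors[labels == k]
            if len(members) > 0:            # keep the old centroid if a cluster is empty
                new_centroids[k] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the returned centroids.
    dists = np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)

rng = np.random.default_rng(1)
w_vectors = rng.standard_normal((500, 8))    # stand-in for the M w vectors
centroids, labels = kmeans(w_vectors, n_clusters=4)
```

  • Each row of `centroids` plays the role of a central feature; only these N vectors, rather than all M w vectors, then need to be compared with the image to be optimized.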
  • the target feature can be determined by comparing the similarity between the image corresponding to the central feature and the image to be optimized to increase the accuracy of the determined target feature.
  • the target feature is selected from the central features of multiple feature clusters, including:
  • the central feature corresponding to the target image is determined as the target feature.
  • the N center features can be input into the image generation network, which processes them and outputs N center images. The image similarity between each center image and the image to be optimized $I_d$ is then calculated, and the center image with the highest image similarity is determined as the target image. The center feature that generates the target image is the target feature.
  • the feature corresponding to the central image that is closest to the feature distance between the image to be optimized can be determined as the target feature, so that the target feature is close to the adjustment target (target search result) to shorten the distance between the target feature and the adjustment target.
  • determining the target image from the central image includes:
  • the central image with the shortest feature distance to the image to be optimized is determined as the target image.
  • the feature distance can be Euclidean distance, cosine distance, absolute value distance, tangent distance, Minkowski distance or Mahalanobis distance, etc.
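  • Several of the listed feature distances can be written compactly as below. This is a sketch assuming NumPy vectors; "absolute value distance" is taken here to mean the L1 (Manhattan) distance, which is an interpretation rather than something the application states, and the function names are illustrative.

```python
import numpy as np

def euclidean(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.linalg.norm(a - b))

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """One minus the cosine similarity of the two vectors."""
    return float(1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def absolute_value_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Sum of absolute coordinate differences (L1 distance, assumed meaning)."""
    return float(np.abs(a - b).sum())

def minkowski(a: np.ndarray, b: np.ndarray, p: float) -> float:
    return float((np.abs(a - b) ** p).sum() ** (1.0 / p))

def mahalanobis(a: np.ndarray, b: np.ndarray, cov: np.ndarray) -> float:
    d = a - b
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
```

  • The Minkowski distance reduces to the absolute value distance for p = 1 and to the Euclidean distance for p = 2, and the Mahalanobis distance reduces to the Euclidean distance when the covariance matrix is the identity.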
  • method 2 in the figure, that is, the method of clustering the M w vectors of the W space in the embodiment of the present application, obtains multiple original feature vectors by random sampling and clusters the sampled vectors to obtain four cluster centers (centroids). Among these four centroids, the image generated by the centroid closest to the target search result $w_t$ has the highest similarity with the image corresponding to the target search result, and the image generated by the centroid farthest from $w_t$ has the lowest similarity with the image corresponding to the target search result.
  • the clustering method of the embodiment of the present application can shorten the distance between the search starting point and the target search result, reduce the search difficulty, improve the search efficiency, and improve the efficiency of adjusting the initial offset parameter.
  • the feature distance between the center image and the image to be optimized is calculated, including:
  • the feature extraction network can be used to extract features from the center image and the image to be optimized, respectively, to obtain the first feature and the second feature, and the Euclidean distance or cosine distance between the first feature and the second feature can be calculated, and the center image corresponding to the first feature closest to the second feature of the image to be optimized can be determined as the target image.
  • if the feature distance between the K-th center image and the image to be optimized is the smallest, the centroid vector corresponding to that center image is the target feature.
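  • The selection of the target feature from the centroids can be sketched as follows. Everything network-related here is a stand-in: `generate` is a toy fixed random projection instead of a trained image generation network, `extract_features` is simple average pooling instead of a feature extraction network, and the test image is synthesized from one of the centroids so that the expected choice is known.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, image_size = 8, 16
G = rng.standard_normal((latent_dim, image_size * image_size))  # toy generator weights

def generate(w: np.ndarray) -> np.ndarray:
    """Stand-in for the image generation network: latent vector -> image."""
    return np.tanh(w @ G).reshape(image_size, image_size)

def extract_features(image: np.ndarray) -> np.ndarray:
    """Stand-in for the feature extraction network: 4x4 block averages, flattened."""
    return image.reshape(4, 4, 4, 4).mean(axis=(1, 3)).ravel()

def select_target_feature(centroids: np.ndarray, image_to_optimize: np.ndarray):
    """Return (index, centroid) whose center image is closest in feature distance."""
    second_feature = extract_features(image_to_optimize)
    distances = []
    for w in centroids:
        first_feature = extract_features(generate(w))      # feature of the center image
        distances.append(np.linalg.norm(first_feature - second_feature))  # Euclidean
    k = int(np.argmin(distances))
    return k, centroids[k]

centroids = rng.standard_normal((4, latent_dim))            # hypothetical N = 4 centroids
noise = 0.01 * rng.standard_normal((image_size, image_size))
image_to_optimize = generate(centroids[2]) + noise          # synthetic input near centroid 2
index, target_feature = select_target_feature(centroids, image_to_optimize)
```

  • Because the synthetic input was produced from the third centroid plus a small amount of noise, the procedure selects that centroid as the search starting point.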
  • the feature extraction network may refer to a neural network used for image feature extraction.
  • the feature extraction network may include one or more of a convolutional neural network (CNN), a feedforward neural network trained by backpropagation (BP), a recurrent neural network (RNN), etc.
  • the difference between the target feature and the image to be optimized can be calculated, and the target offset parameter can be obtained by adjusting the initial offset parameter according to the difference.
  • the input vector of the optimized image generated using the target offset parameter is close to the adjustment target (target search result).
  • the target feature can first be adjusted by the initial offset parameter to obtain the input vector of the image generation network; through multiple adjustments, the input vector continuously learns the implicit features (latent vectors) of the image to be optimized, so that it keeps changing and approaches the image to be optimized, which increases the authenticity of the optimized image.
  • the offset parameter may refer to a parameter used to adjust a feature to reduce the difference between the feature and the target.
  • the initial offset parameter may refer to an offset parameter set according to an application scenario or experience and used to adjust the target feature. For example, the initial offset parameter may be set to 0.
  • the target feature is determined from the preset image feature, there is a difference between it and the image to be optimized, so the impact of the difference on the generated optimized image can be reduced by introducing the initial offset parameter.
  • the target feature can be adjusted by introducing the offset term $w_{off}$.
  • the initial value of the offset term $w_{off}$ is the initial offset parameter. If the offset term $w_{off}$ is adjusted at least once, the value of $w_{off}$ obtained by the last adjustment can be used as the target offset parameter.
  • the initial offset parameters can be adjusted by calculating the loss value between the degraded second image generated by the target features and the initial offset parameters and the image to be optimized, so that the input vector used to generate the optimized image is closer to the adjustment target.
  • constraints on the initial offset parameters are added to limit the scope of the inversion search. Specifically, according to the image generation network, the target features and the image to be optimized, the initial offset parameters are adjusted to obtain the target offset parameters, including:
  • the image to be optimized and the second image are calculated to obtain an offset loss value
  • the initial offset parameters are adjusted to obtain the target offset parameters.
  • the constraints may include mandatory constraints such as equality constraints, direct truncation constraints (limiting the maximum and minimum ranges), soft constraints such as L1 constraints, L2 constraints, etc.
  • the loss value of the image to be optimized and the second image can be calculated by a loss function with constraints. Adding constraints to the loss function can prevent overfitting of model training, thereby enhancing generalization ability and avoiding distortion of the optimized image.
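  • the difference between a mandatory truncation constraint and a soft L1 or L2 constraint can be sketched as follows. This is an illustrative assumption about how such constraints are commonly applied, not the specific formulation of the embodiment; the function names, the weight, and the numeric values are hypothetical.

```python
def clamp_offset(w_off, low, high):
    """Hard (truncation) constraint: force every component into [low, high]."""
    return [min(max(v, low), high) for v in w_off]


def l1_penalty(w_off):
    """Soft L1 constraint term: sum of absolute values."""
    return sum(abs(v) for v in w_off)


def l2_penalty(w_off):
    """Soft L2 constraint term: sum of squares."""
    return sum(v * v for v in w_off)


def constrained_loss(data_loss, w_off, weight, penalty=l2_penalty):
    """Loss with a soft constraint: data term plus weighted penalty on the offset."""
    return data_loss + weight * penalty(w_off)


w_off = [0.5, -2.0, 3.0]                 # assumed offset values
clamped = clamp_offset(w_off, -1.0, 1.0)  # truncation limits the range directly
loss_l1 = constrained_loss(0.25, w_off, 0.1, l1_penalty)
loss_l2 = constrained_loss(0.25, w_off, 0.1, l2_penalty)
```

  • a truncation constraint changes the parameter itself, whereas a soft constraint only raises the loss when the offset grows, leaving the optimizer to trade fidelity against the size of the offset.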
  • the loss function may include, but is not limited to, one or a combination of a structural similarity index (SSIM) loss function, a learned perceptual image patch similarity (LPIPS) loss function, a mean square error (MSE) loss function, a square term loss function, etc.
  • the constraint condition on the initial offset parameter may refer to a condition for constraining the offset parameter.
  • the target features and initial offset parameters can be used as input vectors of the image generation network, and the image generation network outputs the first image.
  • the first image is degraded to obtain the second image, and the loss value between the second image and the image to be optimized (i.e., the offset loss value) is calculated.
  • the initial offset parameters are adjusted according to the offset loss value, so that the input vector used to generate the optimized image approaches the adjustment target and the second image approaches the image to be optimized, until the loss function converges.
  • a regularization method is used to limit the inversion search range, so that the target search result is close to the target result in color and texture, so that the generated image achieves a quality-distortion balance, and a high-quality image close to the input image is obtained.
  • the initial offset parameters can be iteratively optimized by the offset loss value until the loss function converges to obtain the target offset parameters, so that the input vector used to generate the optimized image is closer to the adjustment target through multiple iterations, so as to optimize and obtain more accurate offset parameters.
  • the initial offset parameters are adjusted to obtain the target offset parameters, including:
  • the initial offset parameter is adjusted to obtain the intermediate offset parameter
  • the intermediate offset parameters are determined as the initial offset parameters, and the process returns to repeat the steps from inputting the target features and the initial offset parameters into the image generation network to generate the first image, through adjusting the initial offset parameters according to the offset loss value to obtain the intermediate offset parameters, until the offset loss value converges; the offset parameters obtained by the last adjustment are determined as the target offset parameters.
  • the first image can be generated by combining the target features and the offset parameters adjusted last time in each iteration.
  • the generated first image is degraded to obtain the second image, the offset loss value is calculated from the second image and the image to be optimized through the loss function, and the offset parameters from the previous adjustment are then adjusted according to the loss value; this is repeated until the loss function converges, and the offset parameters obtained by the last adjustment are used as the target offset parameters.
  • the range of the offset parameter is limited by the regularized initial offset parameter to improve the efficiency and accuracy of adjusting the initial offset parameter.
  • the constraint condition on the initial offset parameter includes an offset parameter constraint item, and based on the constraint condition on the initial offset parameter, the image to be optimized and the second image are calculated to obtain an offset loss value, including:
  • the first loss term is constrained by the offset parameter constraint term to obtain the offset loss value.
  • regularization processing can refer to a method of adding constraints to the parameters to be optimized.
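  • the iterative adjustment of the offset parameter can be sketched with a toy example. Everything below is an assumption made for illustration: `generate` is a hypothetical elementwise stand-in for the image generation network, `degrade` is a hypothetical 2x averaging downsampler, the target feature and offset are combined by addition, the first loss term is a mean squared error, the constraint term is an L2 penalty on the offset, and the gradient is estimated by finite differences rather than backpropagation.

```python
def generate(w):
    """Hypothetical stand-in for the image generation network."""
    return [2.0 * v + 1.0 for v in w]


def degrade(image):
    """Hypothetical degradation: average each pair of neighbouring pixels."""
    return [(image[i] + image[i + 1]) / 2.0 for i in range(0, len(image) - 1, 2)]


def offset_loss(w_target, w_off, degraded_input, reg_weight):
    """First loss term (MSE on degraded output) plus L2 constraint on the offset."""
    first_image = generate([t + o for t, o in zip(w_target, w_off)])
    second_image = degrade(first_image)
    first_loss_term = sum(
        (a - b) ** 2 for a, b in zip(second_image, degraded_input)
    ) / len(degraded_input)
    constraint_term = sum(o * o for o in w_off)
    return first_loss_term + reg_weight * constraint_term


def numeric_gradient(f, x, eps=1e-6):
    """Central finite-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        upper = x[:i] + [x[i] + eps] + x[i + 1:]
        lower = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(upper) - f(lower)) / (2.0 * eps))
    return grad


def optimize_offset(w_target, degraded_input, reg_weight=0.01, lr=0.1,
                    tol=1e-10, max_iters=5000):
    """Gradient descent on the offset only; the generator stays fixed."""
    w_off = [0.0] * len(w_target)  # initial offset parameter set to 0
    loss_fn = lambda o: offset_loss(w_target, o, degraded_input, reg_weight)
    previous = loss_fn(w_off)
    for _ in range(max_iters):
        grad = numeric_gradient(loss_fn, w_off)
        w_off = [o - lr * g for o, g in zip(w_off, grad)]
        current = loss_fn(w_off)
        if abs(previous - current) < tol:  # treat a flat loss as convergence
            break
        previous = current
    return w_off


w_target = [0.0, 0.0, 1.0, 1.0]   # assumed target feature
degraded_input = [2.0, 4.0]       # assumed image to be optimized
w_off_star = optimize_offset(w_target, degraded_input)
```

  • in this toy setting the offset that is returned plays the role of the target offset parameter; the L2 term keeps it slightly smaller than the value that would fit the degraded input exactly.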
  • the embodiment of the present application can use the target feature as the starting point of the inversion search, and continuously adjust the offset parameter through the inversion search according to the image to be optimized to obtain the target offset parameter. Then, the target feature and the target offset parameter are used to obtain the input vector for generating the optimized image, and the optimized image is generated from the input vector. It is understandable that if the optimized image is generated only with features in the image to be optimized, the ability to control the visual features in the image is limited by the features in the image to be optimized. However, the embodiment of the present application is based on the target features determined by the preset image features, which can reduce the correlation between the features, improve the ability to control the visual features in the image, and improve the image quality.
  • the optimized image can be generated by the generation network as $I_{syn} = G_{Synthesis}(w + w_{off})$, where $w$ denotes the target feature, $w_{off}$ denotes the target offset parameter, and $G_{Synthesis}$ represents the image generation network.
  • after the offset parameter is adjusted, it can be fixed while the image generation network is adjusted, so as to optimize the image generation network and improve the quality of the generated optimized image.
  • after adjusting the initial offset parameter according to the offset loss value to obtain the target offset parameter, the method further includes:
  • according to the target features, the target offset parameters and the image to be optimized, the network parameters of the image generation network are adjusted to obtain the adjusted image generation network.
  • the adjusted image generation network can be determined as the initial image generation network, and the process returns to the step of adjusting the initial offset parameters according to the image generation network, the target features and the image to be optimized to obtain the target offset parameters. In this way, the step of adjusting the network parameters of the image generation network according to the target features, the target offset parameters and the image to be optimized, and the step of adjusting the initial offset parameters according to the image generation network, the target features and the image to be optimized, are executed alternately until the preset end condition is met.
  • the preset end condition may be an end condition set according to an application scenario.
  • the preset end condition may be that the number of times the above steps are alternately executed reaches a threshold, or that the loss function in the process of adjusting the initial offset parameters and/or the loss function in the process of adjusting the network parameters of the image generation network converges to a loss threshold or is equal to zero, etc.
  • the initial offset parameters and/or the network parameters of the image generation network may be adjusted once or multiple times in each alternating process.
  • the initial offset parameters can be adjusted once based on the target features and the image to be optimized to obtain the target offset parameters, and then the network parameters of the image generation network can be adjusted once based on the target features, the target offset parameters and the image to be optimized to obtain the adjusted image generation network, and then the target offset parameters can be used as the initial offset parameters, and the adjusted image generation network can be used as the image generation network, and the process of adjusting the initial offset parameters and the network parameters of the image generation network can be repeated, and so on, the process of adjusting the initial offset parameters and the image generation network can be repeated alternately until the loss function converges.
  • the initial offset parameters may be iteratively adjusted multiple times until a preset number of iterations is met or the loss function corresponding to the offset loss value converges to a first loss threshold
  • the network parameters of the image generation network may be iteratively adjusted multiple times until a preset number of iterations is met or the loss function corresponding to the network loss value converges to a second loss threshold.
  • the process of adjusting the initial offset parameters and the image generation network is repeated alternately until the number of times the above steps are alternately performed reaches a threshold, or the loss function corresponding to the offset loss value and the loss function corresponding to the network loss value converge to a third loss threshold.
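  • the alternating schedule and its end conditions can be sketched on a scalar toy problem. The model `theta * (w_target + w_off)`, the squared-error loss, the learning rates, and the step counts are all hypothetical choices for illustration; only the control flow (fix one set of parameters, adjust the other, stop on a round limit or a loss threshold) reflects the description above.

```python
def loss(theta, w_off, w_target, observed):
    """Squared error between the toy generator output and the observation."""
    return (theta * (w_target + w_off) - observed) ** 2


def adjust_offset(theta, w_off, w_target, observed, lr, steps):
    """Stage 1: network parameter theta is fixed, only the offset is updated."""
    for _ in range(steps):
        residual = theta * (w_target + w_off) - observed
        w_off -= lr * 2.0 * residual * theta
    return w_off


def adjust_network(theta, w_off, w_target, observed, lr, steps):
    """Stage 2: the offset is fixed, only the network parameter is updated."""
    for _ in range(steps):
        residual = theta * (w_target + w_off) - observed
        theta -= lr * 2.0 * residual * (w_target + w_off)
    return theta


def alternate(w_target, observed, theta=1.0, max_rounds=50, loss_threshold=1e-8):
    """Alternate the two stages until a round limit or a loss threshold is hit."""
    w_off = 0.0
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        w_off = adjust_offset(theta, w_off, w_target, observed, lr=0.1, steps=5)
        theta = adjust_network(theta, w_off, w_target, observed, lr=0.05, steps=5)
        if loss(theta, w_off, w_target, observed) <= loss_threshold:
            break
    return theta, w_off, rounds


theta, w_off, rounds = alternate(w_target=1.0, observed=3.0)
```

  • each stage here runs several inner steps per round, which corresponds to adjusting the parameters multiple times in each alternating process; setting `steps=1` would correspond to a single adjustment per round.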
  • the loss value between the degraded fourth image, generated from the target feature and the target offset parameters, and the image to be optimized can be calculated and used to adjust the parameters of the image generation network, so as to continuously optimize the image generation network.
  • constraints on the initial offset parameters are added to limit the parameter range to avoid distortion of the optimized image due to overfitting.
  • the image to be optimized and the fourth image are calculated to obtain a network loss value
  • the network parameters of the image generation network are adjusted to obtain an adjusted image generation network, and the adjusted image generation network is used to generate an optimized image.
  • the constraint conditions on the image generation network may refer to conditions for constraining network parameters of the image generation network.
  • the offset parameter can be fixed and only the parameters of the image generation network can be optimized.
  • the target features and the target offset parameters can be used as input vectors of the image generation network, and the image generation network outputs the third image.
  • the third image is degraded to obtain the fourth image, and the loss value between the fourth image and the image to be optimized (i.e., the network loss value) is calculated.
  • the network parameters of the image generation network are adjusted according to the network loss value until the loss function converges.
  • the network parameters of the image generation network can be iteratively adjusted by the network loss value until the loss function converges to obtain an adjusted image generation network, so as to obtain a better image generation network.
  • the network parameters of the image generation network are adjusted to obtain the adjusted image generation network, including:
  • according to the network loss value, the network parameters of the current image generation network are adjusted to obtain the intermediate image generation network;
  • the intermediate image generation network is determined as the current image generation network, and the process returns to repeat the steps from inputting the target features and the target offset parameters into the image generation network to generate the third image, through adjusting the network parameters of the current image generation network according to the network loss value to obtain the intermediate image generation network, until the network loss value converges; the intermediate image generation network obtained by the last adjustment is determined as the adjusted image generation network.
  • the current image generation network may refer to an image generation network whose network parameters are currently to be adjusted during the adjustment process.
  • the target features and target offset parameters can be input into the current image generation network in each iteration process to generate a third image.
  • the generated third image is degraded to obtain the fourth image, the network loss value is calculated from the fourth image and the image to be optimized through the loss function, and the network parameters of the current image generation network are then adjusted according to the loss value.
  • the next iteration process is started, and the image generation network adjusted in the last iteration process is used as the current image generation network, and so on, until the loss function converges, and the image generation network obtained by the last adjustment is used as the adjusted image generation network.
  • the range of network parameters can be limited by the difference between the initial image generation network and the current image generation network to improve the efficiency and accuracy of adjusting the network parameters.
  • the constraints on the current image generation network include network constraints, and based on the constraints on the image generation network, the image to be optimized and the fourth image are calculated to obtain a network loss value, including:
  • the second loss term is constrained by the network constraint term to obtain the network loss value.
  • the initial image generation network may refer to an image generation network whose network parameters have not been adjusted. For example, in an image generation network adjusted through multiple iterations, the current image generation network in the first iteration is the initial image generation network.
  • the network constraint term may be determined by comparing the difference between the image generated by the initial image generation network and the image generated by the current image generation network. Specifically, the output result of the initial image generation network and the output result of the current image generation network are calculated to obtain the network constraint term, including:
  • the initial image and the current image are calculated to obtain the network constraints.
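  • how a network constraint term can bound the drift of the network parameters may be sketched as follows. The mean squared error used for both terms, the weight, and the pixel values are assumptions for illustration; the embodiment only states that the constraint term is computed from the image generated by the initial image generation network and the image generated by the current image generation network.

```python
def mean_squared_error(a, b):
    """Mean squared difference between two flattened images of equal size."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def network_loss(degraded_input, fourth_image, initial_image, current_image,
                 constraint_weight):
    """Second loss term plus a weighted network constraint term."""
    # Fidelity: the degraded output should match the image to be optimized.
    second_loss_term = mean_squared_error(degraded_input, fourth_image)
    # Constraint: the current network should stay close to the initial network
    # when both are fed the same input vector.
    network_constraint_term = mean_squared_error(initial_image, current_image)
    return second_loss_term + constraint_weight * network_constraint_term


# Assumed toy pixel values.
degraded_input = [1.0, 2.0]
fourth_image = [1.0, 3.0]
initial_image = [0.0, 0.0, 0.0, 0.0]
current_image = [1.0, 1.0, 1.0, 1.0]

total = network_loss(degraded_input, fourth_image, initial_image, current_image,
                     constraint_weight=0.1)
```

  • a larger `constraint_weight` keeps the adjusted network closer to the initial one, at the cost of a looser fit to the image to be optimized.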
  • the image optimization solution provided by the embodiment of the present application can be applied in various image optimization scenarios. For example, taking image restoration as an example, an image generation network, an image to be optimized, and multiple preset image features are obtained; a target feature is selected from the multiple preset image features, and the target feature and the image to be optimized meet the preset similarity condition; according to the image generation network, the target feature, and the image to be optimized, the initial offset parameter is adjusted to obtain the target offset parameter; the target feature and the target offset parameter are input into the image generation network to generate an optimized image.
  • the embodiment of the present application selects the target feature corresponding to the image to be optimized from a plurality of preset image features, and can use the target feature as a starting point, combined with the target offset parameter, to determine the feature used to generate the optimized image, so as to generate the optimized image.
  • the target feature determined by the preset image feature it is possible to reduce the correlation between the features, improve the ability to control the visual features in the image, and improve the image optimization effect; by adjusting the initial offset parameter, the input vector used to generate the optimized image is brought closer to the adjustment target, the authenticity of the optimized image is increased, and the image optimization effect is improved.
  • the target feature and the image to be optimized meet the preset similarity condition, which can reduce the distance between the target feature and the feature of the optimized image, reduce the difficulty of adjusting the initial offset parameter, and improve the image optimization efficiency.
  • the method of the embodiment of the present application is described in detail by taking image optimization with the StyleGAN-XL network as an example.
  • the StyleGAN-XL network is a generative adversarial network that can generate high-resolution and rich images.
  • the embodiment of the present application uses the StyleGAN-XL network as the image generation network.
  • the StyleGAN-XL network may include a Mapping network and a Synthesis network.
  • the mapping network can be used to transform the z vector into the w vector, and the generation network can be used to generate an image.
  • the generation network is the image generation network in the embodiment of the present application.
  • the StyleGAN-XL network used in the embodiment of the present application is pre-trained on ImageNet, that is, the image generation network can generate images of a corresponding category according to a specified ImageNet category.
  • ImageNet is a large-scale visual database for visual object recognition software research.
  • the image to be optimized is denoted $I_d$, where $I_d = D(I)$, $D(\cdot)$ is the degradation process, and $I$ is the high-definition image.
  • $G_{Synthesis}$ represents the generation network of StyleGAN-XL.
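  • first, the initial search starting point, that is, the initial centroid (i.e., the target feature), is determined.
  • M w vectors in the W space (i.e., multiple preset image features) can be obtained through the Mapping network $G_{Mapping}$ of StyleGAN-XL: $w_i = G_{Mapping}(z_i, c)$, where $z_i$ follows a Gaussian distribution and $c$ is the specified category.
  • the M w vectors are clustered to obtain N centroids (i.e., center features).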
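  • clustering the M w vectors into N centroids can be sketched with a plain k-means loop. The choice of k-means, the deterministic initialization from the first N vectors, and the toy two-dimensional vectors are assumptions for illustration; the embodiment does not fix a particular clustering algorithm in this passage.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, n_clusters, n_iters=20):
    """Cluster vectors and return the centroids (center features)."""
    # Deterministic initialization (assumption): first n_clusters vectors.
    centroids = [list(v) for v in vectors[:n_clusters]]
    for _ in range(n_iters):
        clusters = [[] for _ in range(n_clusters)]
        for v in vectors:
            nearest = min(
                range(n_clusters),
                key=lambda i: squared_distance(v, centroids[i]),
            )
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                dim = len(members[0])
                centroids[i] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return centroids


# Toy stand-ins for w vectors (real ones would come from the Mapping network).
w_vectors = [
    [0.0, 0.0], [10.0, 10.0], [0.0, 1.0],
    [10.0, 11.0], [1.0, 0.0], [11.0, 10.0],
]
centroids = kmeans(w_vectors, n_clusters=2)
```

  • each centroid would then be fed to the generation network to produce a center image, and the center images are compared with the image to be optimized to pick the starting point.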
  • the feature space can be used to measure the distance between two images.
  • the Visual Geometry Group (VGG) network can be used to extract the features of the image, and then the Euclidean distance or cosine distance of the extracted features can be calculated to find the image that is "closest" to the input image.
  • if the k-th image is the "closest" image, then the vector corresponding to that image is the latent vector to be optimized (i.e., the initial search starting point).
  • the embodiment of the present application does not directly optimize the initial latent vector. Instead, the latent vector is fixed, an offset term $w_{off}$ is introduced, and the offset term is optimized.
  • the initial value of the offset term is the initial offset parameter.
  • the input latent vector can be obtained from the initial latent vector and the offset term.
  • the latent vector can be used as the input vector of the image generation network, which is iteratively trained to output the image.
  • according to the target features, the target offset parameters and the image to be optimized, the network parameters of the image generation network are adjusted to obtain an adjusted image generation network.
  • the iterative training can be divided into two stages.
  • in the first stage, the network parameter $\theta$ of the image generation network $G_{Synthesis}$ can be fixed, and only the $w_{off}$ parameter (offset term) is optimized, that is, step 240.
  • in the second stage, the $w_{off}$ parameter (offset term) can be fixed, and only the network parameter $\theta$ is optimized, that is, step 250.
  • the two stages are repeated alternately until the loss function converges, and the training is stopped.
  • $L_{LPIPS}$ is the function for calculating the LPIPS index
  • $L_2$ is the square loss function
  • $\lambda_1$ and $\lambda_2$ are hyperparameters.
  • $L_{ft} = L_{LPIPS}(I_d, D(I_{syn})) + \lambda_{L2} L_2(I_d, D(I_{syn})) + \lambda_R L_R$;
  • $\lambda_{L2}$ and $\lambda_R$ are hyperparameters
  • $L_R$ is a local regularization term, which is expressed as follows:
  • the specific implementation process of the first stage can refer to the process shown in Figure 1d, and the specific implementation process of the second stage can refer to the process shown in Figure 1e, as well as the corresponding descriptions in the aforementioned embodiments, which will not be repeated here.
  • the image generated by the last iteration can be used as the optimized image. It can be understood that in the last iteration, the image generation network that receives the latent vector is the adjusted image generation network, and the parameter value corresponding to the offset term in the latent vector input to the adjusted image generation network is the target offset parameter.
  • each row represents the input images with different degradation conditions, and different methods are used to invert them to obtain the optimized images output by the StyleGAN-XL network.
  • the first row represents the removal of a piece of information in the middle of the image, and the filling of the missing information in the middle through the inversion technology;
  • the second row represents the removal of the color information of the image, and the filling of the color of the image through the inversion technology;
  • the third row represents the downsampling of the image into a low-resolution image, and the generation of the corresponding high-resolution image through the inversion technology.
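  • the three degradation types described above (removing a region in the middle, removing color, and downsampling) can each be written as a simple degradation function. The sketch below uses nested lists as images; the function names, the zero fill value, the channel-average grayscale conversion, and the box-average downsampling are illustrative assumptions rather than the degradations used in the reported experiments.

```python
def mask_center(image, size):
    """Inpainting-style degradation: blank a size x size block in the middle."""
    height, width = len(image), len(image[0])
    top, left = (height - size) // 2, (width - size) // 2
    return [
        [
            0.0 if top <= r < top + size and left <= c < left + size
            else image[r][c]
            for c in range(width)
        ]
        for r in range(height)
    ]


def remove_color(rgb_image):
    """Colorization-style degradation: average the three channels per pixel."""
    return [[sum(pixel) / 3.0 for pixel in row] for row in rgb_image]


def downsample(image, factor):
    """Super-resolution-style degradation: box-average factor x factor blocks."""
    height, width = len(image), len(image[0])
    result = []
    for r in range(0, height - height % factor, factor):
        row = []
        for c in range(0, width - width % factor, factor):
            block = [
                image[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


# 4 x 4 single-channel test image with values 0..15.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
masked = mask_center(image, 2)
small = downsample(image, 2)
gray = remove_color([[(3.0, 6.0, 9.0)]])
```

  • during inversion, whichever of these functions matches the input image would be applied to the generated image before it is compared with the image to be optimized.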
  • FIG. 2e shows the comparison results of different optimization methods on different restoration tasks and different indicators.
  • This figure compares the indicators of the image optimization method of the embodiment of the present application, the DGP method based on the StyleGAN-XL network, and the PTI method based on the StyleGAN-XL network, and compares them on three different image degradation restoration tasks, including image completion (inpainting), image colorization (colorization), and image super-resolution (SR).
  • the indicators include LPIPS (image perceptual similarity), FID (image quality assessment), and NIQE (no-reference image evaluation).
  • the image obtained by inversion of the existing optimization method is quite different from the actual target result (reference image), especially for the case where the input image is a degraded image, these searched inversion results are often poor.
  • the DGP method is inverted on BigGAN (large-scale generative adversarial network), which can only generate images with a resolution of 256 × 256, and the DGP method is not effective when used on other generative networks.
  • the embodiment of the present application uses the generative network of the StyleGAN-XL network as the image generation network, which can generate high-resolution and rich images. By inverting the network, the corresponding input vector can be inverted for any image and the corresponding high-quality and high-resolution image can be generated.
  • a degraded image refers to an image with noise, color loss, detail loss, low resolution, etc.
  • the corresponding input vector in the latent space can be found, so that the input vector is sent to the generation network, and a similar and high-quality image (i.e., an optimized image) can be generated.
  • the embodiment of the present application also provides an image optimization device, which can be integrated in an electronic device, and the electronic device can be a terminal, a server, etc.
  • the terminal can be a mobile phone, a tablet computer, a smart Bluetooth device, a laptop, a personal computer, etc.
  • the server can be a single server or a server cluster composed of multiple servers.
  • the image optimization device may include an acquisition unit 310, a determination unit 320, an adjustment unit 330, and a generation unit 340, as follows:
  • the acquisition unit 310 is used to obtain the image generation network, the image to be optimized, and multiple preset image features.
  • the acquisition unit 310 has a function that can be used to:
  • a plurality of original features are mapped into a preset feature space to obtain a plurality of preset image features.
  • the acquisition unit 310 has a function that can be used to:
  • the determining unit 320 may be specifically configured to:
  • Clustering is performed on a plurality of preset image features to obtain a plurality of feature clusters, wherein the feature clusters include a central feature;
  • selecting a target feature from central features of multiple feature clusters includes:
  • the central feature corresponding to the target image is determined as the target feature.
  • determining a target image from a center image includes:
  • the central image with the shortest feature distance to the image to be optimized is determined as the target image.
  • the target feature and the initial offset parameter are input into the image generation network, and the initial offset parameter is adjusted according to the difference between the output of the image generation network and the image to be optimized to obtain the target offset parameter.
  • the adjustment unit 330 may be specifically configured to:
  • the image to be optimized and the second image are calculated to obtain an offset loss value
  • the initial offset parameters are adjusted to obtain the target offset parameters.
  • the constraint condition on the initial offset parameter includes an offset parameter constraint item, and based on the constraint condition on the initial offset parameter, the image to be optimized and the second image are calculated to obtain an offset loss value, including:
  • the first loss term is constrained by the offset parameter constraint term to obtain the offset loss value.
  • the adjustment unit 330 may also be configured to:
  • the image to be optimized and the fourth image are calculated to obtain a network loss value
  • the network parameters of the image generation network are adjusted to obtain an adjusted image generation network, and the adjusted image generation network is used to generate an optimized image.
  • the constraints on the current image generation network include network constraints, and based on the constraints on the image generation network, the image to be optimized and the fourth image are calculated to obtain a network loss value, including:
  • the output results of the initial image generation network and the output results of the current image generation network are calculated to obtain the network constraint term;
  • the second loss term is constrained by the network constraint term to obtain the network loss value.
  • the output result of the initial image generation network and the output result of the current image generation network are calculated to obtain network constraint items, including:
  • the initial image and the current image are calculated to obtain the network constraints.
  • the above units can be implemented as independent entities, or can be arbitrarily combined to be implemented as the same or several entities.
  • the specific implementation of the above units can refer to the previous method embodiments, which will not be repeated here.
  • the embodiment of the present application can select the target features corresponding to the image to be optimized from multiple preset image features, and obtain the target offset parameters through adjustment.
  • the target features can be combined with the target offset parameters to generate an optimized image to improve the image optimization effect.
  • the embodiment of the present application also provides an electronic device, which can be a terminal, a server, etc.
  • the terminal can be a mobile phone, a tablet computer, a smart Bluetooth device, a laptop, a personal computer, etc.
  • the server can be a single server or a server cluster composed of multiple servers, etc.
  • the image optimization device may also be integrated into multiple electronic devices.
  • the image optimization device may be integrated into multiple servers, and the image optimization method of the present application may be implemented by multiple servers.
  • this embodiment is described in detail by taking a server as an example of the electronic device.
  • FIG. 4 shows a schematic diagram of the structure of the server involved in the embodiment of the present application. Specifically:
  • the server may include a processor 410 with one or more processing cores, a memory 420 with one or more computer-readable storage media, a power supply 430, an input module 440, and a communication module 450.
  • the processor 410 is the control center of the server, and uses various interfaces and lines to connect various parts of the entire server. It executes various functions of the server and processes data by running or executing software programs and/or modules stored in the memory 420, and calling data stored in the memory 420.
  • the processor 410 may include one or more processing cores; in some embodiments, the processor 410 may integrate an application processor and a modem processor, wherein the application processor mainly processes the operating system, user interface, and application programs, and the modem processor mainly processes wireless communications. It is understandable that the above-mentioned modem processor may not be integrated into the processor 410.
  • the memory 420 can be used to store software programs and modules.
  • the processor 410 executes various functional applications and data processing by running the software programs and modules stored in the memory 420.
  • the memory 420 can mainly include a program storage area and a data storage area.
  • the program storage area can store an operating system, an application required for at least one function (such as a sound playback function, an image playback function, etc.), etc.; the data storage area can store data created according to the use of the server, etc.
  • the memory 420 can include a high-speed random access memory and can also include a non-volatile memory, such as at least one Accordingly, the memory 420 may further include a memory controller to provide the processor 410 with access to the memory 420 .
  • the server also includes a power supply 430 for supplying power to various components.
  • the power supply 430 can be logically connected to the processor 410 through a power management system, so as to manage charging, discharging, and power consumption through the power management system.
  • the power supply 430 can also include any components such as one or more DC or AC power supplies, recharging systems, power failure detection circuits, power converters or inverters, and power status indicators.
  • the server may further include an input module 440, which may be used to receive input digital or character information and generate keyboard, mouse, joystick, optical or trackball signal input related to user settings and function control.
  • the server may further include a communication module 450.
  • the communication module 450 may include a wireless module.
  • the server may perform short-range wireless transmission through the wireless module of the communication module 450, thereby providing wireless broadband Internet access to the user.
  • the communication module 450 may be used to help the user send and receive emails, browse web pages, and access streaming media.
  • the server may further include a display unit, etc., which will not be described in detail herein.
  • the processor 410 in the server will load the executable files corresponding to the processes of one or more application programs into the memory 420 according to the following instructions, and the processor 410 will run the application programs stored in the memory 420, thereby realizing various functions, as follows:
  • An image generation network, an image to be optimized, and a plurality of preset image features are obtained; a target feature is selected from the plurality of preset image features, the target feature and the image to be optimized satisfying a preset similarity condition; the target feature and an initial offset parameter are input into the image generation network, and the initial offset parameter is adjusted according to the difference determined between the output of the image generation network and the image to be optimized, to obtain a target offset parameter; the target feature and the target offset parameter are input into the image generation network to generate an optimized image.
  • the embodiment of the present application can select the target features corresponding to the image to be optimized from multiple image features, and obtain the target offset parameters through adjustment.
  • the target features can be combined with the target offset parameters to generate an optimized image to improve the image optimization effect.
  • an embodiment of the present application provides a computer-readable storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute the steps in any one of the image optimization methods provided in the embodiments of the present application.
  • the instructions can execute the following steps:
  • An image generation network, an image to be optimized, and a plurality of preset image features are obtained; a target feature is selected from the plurality of preset image features, the target feature and the image to be optimized satisfying a preset similarity condition; the target feature and an initial offset parameter are input into the image generation network, and the initial offset parameter is adjusted according to the difference determined between the output of the image generation network and the image to be optimized, to obtain a target offset parameter; the target feature and the target offset parameter are input into the image generation network to generate an optimized image.
  • the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, etc.
  • according to one aspect of the present application, a computer program product is provided, comprising a computer program stored in a computer-readable storage medium.
  • a processor of a computer device reads the computer program from the computer-readable storage medium, and the processor executes the computer program, so that the computer device executes the method provided in various optional implementations provided in the above embodiments.
  • because the computer program stored in the storage medium can execute the steps of any image optimization method provided in the embodiments of the present application, it can achieve the beneficial effects of any such method; see the previous embodiments for details, which are not repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

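Embodiments of the present application disclose an image optimization method, apparatus, electronic device, medium, and program product, applicable to artificial intelligence fields based on technologies such as computer vision. An image generation network, an image to be optimized, and a plurality of preset image features are obtained; a target feature that satisfies a preset similarity condition with the image to be optimized is selected from the plurality of preset image features; an initial offset parameter is adjusted according to the image generation network, the target feature, and the image to be optimized, to obtain a target offset parameter; and the target feature and the target offset parameter are input into the image generation network to generate an optimized image. Because the target feature corresponding to the image to be optimized is selected from the preset image features and the target offset parameter is obtained by adjustment, the optimized image can be generated from the target feature combined with the target offset parameter, which improves the image optimization effect.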

Description

图像优化方法、装置、电子设备、介质和程序产品
本申请要求于2022年10月13日提交中国专利局、申请号为202211252059.0、申请名称为“图像优化方法、装置、电子设备、介质和程序产品”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及图像处理技术领域,具体涉及图像优化。
背景技术
图像在成像、传输、获取的过程中,会受到外界的干扰、传输设备不完善等因素的影响,使得图像有噪声、色彩缺失、细节缺失、分辨率低等问题,导致图像质量较低。为了提升图像的质量,就需要对图像进行优化处理。
然而,现有的图像优化方法,例如对图像的噪声、模糊进行修复的方法,优化效果不佳。
发明内容
本申请实施例提供一种图像优化方法、装置、电子设备、介质和程序产品,可以提升图像的优化效果。
本申请实施例提供一种图像优化方法,包括:获取图像生成网络、待优化图像以及多个预设的图像特征;从所述多个预设的图像特征中,选取目标特征,所述目标特征与所述待优化图像满足预设相似度条件;将所述目标特征以及所述初始偏移参数输入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数;将所述目标特征以及所述目标偏移参数输入所述图像生成网络,生成优化后的图像。
本申请实施例还提供一种图像优化装置,包括:获取单元,用于获取图像生成网络、待优化图像以及多个预设的图像特征;确定单元,用于从所述多个预设的图像特征中,选取目标特征,所述目标特征与所述待优化图像满足预设相似度条件;调整单元,用于将所述目标特征以及所述初始偏移参数输入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数;生成单元,用于将所述目标特征以及所述目标偏移参数输入所述图像生成网络,生成优化后的图像。
本申请实施例还提供一种电子设备,包括处理器和存储器,所述存储器存储有多条指令;所述处理器从所述存储器中加载指令,以执行本申请实施例所提供的任一种图像优化方法中的步骤。
本申请实施例还提供一种计算机可读存储介质,所述计算机可读存储介质存储有多条指令,所述指令适于处理器进行加载,以执行本申请实施例所提供的任一种图像优化方法中的步骤。
本申请实施例还提供一种计算机程序产品,包括计算机程序,所述计算机程序被处理器执行时实现本申请实施例所提供的任一种图像优化方法中的步骤。
本申请实施例可以获取图像生成网络、待优化图像以及多个预设的图像特征;从所述多个预设的图像特征中,选取目标特征,所述目标特征与所述待优化图像满足预设相似度 条件;将所述目标特征以及所述初始偏移参数输入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数;将所述目标特征以及所述目标偏移参数输入所述图像生成网络,生成优化后的图像。
在本申请中,从多个预设的图像特征中选取对应待优化图像的目标特征,可以以目标特征为起点,结合目标偏移参数,确定用于生成优化后的图像的特征,以生成优化后的图像。其中,基于由预设的图像特征确定的目标特征,能够减少特征之间的关联性,提升对图像中视觉特征的控制能力,以提升图像的优化效果;通过调整初始偏移参数,使用于生成优化后的图像的输入向量向调整目标靠近,增加优化后图像的真实性,提升图像的优化效果。而且,目标特征与待优化图像满足预设相似度条件,能够减小目标特征与优化后的图像的特征之间的距离,减小调整初始偏移参数的难度,提升图像优化效率。
附图说明
图1a是本申请实施例提供的图像优化方法的场景示意图;
图1b是本申请实施例提供的图像优化方法的流程示意图;
图1c是以不同方法进行反演搜索的示意图;
图1d是本申请实施例提供的调整初始偏移参数的流程示意图;
图1e是本申请实施例提供的调整图像生成网络的网络参数的流程示意图;
图2a是本申请实施例提供的StyleGAN-XL网络的结构示意图;
图2b是本申请另一个实施例提供的图像优化方法的流程示意图;
图2c是本申请实施例提供的迭代训练过程的示意图;
图2d是本申请实施例提供的不同优化方法生成的优化后的图像的示意图;
图2e是本申请实施例提供的不同优化方法在不同修复任务以及不同指标上的对比结果的示意图;
图3是本申请实施例提供的图像优化装置的结构示意图;
图4是本申请实施例提供的计算机设备的结构示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请实施例提供一种图像优化方法、装置、电子设备、介质和程序产品。
其中,该图像优化装置具体可以集成在电子设备中,该电子设备可以为终端、服务器等设备。其中,终端可以为手机、平板电脑、智能蓝牙设备、笔记本电脑、或者个人电脑(Personal Computer,PC)等设备;服务器可以是单一服务器,也可以是由多个服务器组成的服务器集群。
在一些实施例中,该图像优化装置还可以集成在多个电子设备中,比如,图像优化装置可以集成在多个服务器中,由多个服务器来实现本申请的图像优化方法。
在一些实施例中,服务器也可以以终端的形式来实现。
例如,参考图1a,该图像优化方法可以由图像优化装置实现,该图像优化装置可以集成在服务器中,该服务器可以获取图像生成网络、待优化图像以及多个预设的图像特征;从多个预设的图像特征中,选取目标特征,目标特征与待优化图像满足预设相似度条件;根据图像生成网络、目标特征以及待优化图像,调整初始偏移参数,得到目标偏移参数;将目标特征以及目标偏移参数输入图像生成网络,生成优化后的图像。
图像的生成过程是把一个输入向量(输入特征)转变成一张高质量图像,图像反演则是通过一张输入图像(不一定需要高质量的图像)推算(搜索)出对应的输入向量,这个过程叫做反演。这个输入向量输入到图像生成网络中,就可以生成出跟输入图像相似且高质量的图像。图像生成网络可以指能用于生成图像的神经网络,图像生成网络可以通过对输入向量进行解码以重建输入向量对应的高质量图像。一张低质量图像的输入向量可以包括随机的噪声或条件向量,因此,可以将一张低质量图像,通过反演技术推算出在图像生成网络下的输入向量,再由图像生成网络对该输入向量进行处理以生成对应的高质量图像,实现图像修复等应用。
本申请实施例的图像优化方法可以通过待优化图像以及多个预设的图像特征反演搜索得到图像生成网络的输入向量(即结合目标特征以及偏移参数得到的特征向量),以根据该输入向量生成优化后的图像。本申请实施例的图像优化方法可以应用于基于计算机视觉等技术的人工智能领域,具体可以应用于图像超分辨率,图像修复,图像增强,图像编辑等领域。
具体地,本申请实施例可以以对应待优化图像的目标特征作为反演搜索的起点,通过调整初始偏移参数搜索得到目标搜索结果,并将该目标搜索结果作为图像生成网络的输入向量。再该将输入向量输入图像生成网络,由图像生成网络生成高质量的优化后的图像。
以下分别进行详细说明。可以理解的是,在本申请的具体实施方式中,涉及到与用户相关的图像等相关的数据,当本申请实施例运用到具体产品或技术中时,任一项数据均需要单独获得用户的许可或者同意,且相关数据的收集、使用和处理需要遵守相关国家和地区的相关法律法规和标准。
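Artificial Intelligence (AI) is a technology that uses digital computers to simulate how humans perceive the environment, acquire knowledge, and use knowledge, giving machines human-like perception, reasoning, and decision-making capabilities. Basic AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operating/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving, and intelligent transportation.
Computer Vision (CV) is a technology that uses computers in place of the human eye to recognize and measure the optimized image and to process it further. Computer vision technologies usually include image generation, image recognition, image semantic understanding, image retrieval, virtual reality, augmented reality, simultaneous localization and mapping, autonomous driving, and intelligent transportation, as well as common biometric recognition technologies such as face recognition and fingerprint recognition. Image generation includes, for example, image colorization and image outline extraction.
As AI technology is researched and advances, it is being studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned and autonomous driving, drones, robots, smart healthcare, smart customer service, the Internet of Vehicles, and intelligent transportation. As the technology develops, AI is expected to be applied in more fields and to deliver increasing value.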
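This embodiment provides an image optimization method involving artificial intelligence. As shown in FIG. 1b, the method may proceed as follows:
110. Obtain an image generation network, an image to be optimized, and a plurality of preset image features.
The image to be optimized may be a low-quality image or an image whose quality needs to be improved. For example, it may suffer from noise, missing color, missing detail, or low resolution, which lower its quality. The present application does not limit the type of the image to be optimized; it may include, but is not limited to, a face image, an animal image, a building image, or a landscape image.
The image generation network may be a neural network that can be used to generate images, for example one or more of a convolutional neural network (CNN), a variational autoencoder (VAE), and a generative adversarial network (GAN). For example, the image generation network may be the generation network of a generative adversarial network.
A generative adversarial network (GAN) is a framework that combines a generation network and a discrimination network; feeding a Gaussian random vector into the generation network produces a high-quality image. The generation network of a GAN can therefore serve as the image generation network in the embodiments of the present application and generate an image from image features.
For example, the generation network of a GAN may include multiple convolutional layers. A mapping network converts an input vector, such as a w vector, into affine transformations and random noise that are fed to each convolutional layer of the generation network. The affine transformations control the style of the generated image and the random noise enriches its details: each convolutional layer adjusts the style according to its affine transformation and adjusts the details according to its random noise.
In some implementations, an original image may first be degraded to obtain the image to be optimized, reducing the dimensionality of the image features while retaining effective information. Specifically, obtaining the image to be optimized includes:
obtaining an original image;
performing image degradation processing on the original image to obtain the image to be optimized.
Image degradation processing is processing that lowers image quality, for example downsampling. Downsampling may include, but is not limited to, traversing the pixels in a for loop that skips alternate rows and columns, or copying every other row and column of the pixel matrix. Downsampling reduces the dimensionality of the image features while retaining effective information, which avoids overfitting and reduces the computation required for image optimization.
For example, the original image may be degraded to obtain a low-resolution image, that is, the image to be optimized $I_d = D(I)$, where $D(\cdot)$ is the image degradation process and $I$ is the original image; the method of the embodiments of the present application then generates the optimized, high-resolution image. Compared with the original image, the optimized image has more accurately filled details, colors closer to reality, and richer texture detail.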
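The row-and-column-skipping downsampling described above can be sketched as follows. This is a minimal illustration, assuming a grayscale image stored as a list of rows; the function name `downsample` and the `step` parameter are hypothetical and do not come from the application.

```python
def downsample(image, step=2):
    """Toy degradation D: keep every `step`-th row and every `step`-th column."""
    return [row[::step] for row in image[::step]]


# A 4x4 "original image" I with pixel values 0..15.
original = [[r * 4 + c for c in range(4)] for r in range(4)]

# The image to be optimized, I_d = D(I), is the 2x2 result.
degraded = downsample(original)
```

Any other degradation (blur, color removal, masking) could be substituted for `downsample`, as long as the same operator is applied to the generated image during the later loss computation.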
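An image feature here may be a feature of a random variable; the preset image features are independent of the image to be optimized. Under different conditions, chance factors may cause a variable to take various values, so that its value is uncertain and random, yet the probability that the value falls within a given range is fixed; such a variable is called a random variable. A random variable may be discrete or continuous.
The feature vector of a random variable is distributed in a feature space according to a statistical law; the feature vector is a point in that feature space, and the statistical law is determined by a probability distribution function. For example, features of a random variable may be obtained from a probability distribution function such as the 0-1 distribution, binomial distribution, Poisson distribution, geometric distribution, uniform distribution, exponential distribution, or Gaussian distribution, and used as the preset image features of the embodiments. For example, before the image to be optimized is optimized, a random number generator may generate feature vectors of the random variable, that is, the plurality of preset image features. The random number generator may follow a given statistical distribution, for example a Gaussian distribution.
As another example, the feature space may be a combination of n-dimensional features; the feature vector of a random variable is a point in the feature space, and the set of all feature vectors with different values forms the n-dimensional space. The feature vectors of the random variable may therefore also be transformed from one feature space into another, and the transformed features used as the preset image features.
In some implementations, original features obtained from a probability distribution function (feature vectors of a random variable) may be transformed into a preset space to improve their expressive power. Specifically, obtaining the plurality of preset image features includes:
sampling a plurality of original features according to the distribution type of a random variable;
mapping the plurality of original features into a preset feature space to obtain the plurality of preset image features.
The distribution characteristic is the way the random variable is distributed; the distribution type may be, for example, the 0-1 distribution, binomial distribution, Poisson distribution, geometric distribution, uniform distribution, exponential distribution, or Gaussian distribution. The feature vectors of the random variable in an initial feature space may be used as the original features, where the initial feature space is the feature space formed by the feature vectors of the random variable.
The preset feature space is a feature space set according to the actual application scenario, for example the W space.
Mapping the original features to the image features in effect transforms the initial feature space formed by the original features into the preset feature space formed by the preset image features. For example, a random number generator following a Gaussian distribution may generate a plurality of z vectors (the original features), which are then transformed from the Z space to the W space to obtain a plurality of w vectors (the preset image features).
Optionally, to control the style of the generated image, the preset feature space is the W space, a subset of the image feature space in which the relationships between vectors are more linear. For example, the Z space may be obtained by Gaussian sampling, so the Z space is a Gaussian-distributed space, and it may be transformed into the W space. During image generation, a w vector in the W space is passed on to the image generation network to obtain multiple control vectors whose different elements control different visual features, thereby controlling the style of the generated image. The style of an image generated from a z vector is usually fairly fixed, whereas adjusting the w vector changes the style of the generated image; for example, adjusting the w vector several times can change the image from style A to style B. Using w vectors as the preset image features therefore allows the style of the generated image to be changed gradually until it resembles that of the image to be optimized, improving the optimization effect.
Optionally, the mapping may be performed by a mapping network (Mapping) that includes multiple fully connected layers. After the original features are input into the mapping network and processed by the fully connected layers, the preset image features are obtained. For example, M z vectors $\{z_i\}_{i=1}^{M}$ may be sampled with $z_i \sim \mathcal{N}$, where $\mathcal{N}$ is a Gaussian distribution; these M z vectors form the Z space. Inputting them into the mapping network yields M w vectors, $W = \mathrm{Mapping}(Z)$, which form the W space, where $\mathrm{Mapping}(\cdot)$ denotes the processing of the mapping network. Specifically, this processing may be expressed as $\{w_i\}_{i=1}^{M} = \phi_{Mapping}(\{z_i\}_{i=1}^{M}, c)$, where $\phi_{Mapping}$ denotes the mapping network, $c$ is the specified class, and $\{w_i\}_{i=1}^{M}$ are the M w vectors.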
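The sampling and mapping steps above can be sketched as follows. This is only an illustration under stated assumptions: the z vectors are drawn from a standard Gaussian, the mapping network is a stack of untrained, randomly initialized fully connected layers with a leaky ReLU, and the class condition $c$ is omitted. The names `sample_z`, `make_mapping`, and `mapping` are hypothetical; a real mapping network would be a trained model.

```python
import random


def sample_z(m, dim, rng):
    """Draw m original features (z vectors) from a standard Gaussian (assumed)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]


def make_mapping(dim, num_layers, rng):
    """Create random weights for a stack of fully connected layers (untrained toy)."""
    scale = 1.0 / dim ** 0.5
    return [[[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]
            for _ in range(num_layers)]


def mapping(z, layers):
    """Map one z vector to a w vector through the fully connected layers."""
    h = z
    for weight in layers:
        pre = [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) for row in weight]
        h = [x if x > 0 else 0.2 * x for x in pre]  # leaky ReLU activation
    return h


rng = random.Random(0)
layers = make_mapping(dim=8, num_layers=2, rng=rng)
z_vectors = sample_z(m=16, dim=8, rng=rng)          # M = 16 original features
w_vectors = [mapping(z, layers) for z in z_vectors]  # M preset image features
```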
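120. Select a target feature from the plurality of preset image features, the target feature and the image to be optimized satisfying a preset similarity condition.
To obtain the input vector used to generate the optimized image, a latent feature corresponding to the image to be optimized (the target feature) may be found among the image features and used as the starting point of the vector search that determines the input vector. Because the target feature and the image to be optimized satisfy the preset similarity condition, the distance between the starting point of the inversion search and the target search result is shortened, which reduces search difficulty and improves search efficiency.
The preset similarity condition is a similarity condition set according to the actual application scenario. For example, it may require that the similarity to the image to be optimized exceed a similarity threshold, or that the similarity satisfy a preset ranking, such as being the highest.
For example, the target feature may be determined by checking whether each of the preset image features satisfies the preset similarity condition with the image to be optimized. Alternatively, a subset of the image features may first be selected from the preset image features by a method such as clustering, and only that subset checked against the condition.
In some implementations, the preset image features may be grouped by clustering and the target feature determined from the resulting center features, which reduces the number of features to be checked against the similarity condition and makes determining the target feature more efficient. Specifically, selecting the target feature from the plurality of preset image features includes:
clustering the plurality of preset image features to obtain a plurality of feature clusters, each feature cluster including a center feature;
selecting the target feature from the center features of the plurality of feature clusters.
Clustering is the process of dividing all the preset image features into multiple classes, each made up of similar preset image features. A feature cluster is one class obtained by clustering, and its center feature is the center of that class, that is, its centroid. Clustering methods may include the K-Means, DBSCAN, and BIRCH algorithms; the embodiments do not limit the clustering parameters, such as the cluster radius, which may be set according to the actual application scenario.
For example, the K-Means algorithm may be used to cluster the M w vectors $\{w_i\}_{i=1}^{M}$ into N feature clusters with N centroids $\{w^c_i\}_{i=1}^{N}$, and the centroid with the highest similarity to the image to be optimized is determined as the target feature.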
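The clustering step can be sketched with a small K-Means routine. This is a simplified illustration, not the application's implementation: it uses a deterministic initialization from evenly spaced input points (an assumption made so the example is reproducible) and a fixed number of iterations. The function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points, n_clusters, iterations=20):
    """Cluster `points` and return the n_clusters centroids (center features)."""
    # Simplified deterministic initialization: evenly spaced input points.
    centroids = [points[i * len(points) // n_clusters] for i in range(n_clusters)]
    for _ in range(iterations):
        clusters = [[] for _ in range(n_clusters)]
        for p in points:
            nearest = min(range(n_clusters),
                          key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
    return centroids


# Six toy w vectors forming two well-separated groups.
w_vectors = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
             [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
centroids = kmeans(w_vectors, n_clusters=2)
```

In practice a library implementation with a better initialization would normally be preferred; the point here is only that the N centroids, not the M sampled vectors, become the candidates for the target feature.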
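In some implementations, the target feature may be determined by comparing the similarity between the image corresponding to each center feature and the image to be optimized, which makes the determined target feature more accurate. Specifically, selecting the target feature from the center features of the plurality of feature clusters includes:
inputting the center features into the image generation network to generate center images;
determining a target image among the center images, the target image being a center image that satisfies the preset similarity condition with the image to be optimized;
determining the center feature corresponding to the target image as the target feature.
For example, the N center features $\{w^c_i\}_{i=1}^{N}$ may be input into the image generation network, which computes $I^c_i = \phi_{Synthesis}(w^c_i)$ and outputs N center images $\{I^c_i\}_{i=1}^{N}$. The image similarity between each center image and the image to be optimized $I_d$ is then calculated, and the center image with the highest similarity is determined as the target image. The center feature that generated the target image is the target feature.
In some implementations, the feature corresponding to the center image with the smallest feature distance to the image to be optimized may be determined as the target feature, so that the target feature lies close to the adjustment target (the target search result) and the distance between them is shortened. Specifically, determining the target image among the center images includes:
calculating the feature distance between each center image and the image to be optimized;
determining the center image with the shortest feature distance to the image to be optimized as the target image.
The feature distance may be the Euclidean distance, cosine distance, absolute-value distance, Chebyshev distance, Minkowski distance, or Mahalanobis distance.
For example, as shown in FIG. 1c, method 1 in the figure averages the M w vectors of the W space to obtain $w_{avg}$, takes $w_{avg}$ as the starting point of the search, iteratively updates the vector according to a loss function, and takes the vector obtained after S iterations, $w_{re}$, as the final result of the inversion search. If $w_t$ is the target search result, $w_{avg}$ lies some distance from $w_t$ in the space, so a search that starts from $w_{avg}$ is difficult.
By contrast, method 2 in FIG. 1c, the clustering approach of the embodiments, randomly samples multiple original feature vectors and clusters the sampled vectors to obtain four cluster centers (centroids) $w^c_1, w^c_2, w^c_3, w^c_4$. Of these four centroids, the one nearest to the target search result $w_t$ generates the image most similar to the image corresponding to the target search result, and the one farthest from $w_t$ generates the least similar image. Because the image to be optimized differs from the image corresponding to the target search result only in image quality, the images generated from the four centroids can each be compared with the image to be optimized to find the most similar one; the centroid $w^c_k$ that generated it is taken as the starting point of the inversion search, and it is the centroid with the shortest distance to $w_t$ in the space. The clustering approach therefore shortens the distance between the starting point and the target search result, reduces search difficulty, improves search efficiency, and makes adjusting the initial offset parameter more efficient.
Optionally, to make the determined target feature more accurate, calculating the feature distance between a center image and the image to be optimized includes:
extracting features from the center image and from the image to be optimized to obtain a first feature and a second feature, respectively;
calculating the feature distance between the first feature and the second feature.
For example, a feature extraction network may extract the first features from the center images and the second feature from the image to be optimized; the Euclidean or cosine distance between each first feature and the second feature is calculated, and the center image whose first feature is nearest to the second feature is determined as the target image. If, among the N center images generated from the N centroids, the k-th image $I^c_k$ has the smallest feature distance to the image to be optimized, the vector $w^c_k$ corresponding to that image is the target feature.
The feature extraction network is a neural network used to extract image features, for example one or more of a convolutional neural network (CNN), a feedforward neural network (BP), and a recurrent neural network (RNN).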
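Selecting the target image by feature distance can be sketched as below. The feature vectors are assumed to have already been produced by some feature extraction network; the numbers are made up for illustration and `select_target` is a hypothetical helper name.

```python
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def select_target(first_features, second_feature, distance=euclidean_distance):
    """Return the index k of the center image nearest to the image to be optimized."""
    return min(range(len(first_features)),
               key=lambda i: distance(first_features[i], second_feature))


# First features: one per center image. Second feature: the image to be optimized.
first_features = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
second_feature = [0.6, 0.8]

k_euclidean = select_target(first_features, second_feature)
k_cosine = select_target(first_features, second_feature, cosine_distance)
```

The centroid at index `k` would then be used as the target feature, that is, the starting point of the inversion search.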
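130. Input the target feature and an initial offset parameter into the image generation network, and adjust the initial offset parameter according to the difference determined between the output of the image generation network and the image to be optimized, to obtain a target offset parameter.
For example, a difference such as a similarity or a loss value may be calculated between the output generated from the target feature and the image to be optimized, and the initial offset parameter adjusted according to that difference to obtain the target offset parameter, so that the input vector of the optimized image generated with the target offset parameter moves toward the adjustment target (the target search result). For example, the initial offset parameter may be used for a first adjustment of the target feature to obtain the input vector of the image generation network; over multiple adjustments the input vector keeps learning the latent features (latent vector) of the image to be optimized, keeps changing, and moves toward the image to be optimized, making the optimized image more realistic.
An offset parameter is a parameter used to adjust a feature so as to reduce its difference from the adjustment target. The initial offset parameter is an offset parameter, set according to the application scenario or experience, for adjusting the target feature; for example, it may be set to 0.
Because the target feature is determined from the preset image features, it differs from the image to be optimized, and introducing the initial offset parameter reduces the effect of that difference on the generated optimized image. For example, an offset term $w_{off}$ may be introduced to adjust the target feature $w^c_k$ into $w = w^c_k + w_{off}$; the initial value of $w_{off}$ is the initial offset parameter. If $w_{off}$ is adjusted at least once, the value obtained by the last adjustment is taken as the target offset parameter.
In some implementations, the initial offset parameter may be adjusted by calculating the loss between the image to be optimized and a degraded second image generated from the target feature and the initial offset parameter, so that the input vector used to generate the optimized image moves toward the adjustment target. In addition, a constraint on the initial offset parameter is included when calculating the loss, to limit the range of the inversion search. Specifically, adjusting the initial offset parameter according to the image generation network, the target feature, and the image to be optimized to obtain the target offset parameter includes:
inputting the target feature and the initial offset parameter into the image generation network to generate a first image;
performing image degradation processing on the first image to obtain a second image;
calculating an offset loss value from the image to be optimized and the second image, subject to a constraint on the initial offset parameter;
adjusting the initial offset parameter according to the offset loss value to obtain the target offset parameter.
The constraint may be a hard constraint, such as an equality constraint or a direct truncation constraint (limiting the maximum and minimum values), or a soft constraint, such as an L1 or L2 constraint. For example, the loss between the image to be optimized and the second image may be calculated with a constrained loss function; adding the constraint to the loss function prevents overfitting, improves generalization, and avoids distortion of the optimized image.
The loss function may include, but is not limited to, one or a combination of loss functions such as a structural similarity index (SSIM) loss function, an LPIPS loss function, and a squared loss function. The constraint on the initial offset parameter is a condition that constrains the offset parameter.
For example, in the inversion search, the target feature and the initial offset parameter may first be used as the input vector of the image generation network, which outputs the first image. After the first image is degraded, a constrained loss function calculates the loss between the resulting second image and the image to be optimized, that is, the offset loss value. The initial offset parameter is then adjusted according to the offset loss value so that the input vector used to generate the optimized image moves toward the adjustment target and the second image moves toward the image to be optimized, until the loss function converges.
For example, as shown in FIG. 1c, method 1 places no restriction on the search, so the search easily ends in a local optimum, such as the result $w_{re}$ of method 1 in the figure: its texture is close to the target result $w_t$, but its color differs noticeably.
By contrast, in the inversion search that starts from the centroid $w^c_k$, the embodiments use regularization to limit the search range, so that the search result is close to the target result in both color and texture and the generated image balances quality against distortion, giving a high-quality image close to the input image.
Optionally, the initial offset parameter may be optimized iteratively using the offset loss value until the loss function converges, yielding the target offset parameter; the iterations move the input vector used to generate the optimized image toward the adjustment target and produce a more precise offset parameter. Specifically, adjusting the initial offset parameter according to the offset loss value to obtain the target offset parameter includes:
adjusting the initial offset parameter according to the offset loss value to obtain an intermediate offset parameter;
taking the intermediate offset parameter as the initial offset parameter and repeating the steps from inputting the target feature and the initial offset parameter into the image generation network to generate the first image, through adjusting the initial offset parameter according to the offset loss value to obtain an intermediate offset parameter, until the offset loss value converges, and determining the offset parameter obtained by the last adjustment as the target offset parameter.
For example, in the flow for adjusting the initial offset parameter shown in FIG. 1d, each iteration generates the first image from the target feature and the offset parameter obtained by the previous adjustment, degrades it to obtain the second image, calculates the offset loss value from the second image and the image to be optimized with the loss function, and adjusts the previous offset parameter according to that loss value. This continues until the loss function converges, and the offset parameter obtained by the last adjustment is the target offset parameter.
In some implementations, regularizing the initial offset parameter limits the range of the offset parameter, making the adjustment more efficient and accurate. Specifically, the constraint on the initial offset parameter includes an offset parameter constraint term, and calculating the offset loss value from the image to be optimized and the second image, subject to the constraint on the initial offset parameter, includes:
calculating a first loss term from the image to be optimized and the second image;
regularizing the initial offset parameter to obtain the offset parameter constraint term;
constraining the first loss term with the offset parameter constraint term to obtain the offset loss value.
Regularization is a method of adding a constraint to the parameter being optimized.
For example, the loss function for the offset loss value may be $L_{op} = L_{LPIPS}(I_d, D(I_{syn})) + \lambda_1 L_2(I_d, D(I_{syn})) + \lambda_2\, reg$, where $L_{LPIPS}(I_d, D(I_{syn})) + \lambda_1 L_2(I_d, D(I_{syn}))$ is the first loss term, $L_{LPIPS}$ is the LPIPS loss function, $L_2$ is the squared loss function, $\lambda_2\, reg$ is the offset parameter constraint term, $\lambda_1$ and $\lambda_2$ are hyperparameters, and $reg = \|w_{off}\|_2$ is the regularization of the offset parameter.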
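The iterative adjustment of the offset parameter under a regularized loss can be sketched on a toy problem. Everything here is an assumption made for illustration: the "generator" simply doubles each entry, the degradation keeps every other entry, the LPIPS term is omitted, the regularizer is the squared L2 norm (chosen so the gradient is simple), and plain gradient descent with a hand-derived gradient stands in for whatever optimizer is actually used.

```python
def generator(w):
    """Toy stand-in for the image generation network: scales each entry by 2."""
    return [2.0 * x for x in w]


def degrade(image):
    """Toy degradation D: keep every other entry."""
    return image[::2]


def offset_loss(w_center, w_off, target, lam):
    """Squared loss between D(G(w_center + w_off)) and the target, plus lam * reg."""
    w = [c + o for c, o in zip(w_center, w_off)]
    second_image = degrade(generator(w))
    data_term = sum((a - b) ** 2 for a, b in zip(second_image, target))
    reg = sum(o * o for o in w_off)  # squared L2 norm of the offset (assumption)
    return data_term + lam * reg


def offset_gradient(w_center, w_off, target, lam):
    """Analytic gradient of offset_loss with respect to w_off."""
    grad = [2.0 * lam * o for o in w_off]
    for j, t in enumerate(target):
        i = 2 * j  # index of the entry kept by degrade()
        residual = 2.0 * (w_center[i] + w_off[i]) - t
        grad[i] += 4.0 * residual  # d/do (2(c + o) - t)^2 = 2 * residual * 2
    return grad


def optimize_offset(w_center, target, lam=0.01, lr=0.05, steps=200):
    """Start from a zero offset and update it; w_center stays fixed."""
    w_off = [0.0] * len(w_center)
    for _ in range(steps):
        grad = offset_gradient(w_center, w_off, target, lam)
        w_off = [o - lr * g for o, g in zip(w_off, grad)]
    return w_off


w_center = [1.0, 1.0, 1.0, 1.0]   # target feature (fixed starting point)
target = [3.0, 1.0]               # degraded image to be optimized
w_off = optimize_offset(w_center, target)
loss_before = offset_loss(w_center, [0.0] * 4, target, 0.01)
loss_after = offset_loss(w_center, w_off, target, 0.01)
```

Entries that the degradation discards receive no data gradient, so the regularizer keeps their offsets at zero; this is one way to see how the constraint limits the search to the neighborhood of the starting centroid.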
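140. Input the target feature and the target offset parameter into the image generation network to generate the optimized image.
For example, the embodiments may take the target feature as the starting point of the inversion search and, guided by the image to be optimized, keep adjusting the offset parameter through the inversion search to obtain the target offset parameter. The input vector used to generate the optimized image is then obtained from the target feature and the target offset parameter, and the optimized image is generated from it. If the optimized image were generated only from features of the image to be optimized, control over the visual features of the image would be limited by those features. Basing the method on a target feature determined from preset image features instead reduces the correlation between features, improves control over the visual features of the image, and improves image quality.
For example, the generation network may generate the optimized image from the target feature $w^c_k$ and the target offset parameter $w_{off}$ as $I_{syn} = \phi_{Synthesis}(w^c_k + w_{off})$, where $\phi_{Synthesis}$ denotes the image generation network.
In some implementations, after the offset parameter has been adjusted it may be fixed while the image generation network is adjusted, optimizing the network and improving the quality of the generated optimized image. Specifically, after adjusting the initial offset parameter according to the offset loss value to obtain the target offset parameter, the method further includes:
adjusting the network parameters of the image generation network according to the target feature, the target offset parameter, and the image to be optimized, to obtain an adjusted image generation network.
Optionally, to optimize the image generation network further, after the adjusted image generation network is obtained it may be taken as the initial image generation network, and the step of adjusting the initial offset parameter according to the image generation network, the target feature, and the image to be optimized performed again. The step of adjusting the network parameters and the step of adjusting the initial offset parameter are thus performed alternately until a preset end condition is met.
The preset end condition may be set according to the application scenario. For example, it may be that the number of alternations reaches a threshold, or that the loss function used when adjusting the initial offset parameter and/or the loss function used when adjusting the network parameters converges to a loss threshold or equals zero.
Note that each alternation may adjust the initial offset parameter and/or the network parameters of the image generation network once or several times.
For example, the initial offset parameter may first be adjusted once according to the target feature and the image to be optimized to obtain a target offset parameter; the network parameters are then adjusted once according to the target feature, the target offset parameter, and the image to be optimized to obtain an adjusted image generation network. The target offset parameter is then taken as the initial offset parameter and the adjusted network as the image generation network, and the single adjustment of each is repeated, alternating in this way until the loss function converges.
As another example, in each alternation the initial offset parameter may be adjusted over multiple iterations until a preset number of iterations is reached or the loss function of the offset loss value converges to a first loss threshold, and the network parameters may be adjusted over multiple iterations until a preset number of iterations is reached or the loss function of the network loss value converges to a second loss threshold. The two adjustments alternate in this way until the number of alternations reaches a threshold or both loss functions converge to a third loss threshold.
In some implementations, after the offset parameter has been adjusted, the parameters of the image generation network may be adjusted by calculating the loss between the image to be optimized and a degraded fourth image generated from the target feature and the target offset parameter, continually optimizing the network. In addition, a constraint on the image generation network is included when calculating the loss, to limit the parameter range and avoid distortion of the optimized image caused by overfitting. After the target feature and the initial offset parameter are input into the image generation network and the initial offset parameter is adjusted according to the difference determined between the output of the image generation network and the image to be optimized to obtain the target offset parameter, the method includes:
inputting the target feature and the target offset parameter into the image generation network to generate a third image;
performing image degradation processing on the third image to obtain a fourth image;
calculating a network loss value from the image to be optimized and the fourth image, subject to a constraint on the image generation network;
adjusting the network parameters of the image generation network according to the network loss value to obtain an adjusted image generation network, the adjusted image generation network being used to generate the optimized image.
The constraint on the image generation network is a condition that constrains its network parameters.
For example, after the initial offset parameter has been adjusted to the target offset parameter, the offset parameter may be fixed and only the parameters of the image generation network optimized. The target feature and the target offset parameter are used as the input vector of the image generation network, which outputs the third image. After the third image is degraded, a constrained loss function calculates the loss between the resulting fourth image and the image to be optimized, that is, the network loss value. The network parameters are then adjusted according to the network loss value until the loss function converges.
Optionally, the network parameters may be adjusted iteratively using the network loss value until the loss function converges, yielding a better image generation network. Specifically, adjusting the network parameters of the image generation network according to the network loss value to obtain the adjusted image generation network includes:
adjusting the network parameters of the current image generation network according to the network loss value to obtain an intermediate image generation network;
taking the intermediate image generation network as the current image generation network and repeating the steps from inputting the target feature and the target offset parameter into the image generation network to generate the third image, through adjusting the network parameters of the current image generation network according to the network loss value, until the network loss value converges, and determining the intermediate image generation network obtained by the last adjustment as the adjusted image generation network.
The current image generation network is the image generation network whose parameters are currently being adjusted.
For example, in the flow for adjusting the network parameters shown in FIG. 1e, each iteration inputs the target feature and the target offset parameter into the current image generation network to generate the third image, degrades it to obtain the fourth image, calculates the network loss value from the fourth image and the image to be optimized with the loss function, and adjusts the network parameters of the current network accordingly. The next iteration takes the network adjusted in the previous iteration as the current network, and so on until the loss function converges; the network obtained by the last adjustment is the adjusted image generation network.
In some implementations, the difference between the initial image generation network and the current image generation network may be used to limit the range of the network parameters, making the adjustment more efficient and accurate. Specifically, the constraint on the current image generation network includes a network constraint term, and calculating the network loss value from the image to be optimized and the fourth image, subject to the constraint on the image generation network, includes:
calculating a second loss term from the image to be optimized and the fourth image;
calculating the network constraint term from the output of the initial image generation network and the output of the current image generation network;
constraining the second loss term with the network constraint term to obtain the network loss value.
The initial image generation network is the image generation network whose parameters have not yet been adjusted. For example, when the adjusted network is obtained over multiple iterations, the current image generation network in the first iteration is the initial image generation network.
For example, the loss function for the network loss value may be $L_{ft} = L_{LPIPS}(I_d, D(I_{syn})) + \lambda_{L2} L_2(I_d, D(I_{syn})) + \lambda_R L_R$, where $L_{LPIPS}(I_d, D(I_{syn})) + \lambda_{L2} L_2(I_d, D(I_{syn}))$ is the second loss term, $L_{LPIPS}$ is the LPIPS loss function, $\lambda_R L_R$ is the network constraint term, and $\lambda_{L2}$ and $\lambda_R$ are hyperparameters.
In some implementations, the network constraint term may be determined by comparing the images generated by the initial image generation network and the current image generation network. Specifically, calculating the network constraint term from the output of the initial image generation network and the output of the current image generation network includes:
inputting the target feature and the target offset parameter into the initial image generation network to generate an initial image, and inputting the target feature and the target offset parameter into the current image generation network to generate a current image;
calculating the network constraint term from the initial image and the current image.
For example, $L_R$ in the network constraint term $\lambda_R L_R$ is a local regularization term, which may be expressed as $L_R = L_{LPIPS}(x_r, x_r^*) + \lambda^R_{L2} L_{L2}(x_r, x_r^*)$, where $\lambda^R_{L2}$ is a hyperparameter, $x_r = \phi_{Synthesis}(w_r; \theta)$ is the image generated by the initial image generation network (the initial image), and $x_r^* = \phi_{Synthesis}(w_r; \theta^*)$ is the image generated by the current image generation network (the current image).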
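The image optimization scheme provided by the embodiments can be applied in various image optimization scenarios. Taking image restoration as an example: an image generation network, an image to be optimized, and a plurality of preset image features are obtained; a target feature that satisfies a preset similarity condition with the image to be optimized is selected from the plurality of preset image features; an initial offset parameter is adjusted according to the image generation network, the target feature, and the image to be optimized, to obtain a target offset parameter; and the target feature and the target offset parameter are input into the image generation network to generate an optimized image.
As can be seen from the above, the embodiments select from the preset image features the target feature corresponding to the image to be optimized and, starting from the target feature and combining it with the target offset parameter, determine the feature used to generate the optimized image. Basing the method on a target feature determined from preset image features reduces the correlation between features and improves control over the visual features of the image, which improves the optimization effect; adjusting the initial offset parameter moves the input vector used to generate the optimized image toward the adjustment target, which makes the optimized image more realistic. Moreover, because the target feature and the image to be optimized satisfy the preset similarity condition, the distance between the target feature and the feature of the optimized image is smaller, the initial offset parameter is easier to adjust, and image optimization is more efficient.
The method described in the above embodiments is explained in further detail below.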
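This embodiment describes the method in detail using image optimization with the StyleGAN-XL network as an example.
StyleGAN-XL is a generative adversarial network that can generate high-resolution images of many kinds. The embodiments use the StyleGAN-XL network as the image generation network. As shown in FIG. 2a, StyleGAN-XL may include a mapping network and a synthesis network (generation network): the mapping network transforms z vectors into w vectors, and the generation network generates images and is the image generation network of the embodiments.
The StyleGAN-XL network used in the embodiments is pre-trained on ImageNet, so the image generation network can generate images of a specified ImageNet class. ImageNet is a large visual database for research on visual object recognition software. The ImageNet dataset has 1024 classes, which means StyleGAN-XL can generate images of 1024 different classes.
As shown in FIG. 2b, an image optimization method proceeds as follows:
210. Perform image degradation processing on an original image to obtain the image to be optimized.
For example, the given input is a degraded image $I_d$ (the image to be optimized) derived from a high-definition image (the original image), that is, $I_d = D(I)$, where $D(\cdot)$ is the degradation process and $I$ is the high-definition image; $\phi_{Synthesis}$ denotes the generation network of StyleGAN-XL.
220. Cluster the plurality of preset image features to obtain a plurality of feature clusters, each feature cluster including a center feature.
For example, the goal of the image optimization method is to find a latent vector $w$ that satisfies $w = \arg\min_{w} L(D(\phi_{Synthesis}(w)), I_d)$, where $L(\cdot)$ denotes a distance metric in image space or feature space and $\arg\min$ selects the $w$ that minimizes $L(\cdot)$.
To find the latent vector $w$, an initial starting point for the search, that is, an initial centroid (the target feature), is found first. M w vectors of the W space (the preset image features) are obtained from the mapping network $\phi_{Mapping}$ of StyleGAN-XL: $\{w_i\}_{i=1}^{M} = \phi_{Mapping}(\{z_i\}_{i=1}^{M}, c)$, where $z_i \sim \mathcal{N}$, $\mathcal{N}$ is a Gaussian distribution, and $c$ is the specified class.
230. Select the target feature from the center features of the plurality of feature clusters.
For example, $\{w_i\}_{i=1}^{M}$ (the preset image features) may be clustered with the K-Means method to obtain N centroids $\{w^c_i\}_{i=1}^{N}$ (the center features). The N centroids are then input into the image generation network to obtain N center images: $\{I^c_i\}_{i=1}^{N} = \phi_{Synthesis}(\{w^c_i\}_{i=1}^{N})$.
For the given input image $I_d$, the image "nearest" to $I_d$ is found among these N images. For example, the distance between two images may be measured in a feature space: a Visual Geometry Group (VGG) network extracts features from the images, and the Euclidean or cosine distance between the extracted features identifies the image nearest to the input image. If the k-th of the N images, $I^c_k$, is the nearest, its vector $w^c_k$ is the latent vector to be optimized (the initial starting point of the search).
240. Input the target feature and an initial offset parameter into the image generation network, and adjust the initial offset parameter according to the difference determined between the output of the image generation network and the image to be optimized, to obtain a target offset parameter.
For example, the embodiments do not optimize the initial latent vector $w^c_k$ directly. Instead, this latent vector is fixed and an offset term $w_{off}$ is introduced and optimized; the initial value of the offset term is the initial offset parameter. The latent vector $w = w^c_k + w_{off}$ is obtained from $w^c_k$ and the offset term, input into the image generation network as the input vector, and trained iteratively to output the image $I_{syn} = \phi_{Synthesis}(w^c_k + w_{off})$.
During the training iterations, regularization may be applied to $w_{off}$: $reg = \|w_{off}\|_2$, so that the regularization constraint is reflected in the loss function of the iterative training.
250. Adjust the network parameters of the image generation network according to the target feature, the target offset parameter, and the image to be optimized, to obtain an adjusted image generation network.
For example, the iterative training may be divided into two stages, as in the process shown in FIG. 2c. The first stage fixes the network parameters $\theta$ of the image generation network $\phi_{Synthesis}$ and optimizes only $w_{off}$ (the offset term), which is step 240. The second stage fixes $w_{off}$ and optimizes only the network parameters $\theta$, which is step 250. The two stages alternate repeatedly during training, and training stops once the loss function has converged.
The loss function of the first stage is:
$L_{op} = L_{LPIPS}(I_d, D(I_{syn})) + \lambda_1 L_2(I_d, D(I_{syn})) + \lambda_2\, reg$;
where $L_{LPIPS}$ is the function that computes the LPIPS metric, $L_2$ is the squared loss function, and $\lambda_1$ and $\lambda_2$ are hyperparameters.
The loss function of the second stage is:
$L_{ft} = L_{LPIPS}(I_d, D(I_{syn})) + \lambda_{L2} L_2(I_d, D(I_{syn})) + \lambda_R L_R$
where $\lambda_{L2}$ and $\lambda_R$ are hyperparameters and $L_R$ is a local regularization term, expressed as $L_R = L_{LPIPS}(x_r, x_r^*) + \lambda^R_{L2} L_{L2}(x_r, x_r^*)$,
where $\lambda^R_{L2}$ is a hyperparameter, $x_r = \phi_{Synthesis}(w_r; \theta)$ is the image generated with the original network parameters (the initial image), $x_r^* = \phi_{Synthesis}(w_r; \theta^*)$ is the image generated with the current network parameters (the current image), $w_r$ is an interpolation code between a random latent vector and the key latent vector, and $L_{L2}$ is the mean squared error.
For the first stage, see the flow shown in FIG. 1d; for the second stage, see the flow shown in FIG. 1e, together with the corresponding descriptions in the preceding embodiments, which are not repeated here.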
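The alternation between the two stages can be illustrated on a scalar toy problem. This is a hedged sketch, not the StyleGAN-XL procedure: the "generator" is $\theta \cdot w$ with a single parameter $\theta$, the degradation is the identity, the LPIPS terms are dropped, the regularizers are squared, and the step sizes and iteration counts are arbitrary. All function names are hypothetical.

```python
def residual(theta, w_center, w_off, target):
    """Toy generator output theta * (w_center + w_off) minus the degraded target."""
    return theta * (w_center + w_off) - target


def stage_one(theta, w_center, w_off, target, lam2, lr, steps):
    """Stage 1: theta is fixed; only the offset term is updated."""
    for _ in range(steps):
        r = residual(theta, w_center, w_off, target)
        w_off -= lr * (2.0 * r * theta + 2.0 * lam2 * w_off)
    return w_off


def stage_two(theta, theta0, w_center, w_off, w_r, target, lam_r, lr, steps):
    """Stage 2: the offset is fixed; only theta is updated.

    The penalty compares the current and initial network outputs at w_r,
    a simplified stand-in for the local regularization term L_R.
    """
    w = w_center + w_off
    for _ in range(steps):
        r = residual(theta, w_center, w_off, target)
        drift = (theta - theta0) * w_r  # current output minus initial output
        theta -= lr * (2.0 * r * w + 2.0 * lam_r * drift * w_r)
    return theta


theta0, w_center, target, w_r = 2.0, 1.0, 3.0, 1.0
theta, w_off = theta0, 0.0
for _ in range(20):  # alternate the two stages
    w_off = stage_one(theta, w_center, w_off, target, 0.1, 0.05, 5)
    theta = stage_two(theta, theta0, w_center, w_off, w_r, target, 0.1, 0.05, 5)

initial_error = abs(residual(theta0, w_center, 0.0, target))
final_error = abs(residual(theta, w_center, w_off, target))
```

Because each stage lowers the same joint objective with the other variable held fixed, the alternation behaves like coordinate descent; the two regularizers keep both the latent vector and the network close to their starting values, so a small residual remains by design.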
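260. Input the target feature and the target offset parameter into the adjusted image generation network to generate the optimized image.
For example, after the loss functions of both stages have converged, the image $I_{syn}$ generated in the last iteration may be taken as the optimized image. In that last iteration, the image generation network into which the latent vector $w^c_k + w_{off}$ is input is the adjusted image generation network, and the value of the offset term in that latent vector is the target offset parameter.
For example, comparing the image optimization method of the embodiments with other optimization methods based on the StyleGAN-XL network, namely PULSE (a latent-space-based image super-resolution algorithm), DGP (based on an image prior probability distribution), and PTI (pivotal tuning inversion), gives the results shown in FIG. 2d and FIG. 2e, where GT denotes the high-quality reference image (the original image before degradation).
FIG. 2d shows the optimized images generated by the different methods. Each row corresponds to an input with a different kind of degradation, inverted with the different methods to obtain the optimized images output by the StyleGAN-XL network. In the first row, a block of information is removed from the middle of the image and inversion fills in the missing region; in the second row, the color information is removed and inversion restores the color; in the third row, the image is downsampled to low resolution and inversion generates the corresponding high-resolution image. FIG. 2d shows that, compared with the other methods, the method of the embodiments fills in details more accurately, produces colors closer to the ground truth (the reference image), and yields richer texture detail.
FIG. 2e compares the optimization methods across restoration tasks and metrics. It compares the method of the embodiments with the StyleGAN-XL-based DGP and PTI methods on three image restoration tasks: inpainting, colorization, and super-resolution (SR). On all three tasks, the method of the embodiments achieves the best LPIPS (perceptual image similarity), FID (image quality assessment), and NIQE (no-reference image quality evaluation) scores.
As can be seen from the above, images obtained by inversion with existing optimization methods differ considerably from the actual target result (the reference image), and the inversion results are often poor, particularly when the input is a degraded image. For example, the DGP method performs inversion on BigGAN (a large-scale generative adversarial network), which can only generate images at 256×256 resolution, and DGP performs poorly on other generation networks. The embodiments instead use the generation network of StyleGAN-XL as the image generation network; it can generate high-resolution images of many kinds, and inverting it allows the corresponding input vector to be recovered for an arbitrary image and a corresponding high-quality, high-resolution image to be generated. Thus, given an image or a degraded image (one with noise, missing color, missing detail, low resolution, or similar defects), the embodiments can find the corresponding input vector in the latent space such that feeding it into the generation network generates a similar, high-quality image (the optimized image).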
为了更好地实施以上方法,本申请实施例还提供一种图像优化装置,该图像优化装置具体可以集成在电子设备中,该电子设备可以为终端、服务器等设备。其中,终端可以为手机、平板电脑、智能蓝牙设备、笔记本电脑、个人电脑等设备;服务器可以是单一服务器,也可以是由多个服务器组成的服务器集群。
比如,在本实施例中,将以图像优化装置具体集成在服务器为例,对本申请实施例的方法进行详细说明。
例如,如图3所示,该图像优化装置可以包括获取单元310、确定单元320、调整单元330以及生成单元340,如下:
(一)获取单元310
用于获取图像生成网络、待优化图像以及多个预设的图像特征。
在一些实施方式中,获取单元310具有可以用于:
根据随机变量的分布特征类型,采样得到多个原始特征;
将多个原始特征映射到预设的特征空间中,得到多个预设的图像特征。
在一些实施方式中,获取单元310具有可以用于:
获取原始图像;
对原始图像进行图像劣化处理,得到待优化图像。
(二)确定单元320
用于从多个预设的图像特征中,选取目标特征,目标特征与待优化图像满足预设相似度条件。
在一些实施方式中,确定单元320具体可以用于:
对多个预设的图像特征进行聚类处理,得到多个特征簇,特征簇包括中心特征;
从多个特征簇的中心特征中,选取目标特征。
在一些实施方式中,从多个特征簇的中心特征中,选取目标特征,包括:
将中心特征输入图像生成网络,生成中心图像;
从中心图像中,确定目标图像,目标图像为与待优化图像满足预设相似度的中心图像;
将与目标图像对应的中心特征,确定为目标特征。
在一些实施方式中,从中心图像中,确定目标图像,包括:
计算中心图像与待优化图像之间的特征距离;
将与待优化图像之间的特征距离最短的中心图像,确定为目标图像。
(三)调整单元330
用于将所述目标特征以及所述初始偏移参数输入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数。
在一些实施方式中,调整单元330具体可以用于:
将目标特征以及初始偏移参数输入图像生成网络,生成第一图像;
对第一图像进行图像劣化处理,得到第二图像;
基于对初始偏移参数的约束条件,对待优化图像以及第二图像进行计算,得到偏移损失值;
根据偏移损失值,调整初始偏移参数,得到目标偏移参数。
在一些实施方式中,对初始偏移参数的约束条件包括偏移参数约束项,基于对初始偏移参数的约束条件,对待优化图像以及第二图像进行计算,得到偏移损失值,包括:
对待优化图像以及第二图像进行计算,得到第一损失项;
对初始偏移参数进行正则化处理,得到偏移参数约束项;
通过偏移参数约束项约束第一损失项,得到偏移损失值。
在一些实施方式中,调整单元330还可以用于:
将目标特征以及目标偏移参数输入图像生成网络,生成第三图像;
对第三图像进行图像劣化处理,得到第四图像;
基于对图像生成网络的约束条件,对待优化图像以及第四图像进行计算,得到网络损失值;
根据网络损失值,调整图像生成网络的网络参数,得到调整后的图像生成网络,调整后的图像生成网络用于生成优化后的图像。
在一些实施方式中,对当前图像生成网络的约束条件包括网络约束项,基于对图像生成网络的约束条件,对待优化图像以及第四图像进行计算,得到网络损失值,包括:
对待优化图像以及第四图像进行计算,得到第二损失项;
对初始图像生成网络的输出结果以及当前图像生成网络的输出结果进行计算,得到网 络约束项;
通过网络约束项约束第二损失项,得到网络损失值。
在一些实施方式中,对初始图像生成网络的输出结果以及当前图像生成网络的输出结果进行计算,得到网络约束项,包括:
将目标特征以及目标偏移参数输入初始图像生成网络,生成初始图像,并将目标特征以及目标偏移参数输入当前图像生成网络,生成当前图像;
对初始图像以及当前图像进行计算,得到网络约束项。
(四)生成单元340
用于将目标特征以及目标偏移参数输入图像生成网络,生成优化后的图像。
具体实施时,以上各个单元可以作为独立的实体来实现,也可以进行任意组合,作为同一或若干个实体来实现,以上各个单元的具体实施可参见前面的方法实施例,在此不再赘述。
由此,本申请实施例可以从多个预设的图像特征中选取对应待优化图像的目标特征,并通过调整得到目标偏移参数,可以由目标特征结合目标偏移参数,生成优化后的图像,以提升图像的优化效果。
本申请实施例还提供一种电子设备,该电子设备可以为终端、服务器等设备。其中,终端可以为手机、平板电脑、智能蓝牙设备、笔记本电脑、个人电脑,等等;服务器可以是单一服务器,也可以是由多个服务器组成的服务器集群,等等。
在一些实施例中,该图像优化装置还可以集成在多个电子设备中,比如,图像优化装置可以集成在多个服务器中,由多个服务器来实现本申请的图像优化方法。
在本实施例中,将以本实施例的电子设备是服务器为例进行详细描述,比如,如图4所示,其示出了本申请实施例所涉及的服务器的结构示意图,具体来讲:
该服务器可以包括一个或者一个以上处理核心的处理器410、一个或一个以上计算机可读存储介质的存储器420、电源430、输入模块440以及通信模块450等部件。本领域技术人员可以理解,图4中示出的服务器结构并不构成对服务器的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。其中:
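The processor 410 is the control center of the server. It connects the parts of the server through various interfaces and lines, and performs the functions of the server and processes data by running or executing the software programs and/or modules stored in the memory 420 and calling the data stored in the memory 420. In some embodiments, the processor 410 may include one or more processing cores; in some embodiments, the processor 410 may integrate an application processor, which mainly handles the operating system, user interface, and application programs, and a modem processor, which mainly handles wireless communication. The modem processor need not be integrated into the processor 410.
The memory 420 may be used to store software programs and modules; the processor 410 executes various functional applications and data processing by running the software programs and modules stored in the memory 420. The memory 420 may mainly include a program storage area and a data storage area: the program storage area may store the operating system and the application programs required for at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created through use of the server. In addition, the memory 420 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 420 may further include a memory controller to provide the processor 410 with access to the memory 420.
The server further includes a power supply 430 that powers the components. In some embodiments, the power supply 430 may be logically connected to the processor 410 through a power management system, which manages charging, discharging, and power consumption. The power supply 430 may also include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.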
该服务器还可包括输入模块440,该输入模块440可用于接收输入的数字或字符信息,以及产生与用户设置以及功能控制有关的键盘、鼠标、操作杆、光学或者轨迹球信号输入。
该服务器还可包括通信模块450,在一些实施例中通信模块450可以包括无线模块,服务器可以通过该通信模块450的无线模块进行短距离无线传输,从而为用户提供了无线的宽带互联网访问。比如,该通信模块450可以用于帮助用户收发电子邮件、浏览网页和访问流式媒体等。
尽管未示出,服务器还可以包括显示单元等,在此不再赘述。具体在本实施例中,服务器中的处理器410会按照如下的指令,将一个或一个以上的应用程序的进程对应的可执行文件加载到存储器420中,并由处理器410来运行存储在存储器420中的应用程序,从而实现各种功能,如下:
获取图像生成网络、待优化图像以及多个预设的图像特征;从多个预设的图像特征中,选取目标特征,目标特征与待优化图像满足预设相似度条件;将所述目标特征以及所述初始偏移参数输入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数;将目标特征以及目标偏移参数输入图像生成网络,生成优化后的图像。
以上各个操作的具体实施可参见前面的实施例,在此不再赘述。
由上可知,本申请实施例可以从多个图像特征中选取对应待优化图像的目标特征,并通过调整得到目标偏移参数,可以由目标特征结合目标偏移参数,生成优化后的图像,以提升图像的优化效果。
本领域普通技术人员可以理解,上述实施例的各种方法中的全部或部分步骤可以通过指令来完成,或通过指令控制相关的硬件来完成,该指令可以存储于一计算机可读存储介质中,并由处理器进行加载和执行。
为此,本申请实施例提供一种计算机可读存储介质,其中存储有多条指令,该指令能够被处理器进行加载,以执行本申请实施例所提供的任一种图像优化方法中的步骤。例如,该指令可以执行如下步骤:
获取图像生成网络、待优化图像以及多个预设的图像特征;从多个预设的图像特征中,选取目标特征,目标特征与待优化图像满足预设相似度条件;将所述目标特征以及所述初始偏移参数输入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数;将目标特征以及目标偏移参数输入图像生成网络,生成优化后的图像。
其中,该存储介质可以包括:只读存储器(ROM,Read Only Memory)、随机存取记 忆体(RAM,Random Access Memory)、磁盘或光盘等。
根据本申请的一个方面,提供了一种计算机程序产品,该计算机程序产品包括计算机程序,该计算机程序存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机程序,处理器执行该计算机程序,使得该计算机设备执行上述实施例中提供各种可选实现方式中提供的方法。
由于该存储介质中所存储的计算机程序,可以执行本申请实施例所提供的任一种图像优化方法中的步骤,因此,可以实现本申请实施例所提供的任一种图像优化方法所能实现的有益效果,详见前面的实施例,在此不再赘述。
以上对本申请实施例所提供的一种图像优化方法、装置、电子设备、介质和程序产品进行了详细介绍,本文中应用了具体个例对本申请的原理及实施方式进行了阐述,以上实施例的说明只是用于帮助理解本申请的方法及其核心思想;同时,对于本领域的技术人员,依据本申请的思想,在具体实施方式及应用范围上均会有改变之处,综上,本说明书内容不应理解为对本申请的限制。

Claims (15)

  1. 一种图像优化方法,所述方法由电子设备执行,所述方法包括:
    获取图像生成网络、待优化图像以及多个预设的图像特征;
    从所述多个预设的图像特征中,选取目标特征,所述目标特征与所述待优化图像满足预设相似度条件;
    将所述目标特征以及所述初始偏移参数输入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数;
    将所述目标特征以及所述目标偏移参数输入所述图像生成网络,生成优化后的图像。
  2. 如权利要求1所述的图像优化方法,所述从所述多个预设的图像特征中,选取目标特征,包括:
    对所述多个预设的图像特征进行聚类处理,得到多个特征簇,所述特征簇包括中心特征;
    从所述多个特征簇的所述中心特征中,选取目标特征。
  3. 如权利要求2所述的图像优化方法,所述从所述多个特征簇的所述中心特征中,选取目标特征,包括:
    将所述中心特征输入所述图像生成网络,生成中心图像;
    从所述中心图像中,确定目标图像,所述目标图像为与所述待优化图像满足所述预设相似度的所述中心图像;
    将与所述目标图像对应的所述中心特征,确定为目标特征。
  4. 如权利要求3所述的图像优化方法,所述从所述中心图像中,确定目标图像,包括:
    计算所述中心图像与所述待优化图像之间的特征距离;
    将与所述待优化图像之间的特征距离最短的所述中心图像,确定为目标图像。
  5. 如权利要求1-4任意一项所述的图像优化方法,将所述目标特征以及所述初始偏移参数输入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数,包括:
    将所述目标特征以及所述初始偏移参数输入所述图像生成网络,生成第一图像;
    对所述第一图像进行图像劣化处理,得到第二图像;
    基于对所述初始偏移参数的约束条件,对所述待优化图像以及所述第二图像进行计算,得到偏移损失值;
    根据所述偏移损失值,调整所述初始偏移参数,得到目标偏移参数。
  6. 如权利要求5所述的图像优化方法,所述对所述初始偏移参数的约束条件包括偏移参数约束项,所述基于对所述初始偏移参数的约束条件,对所述待优化图像以及所述第二图像进行计算,得到偏移损失值,包括:
    对所述待优化图像以及所述第二图像进行计算,得到第一损失项;
    对所述初始偏移参数进行正则化处理,得到偏移参数约束项;
    通过所述偏移参数约束项约束所述第一损失项,得到偏移损失值。
  7. 如权利要求5所述的图像优化方法,所述将所述目标特征以及所述初始偏移参数输 入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数之后,还包括:
    将所述目标特征以及所述目标偏移参数输入所述图像生成网络,生成第三图像;
    对所述第三图像进行图像劣化处理,得到第四图像;
    基于对所述图像生成网络的约束条件,对所述待优化图像以及所述第四图像进行计算,得到网络损失值;
    根据所述网络损失值,调整所述图像生成网络的网络参数,得到调整后的图像生成网络,所述调整后的图像生成网络用于生成所述优化后的图像。
  8. 如权利要求7所述的图像优化方法,所述对所述图像生成网络的约束条件包括网络约束项,所述基于对所述图像生成网络的约束条件,对所述待优化图像以及所述第四图像进行计算,得到网络损失值,包括:
    对所述待优化图像以及所述第四图像进行计算,得到第二损失项;
    对初始图像生成网络的输出结果以及当前图像生成网络的输出结果进行计算,得到网络约束项;
    通过所述网络约束项约束所述第二损失项,得到网络损失值。
  9. 如权利要求8所述的图像优化方法,所述对初始图像生成网络的输出结果以及当前图像生成网络的输出结果进行计算,得到网络约束项,包括:
    将所述目标特征以及所述目标偏移参数输入所述初始图像生成网络,生成初始图像,并将所述目标特征以及所述目标偏移参数输入所述当前图像生成网络,生成当前图像;
    对所述初始图像以及所述当前图像进行计算,得到网络约束项。
  10. 如权利要求1-9任意一项所述的图像优化方法,所述多个预设的图像特征通过如下方式获取:
    根据随机变量的分布特征类型,采样得到多个原始特征;
    将所述多个原始特征映射到预设的特征空间中,得到多个预设的图像特征。
  11. 如权利要求1-10任一项所述的图像优化方法,所述待优化图像通过如下方式获取:
    获取原始图像;
    对所述原始图像进行图像劣化处理,得到待优化图像。
  12. 一种图像优化装置,包括:
    获取单元,用于获取图像生成网络、待优化图像以及多个预设的图像特征;
    确定单元,用于从所述多个预设的图像特征中,选取目标特征,所述目标特征与所述待优化图像满足预设相似度条件;
    调整单元,用于将所述目标特征以及所述初始偏移参数输入所述图像生成网络,根据所述图像生成网络的输出和所述待优化图像确定的差异,调整初始偏移参数,得到目标偏移参数;
    生成单元,用于将所述目标特征以及所述目标偏移参数输入所述图像生成网络,生成优化后的图像。
  13. 一种电子设备,包括处理器和存储器,所述存储器存储有多条指令;所述处理器 从所述存储器中加载指令,以执行如权利要求1~11任一项所述的图像优化方法中的步骤。
  14. 一种计算机可读存储介质,所述计算机可读存储介质存储有多条指令,所述指令适于处理器进行加载,以执行权利要求1~11任一项所述的图像优化方法中的步骤。
  15. 一种计算机程序产品,包括计算机程序,所述计算机程序被处理器执行时实现权利要求1~11任一项所述的图像优化方法中的步骤。
PCT/CN2023/120931 2022-10-13 2023-09-25 图像优化方法、装置、电子设备、介质和程序产品 WO2024078308A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP23861677.5A EP4386657A1 (en) 2022-10-13 2023-09-25 Image optimization method and apparatus, electronic device, medium, and program product
US18/421,016 US20240161245A1 (en) 2022-10-13 2024-01-24 Image optimization

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211252059.0A CN117036180A (zh) 2022-10-13 2022-10-13 图像优化方法、装置、电子设备、介质和程序产品
CN202211252059.0 2022-10-13

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/421,016 Continuation US20240161245A1 (en) 2022-10-13 2024-01-24 Image optimization

Publications (1)

Publication Number Publication Date
WO2024078308A1 true WO2024078308A1 (zh) 2024-04-18

Family

ID=88637798

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/120931 WO2024078308A1 (zh) 2022-10-13 2023-09-25 图像优化方法、装置、电子设备、介质和程序产品

Country Status (4)

Country Link
US (1) US20240161245A1 (zh)
EP (1) EP4386657A1 (zh)
CN (1) CN117036180A (zh)
WO (1) WO2024078308A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111488865A (zh) * 2020-06-28 2020-08-04 腾讯科技(深圳)有限公司 图像优化方法、装置、计算机存储介质以及电子设备
US20210365710A1 (en) * 2019-02-19 2021-11-25 Boe Technology Group Co., Ltd. Image processing method, apparatus, equipment, and storage medium
CN115131218A (zh) * 2021-03-25 2022-09-30 腾讯科技(深圳)有限公司 图像处理方法、装置、计算机可读介质及电子设备

Also Published As

Publication number Publication date
US20240161245A1 (en) 2024-05-16
CN117036180A (zh) 2023-11-10
EP4386657A1 (en) 2024-06-19

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2023861677

Country of ref document: EP

Effective date: 20240314