CN116777732A - Image generation method, device, equipment and storage medium based on random noise - Google Patents

Publication number
CN116777732A
Authority
CN
China
Prior art keywords
image
noise
sample
superposition
random
Prior art date
Legal status
Pending
Application number
CN202310082757.9A
Other languages
Chinese (zh)
Inventor
邱才明
冯湛搏
朱椿
Current Assignee
Huagong Future Communication Jiangsu Co ltd
Original Assignee
Huagong Future Communication Jiangsu Co ltd
Priority date
Filing date
Publication date
Application filed by Huagong Future Communication Jiangsu Co ltd
Priority to CN202310082757.9A
Publication of CN116777732A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an image generation method, device, equipment and storage medium based on random noise. When generating an image, the neural network model only needs to learn the correlation matrices of the noise, so training is more stable and less prone to mode collapse and convergence difficulty, which avoids the problem that a generative adversarial network (GAN) becomes impractical because its networks are difficult to converge. Meanwhile, whereas a traditional diffusion model learns only a single variance of the noise, the invention uses pixel correlation matrices as output parameters. This separable Gaussian process, which accounts for the correlation between horizontal pixels and between vertical pixels, largely preserves the model's ability to learn the correlation between image pixels, so that clearer and more realistic images can be generated and the prior art's neglect of inter-pixel correlation is remedied. Compared with the prior art, the invention is therefore more practical and produces more realistic images.

Description

Image generation method, device, equipment and storage medium based on random noise
Technical Field
The invention belongs to the technical field of image generation based on artificial intelligence, and particularly relates to an image generation method, device and equipment based on random noise and a storage medium.
Background
In recent years, with the development of deep learning and artificial intelligence, many generation technologies have emerged, for example image, video and speech generation based on generative adversarial networks (GANs), variational autoencoders (VAEs) and the like. All of these obtain content from randomly sampled noise; that is, by controlling the distribution of the random samples, content of a corresponding type is generated (for example, images of different kinds, such as humans, cats, dogs or other animals). In addition, by combining natural language processing and coupling language features with the noise, image content corresponding to input text can be generated; for example, inputting "an astronaut in space" yields a corresponding picture. Image generation techniques based on artificial intelligence algorithms have therefore been widely used in different fields.
At present, there are two main image generation methods: the generative adversarial network and the diffusion model. A generative adversarial network trains two independent networks (a generator network and a discriminator network) to generate an image: the generator transforms random noise into a generated image, the discriminator judges whether the generated image is real, and the two are trained adversarially until, at equilibrium, the generator can produce very realistic pictures. However, because adversarial training is used, it is difficult to make both networks converge, and in many cases the generated image is not usable.
A diffusion model gradually degrades a photo into a purely random image through a forward process (noise addition), and then restores the random noise image to the original photo through a reverse process (noise removal). In each forward step, Gaussian noise with a determined mean and variance is superimposed, and superposition continues until the photo becomes a purely random image. The reverse process learns the mean and variance of the Gaussian noise superimposed at each step and repeatedly subtracts that noise until a real image is obtained. However, because the mean and variance of the generated Gaussian noise are fixed values, this method cannot learn the correlation between adjacent pixels of the image, which limits the authenticity of the generated image.
In view of the foregoing, there is an urgent need for an image generation method that is practical and can improve the authenticity of the generated image.
Disclosure of Invention
The invention aims to provide an image generation method, device, equipment and storage medium based on random noise, to solve the problems in the prior art of poor practicality and low authenticity of the generated image.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
In a first aspect, there is provided an image generation method based on random noise, including:
acquiring a target neural network model, wherein the target neural network model is obtained by training with noise images of a large number of sample images as input and the noise parameters of each sample image as output, the noise parameters of any sample image among the large number of sample images comprise a noise matrix, a first correlation matrix and a second correlation matrix corresponding to that sample image, the first correlation matrix represents the correlation between horizontally adjacent pixels in the sample image, and the second correlation matrix represents the correlation between vertically adjacent pixels in the sample image;
acquiring a random noise image, and inputting the random noise image into the target neural network model to obtain the noise parameters of the random noise image;
and generating a target image by using the target neural network model, the random noise image and the noise parameters of the random noise image, wherein the image content of the target image is of the same type as the image content of any sample image.
Based on the above disclosure, the invention first uses a neural network model to learn the mapping between the noise image of an image and its noise parameters, where the noise parameters comprise a noise matrix, a first correlation matrix and a second correlation matrix, the first correlation matrix representing the correlation between horizontally adjacent pixels in the sample image and the second correlation matrix representing the correlation between vertically adjacent pixels in the sample image. In this way, the functional relationship between the image noise on the one hand and the noise matrix and pixel correlation on the other is learned. Then, once training is complete, in actual use the noise parameters of a random noise image are obtained simply by inputting the acquired random noise image into the trained network. Finally, the random noise image is repeatedly restored using its noise parameters and the trained network, yielding a target image of the same type as the sample images.
With this design, the neural network model only needs to learn the correlation matrices of the noise when generating an image, so training is more stable and less prone to mode collapse and convergence difficulty, which solves the problem that a traditional generative adversarial network is impractical because the model is difficult to converge. Meanwhile, whereas a traditional diffusion model learns only a single variance of the noise, the invention uses pixel correlation matrices as output parameters. This separable Gaussian process, which accounts for the correlation between horizontal pixels and between vertical pixels, largely preserves the model's ability to learn the correlation between image pixels, so that clearer and more realistic images can be generated and the prior art's neglect of inter-pixel correlation is remedied. Compared with the prior art, the invention is therefore more practical and produces more realistic images, and is suitable for wide application in the technical field of image generation.
In one possible design, the noise image of any sample image is obtained by superimposing noise on that sample image multiple times, and accordingly, generating a target image by using the target neural network model, the random noise image and the noise parameters of the random noise image includes:
performing the t-th denoising processing on the random noise image based on the noise parameters of the random noise image to obtain the t-th preprocessed noise image, where the initial value of t is 1;
inputting the t-th preprocessed noise image into the target neural network model to obtain the noise parameters of the t-th preprocessed noise image;
updating the random noise image to the t-th preprocessed noise image, updating the noise parameters of the random noise image to the noise parameters of the t-th preprocessed noise image, incrementing t by 1, and again performing the t-th denoising processing on the random noise image based on its noise parameters, until t equals T, at which point the T-th preprocessed noise image is taken as the target image, where T is the total number of times noise is superimposed on any sample image.
Based on the above disclosure, the invention provides a specific generation process for the target image. The random noise image is first denoised using its noise parameters to obtain a preprocessed noise image. The preprocessed noise image is then input into the target neural network to obtain its noise parameters, the preprocessed noise image is treated as the new random noise image, and it is denoised again using those noise parameters. This repeats until the number of denoising passes reaches the total number of noise superpositions, at which point the target image is obtained. In other words, the target neural network predicts the denoising parameters in each iteration, the random noise image is repeatedly denoised with the predicted parameters, and the target image is obtained once the iterations are complete.
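The iteration described above can be sketched as a short loop. This is only an illustrative sketch: `generate_image`, `predict_params` and `denoise_step` are hypothetical names, the trained network and the denoising update of formula (1) are passed in as callables rather than assumed, and the toy stand-ins at the bottom exist purely to make the loop runnable.

```python
import numpy as np
from typing import Callable, Tuple

# (omega, sigma_s, sigma_n): noise matrix and the two correlation matrices
NoiseParams = Tuple[np.ndarray, np.ndarray, np.ndarray]

def generate_image(
    x_noise: np.ndarray,
    predict_params: Callable[[np.ndarray], NoiseParams],
    denoise_step: Callable[[np.ndarray, NoiseParams, int], np.ndarray],
    total_steps: int,
) -> np.ndarray:
    """Alternate denoising and parameter prediction for t = 1..T."""
    x = x_noise
    params = predict_params(x)           # noise parameters of the random noise image
    for t in range(1, total_steps + 1):
        x = denoise_step(x, params, t)   # t-th preprocessed noise image
        if t < total_steps:
            params = predict_params(x)   # parameters used in the next pass
    return x                             # T-th preprocessed noise image

# Toy stand-ins (assumptions, NOT the trained network or formula (1)):
def toy_predict(x: np.ndarray) -> NoiseParams:
    return x, np.eye(x.shape[0]), np.eye(x.shape[1])

def toy_denoise(x: np.ndarray, params: NoiseParams, t: int) -> np.ndarray:
    omega, sigma_s, sigma_n = params
    return x - 0.5 * (sigma_s @ omega @ sigma_n)

out = generate_image(np.ones((2, 2)), toy_predict, toy_denoise, total_steps=3)
```

With the toy stand-ins each pass halves the image, so the loop structure is easy to check by hand; a real use would substitute the trained model and the actual update rule.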
In one possible design, performing the t-th denoising processing on the random noise image based on the noise parameters of the random noise image to obtain the t-th preprocessed noise image includes:
performing, based on the noise parameters of the random noise image, the t-th denoising processing on the random noise image according to the following formula (1) to obtain the t-th preprocessed noise image;
In the above formula (1), $x_t$ denotes the t-th preprocessed noise image, $x'_t$ denotes the random noise image, $\beta_t$ denotes the step weight of any sample image at the t-th noise superposition, $\Sigma_s$ denotes the first correlation matrix in the noise parameters corresponding to the random noise image, $\Sigma_n$ denotes the second correlation matrix in the noise parameters corresponding to the random noise image, and $\omega$ denotes the noise matrix in the noise parameters corresponding to the random noise image.
In one possible design, the loss function of the target neural network model is:
$L = \lVert \varepsilon_T - \Sigma'_s \, \omega' \, \Sigma'_n \rVert \qquad (2)$
In the above formula (2), $L$ denotes the loss function, $\Sigma'_s$ denotes the first correlation matrix in the noise parameters corresponding to any sample image, $\Sigma'_n$ denotes the second correlation matrix in the noise parameters corresponding to that sample image, $\omega'$ denotes the noise matrix in the noise parameters corresponding to that sample image, and $\varepsilon_T$ denotes the actual image random noise of that sample image, which is derived from the noise image of the sample image.
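As a minimal sketch of formula (2), the loss can be computed as the norm of the residual between the actual noise and the product of the three predicted matrices. The function name `separable_noise_loss` is hypothetical, and the choice of the Frobenius norm is an assumption, since the text does not state which norm is meant.

```python
import numpy as np

def separable_noise_loss(eps_true, sigma_s, omega, sigma_n):
    """Norm of eps_true - sigma_s @ omega @ sigma_n, as in formula (2)."""
    residual = eps_true - sigma_s @ omega @ sigma_n
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm (assumed here)
    return float(np.linalg.norm(residual))

# A perfect prediction gives zero loss; an all-zero prediction gives ||eps||.
eye = np.eye(2)
omega = np.array([[1.0, 2.0], [3.0, 4.0]])
loss_exact = separable_noise_loss(eye @ omega @ eye, eye, omega, eye)
loss_zero = separable_noise_loss(np.ones((2, 2)), eye, np.zeros((2, 2)), eye)
```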
Based on the above disclosure, the present invention decomposes the generation of Gaussian noise into a combination of two correlation matrices and one noise matrix. The advantage of this structure is that the covariance matrix of the noise is no longer a simple diagonal matrix, so complex noise correlation characteristics can be modeled. Compared with conventional noise generation, which uses only a diagonal covariance matrix, the complexity of the noise correlations that can be represented increases greatly without a significant increase in computational complexity, so the model retains the ability to learn the correlation between image pixels and the realism of the generated image is further improved.
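The claim that the covariance is no longer diagonal can be illustrated numerically. The sketch below assumes the noise matrix has i.i.d. standard normal entries and uses hypothetical tridiagonal correlation factors; it relies on the standard identity vec(A W B) = (Bᵀ ⊗ A) vec(W), so the covariance of the column-stacked noise is a Kronecker-structured matrix built from two small factors instead of one large dense matrix.

```python
import numpy as np

def separable_noise(sigma_s, omega, sigma_n):
    """Correlated noise of the form sigma_s @ omega @ sigma_n."""
    return sigma_s @ omega @ sigma_n

def implied_covariance(sigma_s, sigma_n):
    """Covariance of the column-stacked noise when omega is i.i.d. N(0, 1)."""
    # vec(A W B) = (B^T kron A) vec(W), hence Cov = K @ K^T with K = B^T kron A
    k = np.kron(sigma_n.T, sigma_s)
    return k @ k.T

rng = np.random.default_rng(0)
h, w = 3, 4
# Hypothetical factors: identity plus a small coupling between neighbours
sigma_s = np.eye(h) + 0.3 * np.eye(h, k=1) + 0.3 * np.eye(h, k=-1)
sigma_n = np.eye(w) + 0.2 * np.eye(w, k=1) + 0.2 * np.eye(w, k=-1)
omega = rng.standard_normal((h, w))

eps = separable_noise(sigma_s, omega, sigma_n)
cov = implied_covariance(sigma_s, sigma_n)
off_diagonal = cov - np.diag(np.diag(cov))
```

Here the two factors hold 3×3 + 4×4 = 25 numbers, while the implied 12×12 covariance has non-zero off-diagonal entries, which is the sense in which richer correlations are obtained at modest cost.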
In one possible design, before acquiring the target neural network model, the method includes:
acquiring a large number of sample images, and superimposing noise multiple times on each of the sample images to obtain a noise image of each sample image, wherein the total number of noise superpositions for any sample image is determined according to the complexity of the target image;
and training a neural network model with the noise image of each sample image as input and the noise parameters of each sample image as output, so as to obtain the target neural network model once training is complete.
Based on the above disclosure, the total number of noise superpositions applied to a sample image is determined by the complexity of the image to be generated; that is, the number of iterations used to produce the model's input data is set by the complexity of the target image, and the same number is used as the number of denoising passes when the random noise image is later denoised. The invention can therefore adjust the number of noise superpositions and denoising passes according to the complexity of the image to be generated, so that the realism of the generated image is still ensured for complex images, and the realism and size of the generated image are not limited by the complexity of the model.
In one possible design, superimposing noise multiple times on each of the large number of sample images to obtain a noise image of each sample image includes:
for any sample image, acquiring the step weight for the t-th noise superposition, and performing the t-th noise superposition processing on the sample image based on that step weight to obtain the t-th noise superposition image;
and updating the sample image to the t-th noise superposition image, incrementing t by 1, and repeating the t-th noise superposition processing on the sample image based on the step weight for the t-th noise superposition until t equals T, at which point the T-th noise superposition image is taken as the noise image of the sample image, where T denotes the total number of noise superpositions.
In one possible design, performing the t-th noise superposition processing on any sample image based on the step weight for the t-th noise superposition to obtain the t-th noise superposition image includes:
performing, according to the step weight for the t-th noise superposition, the t-th noise superposition processing on the sample image using the following formula (3) to obtain the t-th noise superposition image;
In the above formula (3), $x'_t$ denotes the t-th noise superposition image, $x_{t-1}$ denotes any sample image, $\beta_t$ denotes the step weight of the sample image at the t-th noise superposition, $\varepsilon_t$ denotes the image random noise of the sample image at the t-th noise superposition, and $t = 1, 2, \ldots, T$.
In a second aspect, there is provided an image generation apparatus based on random noise, comprising:
a model acquisition unit, configured to acquire a target neural network model, wherein the target neural network model is obtained by training with noise images of a large number of sample images as input and the noise parameters of each sample image as output, the noise parameters of any sample image among the large number of sample images comprise a noise matrix, a first correlation matrix and a second correlation matrix corresponding to that sample image, the first correlation matrix represents the correlation between horizontally adjacent pixels in the sample image, and the second correlation matrix represents the correlation between vertically adjacent pixels in the sample image;
a noise parameter generation unit, configured to acquire a random noise image and input the random noise image into the target neural network model to obtain the noise parameters of the random noise image;
and an image generation unit, configured to generate a target image by using the target neural network model, the random noise image and the noise parameters of the random noise image, wherein the image content of the target image is of the same type as the image content of any sample image.
In a third aspect, another image generation apparatus based on random noise is provided. Taking the apparatus as an electronic device as an example, the apparatus includes a memory, a processor and a transceiver that are communicatively connected in sequence, where the memory is configured to store a computer program, the transceiver is configured to send and receive messages, and the processor is configured to read the computer program and execute the random-noise-based image generation method of the first aspect or any possible design of the first aspect.
In a fourth aspect, a storage medium is provided, on which instructions are stored which, when run on a computer, perform the random noise based image generation method as in the first aspect or any one of the possible designs of the first aspect.
In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the random noise based image generation method as in the first aspect or any one of the possible designs of the first aspect.
Beneficial effects:
(1) When generating an image, the neural network model only needs to learn the correlation matrices of the noise, so training is more stable and less prone to mode collapse and convergence difficulty, which solves the problem that a traditional generative adversarial network is impractical because the neural network is difficult to converge. Meanwhile, whereas a traditional diffusion model learns only a single variance of the noise, the invention uses pixel correlation matrices as output parameters. This separable Gaussian process, which accounts for the correlation between horizontal pixels and between vertical pixels, largely preserves the model's ability to learn the correlation between image pixels, so that clearer and more realistic images can be generated and the prior art's neglect of inter-pixel correlation is remedied. Compared with the prior art, the invention is therefore more practical and produces more realistic images, and is suitable for wide application in the technical field of image generation.
(2) The invention decomposes the generation of Gaussian noise into a combination of two correlation matrices and one noise matrix. The advantage of this structure is that the covariance matrix of the noise is no longer a simple diagonal matrix, so complex noise correlation characteristics can be modeled. Compared with conventional noise generation, which uses only a diagonal covariance matrix, the complexity of the noise correlations that can be represented increases greatly without a significant increase in computational complexity, so the model retains the ability to learn the correlation between image pixels and the realism of the generated image is improved.
(3) The invention can adjust the number of noise superpositions and denoising passes according to the complexity of the image to be generated, so that the realism of the generated image is still ensured for complex images, and the realism and size of the generated image are not limited by the complexity of the model, which further improves the realism of the generated image.
Drawings
Fig. 1 is a schematic flow chart of the steps of an image generation method based on random noise according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an image generation device based on random noise according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the present invention is briefly described below with reference to the accompanying drawings and the description of the embodiments or the prior art. Obviously, the drawings described below show only some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort. It should be noted that the description of these examples is intended to aid understanding of the present invention and not to limit it.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments of the present application.
It should be understood that the term "and/or", where it appears herein, merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean: A alone, B alone, or both A and B. The term "/and", where it appears herein, describes another association between objects and indicates that two relationships are possible; for example, "A/and B" may mean: A alone, or both A and B. In addition, the character "/", where it appears herein, generally indicates an "or" relationship between the objects before and after it.
Examples:
Referring to Fig. 1, in the random-noise-based image generation method provided in this embodiment, the neural network model only needs to learn the correlation matrices of the noise when generating an image, so training is more stable and less prone to mode collapse and convergence difficulty. Meanwhile, the neural network model learns the correlation matrices between image pixels, these pixel correlation matrices are applied in the subsequent image generation process, and both the total number of noise superpositions during training and the number of denoising passes during generation can be adjusted according to the complexity of the image to be generated. Compared with the prior art, the method therefore addresses both the effect of pixel correlation on image authenticity and the effect of model complexity on authenticity, by adding pixel correlation matrices and adjusting the number of iterations, so that clearer and more realistic images can be generated and the method is better suited to large-scale application in the field of image generation. In this embodiment, the method may be executed on, for example but not limited to, an image generation end, which may be, for example but not limited to, a personal computer (PC), a tablet computer or a smartphone. It is to be understood that the foregoing execution subject does not limit the embodiments of the present application. Accordingly, the operation steps of the method may be, but are not limited to, those shown in steps S1 to S3 below.
S1. Acquiring a target neural network model, wherein the target neural network model is obtained by training with noise images of a large number of sample images as input and the noise parameters of each sample image as output, the noise parameters of any sample image among the large number of sample images comprise a noise matrix, a first correlation matrix and a second correlation matrix corresponding to that sample image, the first correlation matrix represents the correlation between horizontally adjacent pixels in the sample image, and the second correlation matrix represents the correlation between vertically adjacent pixels in the sample image. In a specific application, this embodiment uses a neural network model to learn the mapping between the noise of an image and the corresponding noise matrix and pixel correlation matrices, so that pixel correlation is brought into the image generation process in actual use and the authenticity of the generated image is improved. Optionally, one possible training process for the target neural network model may be, but is not limited to, steps S01 and S02 below.
S01. Acquiring a large number of sample images, and superimposing noise multiple times on each of the sample images to obtain a noise image of each sample image, wherein the total number of noise superpositions for any sample image is determined according to the complexity of the target image. In this embodiment, the type of the sample images is determined by the type of the image to be generated: if the image to be generated (i.e. the target image) is a landscape image, the collected sample images all belong to the landscape type, and if the image to be generated is a puppy image, the collected sample images belong to the puppy type; other types are handled likewise and are not described in detail. Optionally, the number of sample images is at least 10,000; it is understood that the number may be set according to actual use and is not limited to this example. In addition, in this embodiment the total number of noise superpositions is determined according to the complexity of the target image (i.e. the image to be generated); it may be, but is not limited to being, preset at the image generation end by a person according to the image complexity, and is typically between 100 and 200.
Furthermore, before noise is superimposed multiple times on each sample image, this embodiment also provides a data preprocessing step, for example processing all sample images to the same size by scaling, averaging and the like, and adjusting each pixel of each sample image to lie within the range (-1, 1). This unifies the format of the sample images.
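A minimal sketch of such preprocessing, assuming 8-bit input images and using nearest-neighbour sampling as a stand-in for whatever scaling method is actually used; the function names `resize_nearest` and `to_unit_range` are hypothetical. Note that this particular mapping sends 0 to -1 and 255 to 1, so the endpoints are included.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Resize an (H, W) or (H, W, C) array by nearest-neighbour sampling."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows][:, cols]

def to_unit_range(img_uint8):
    """Map 8-bit pixel values 0..255 linearly onto -1..1."""
    return img_uint8.astype(np.float32) / 127.5 - 1.0

# Example: a 4x6 test image reduced to 2x3, then rescaled
img = np.arange(24, dtype=np.uint8).reshape(4, 6)
small = resize_nearest(img, 2, 3)
scaled = to_unit_range(np.array([0, 255], dtype=np.uint8))
```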
After the formats of all the sample images are unified, noise superposition processing can be performed on each sample image. In this embodiment, since the noise superposition process is the same for every sample image, it is explained below taking any one sample image as an example, and may be, but is not limited to, steps S01a and S01b below.
S01a. For any sample image, acquiring the step weight for the t-th noise superposition, and performing the t-th noise superposition processing on the sample image based on that step weight to obtain the t-th noise superposition image. In a specific application, the total number of noise superpositions is known, so the step weight for each noise superposition can be preset at the image generation end, and in actual use the corresponding step weight is selected according to the index of the noise superposition. Further, the t-th noise superposition processing may be performed on the sample image by, but not limited to, the following formula (3).
In the above formula (3), $x'_t$ denotes the t-th noise superposition image, $x_{t-1}$ denotes any sample image, $\beta_t$ denotes the step weight of the sample image at the t-th noise superposition, $\varepsilon_t$ denotes the image random noise of the sample image at the t-th noise superposition, and $t = 1, 2, \ldots, T$. In this embodiment, $\varepsilon_t$ is obtained from the mean and variance of the image at the t-th noise superposition: at the first noise superposition it is obtained from the mean and variance of the sample image, and at the second it is obtained from the mean and variance of the first noise superposition image. $\varepsilon_t$ therefore changes continuously as noise is superimposed, and when superposition is complete the final image random noise is obtained.
After the t-th noise superposition is completed, the next noise superposition can be performed based on the t-th noise superposition image, and the cycle is continuously performed until the superposition times reach the preset total noise superposition times, wherein the cycle process is as shown in the following step S01b.
S01b, incrementing t by 1, updating the arbitrary sample image to the t-th noise superposition image, and again performing the t-th noise superposition processing on the arbitrary sample image based on the step weight at the t-th noise superposition, until t is equal to T; when t is equal to T, the T-th noise superposition image is taken as the noise image of the arbitrary sample image, where T represents the total number of noise superpositions.
The following describes the foregoing steps S01a and S01b as an example, namely: in the first noise superposition, step weight and image random noise in the first noise superposition are utilized, and a first noise superposition image is obtained based on the formula (3); then, when the second noise superposition is carried out, utilizing the step weight and the random noise of the image during the second noise superposition, carrying out noise superposition processing on the first noise superposition image based on the formula (3) to obtain a second noise superposition image, and thus, continuously carrying out noise superposition on the image obtained by previous superposition until the superposition times reach the total noise superposition times, and obtaining the noise image of any sample; of course, the noise superposition process of the rest of each sample image is the same as that of any of the foregoing sample images, and will not be described herein.
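The loop formed by steps S01a and S01b can be sketched as follows. Formula (3) is not reproduced in this text, so the sketch assumes the update x_t = sqrt(1 - β_t) · x_{t-1} + sqrt(β_t) · ε_t that is common in diffusion models, a linear schedule for the step weights, and standard Gaussian noise for ε_t; all three are assumptions, as are the function names. The embodiment itself derives ε_t from the mean and variance of the current image, which this sketch does not model.

```python
import math
import random


def linear_step_weights(total_steps, beta_start=1e-4, beta_end=0.02):
    """Preset one step weight beta_t per superposition (assumed linear schedule)."""
    if total_steps == 1:
        return [beta_start]
    span = beta_end - beta_start
    return [beta_start + span * i / (total_steps - 1) for i in range(total_steps)]


def superimpose_once(image, beta_t, rng):
    """One assumed superposition: sqrt(1 - beta_t) * x_prev + sqrt(beta_t) * eps_t."""
    keep, mix = math.sqrt(1.0 - beta_t), math.sqrt(beta_t)
    noise = [[rng.gauss(0.0, 1.0) for _ in row] for row in image]
    noisy = [[keep * p + mix * e for p, e in zip(row, n_row)]
             for row, n_row in zip(image, noise)]
    return noisy, noise


def superimpose_noise(image, betas, seed=0):
    """Steps S01a/S01b: apply each superposition in turn, feeding every result forward."""
    rng = random.Random(seed)
    current, last_noise = image, None
    for beta_t in betas:  # t = 1, 2, ..., T
        current, last_noise = superimpose_once(current, beta_t, rng)
    # The T-th superposition image and the noise used in the last superposition.
    return current, last_noise
```

The second return value corresponds to the image random noise of the last superposition, which the training step uses as the actual image random noise of the sample image.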
After obtaining the noise image of each sample image, the training set may be composed by using the noise images, and the neural network model may be trained according to the training set, as shown in step S02 below.
S02, training a neural network model by taking a noise image of each sample image as input and taking a noise parameter of each sample image as output, so as to obtain the target neural network model after training is completed; in this embodiment, the exemplary neural network model may be, but is not limited to, a U-Net network, and the loss function employed is:
L = ||ε_T - Σ'_s ω' Σ'_n||    (2)
In the above formula (2), L represents the loss function, Σ'_s represents the first correlation matrix in the noise parameters corresponding to the arbitrary sample image, Σ'_n represents the second correlation matrix in the noise parameters corresponding to the arbitrary sample image, ω' represents the noise matrix in the noise parameters corresponding to the arbitrary sample image, and ε_T represents the actual image random noise of the arbitrary sample image, which is derived from the noise image of the arbitrary sample image.
When the method is specifically applied, each time a noise image of a sample image is input into a neural network model, the neural network model outputs noise parameters corresponding to the sample image, and meanwhile, the noise image of the input sample image is obtained through multiple times of noise superposition, so that the random noise of the image corresponding to the last time of noise superposition is used as the random noise of the actual image of the input sample image; then, using the above formula (2), the loss function value at this time of training can be calculated; based on the principle, a loss function can be obtained when each noise image is input, and finally, whether the model converges can be judged through the loss function value.
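As an illustration of how the loss of formula (2) could be evaluated for one training sample, the sketch below multiplies the three predicted matrices and measures the distance to the actual noise. The text does not state which norm is meant, so the Frobenius norm is assumed here; the function names are likewise hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def noise_loss(eps_T, sigma_s, omega, sigma_n):
    """Frobenius norm of eps_T - sigma_s @ omega @ sigma_n (norm choice is an assumption)."""
    predicted = matmul(matmul(sigma_s, omega), sigma_n)
    squared = sum((e - p) ** 2
                  for e_row, p_row in zip(eps_T, predicted)
                  for e, p in zip(e_row, p_row))
    return math.sqrt(squared)
```

For an image with H rows and W columns, `sigma_s` is H×H, `omega` is H×W, and `sigma_n` is W×W, so the product has the same shape as `eps_T`.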
Through the detailed explanation of the model training process, the generation of Gaussian noise is decomposed into the combination form of two correlation matrixes and one noise matrix, so that complex noise correlation characteristics can be modeled, and compared with the generation of noise by using a diagonal matrix as a covariance matrix in the past, the correlation complexity of noise is greatly increased under the condition that the calculation complexity is not obviously improved, so that the model can keep the capability of learning the correlation among image pixels, and the reality of image generation is improved when the model is used.
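The decomposition can be made concrete with a small sketch. It assumes that the correlated noise is formed as the product Σ_s ω Σ_n with ω drawn as independent standard Gaussian entries, which matches the form used in formula (2); the function names are hypothetical. The second function only counts matrix entries, to show why two factor matrices are far cheaper to handle than a dense covariance over all pixels.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sample_correlated_noise(sigma_s, sigma_n, seed=0):
    """Draw an i.i.d. Gaussian matrix omega and return (sigma_s @ omega @ sigma_n, omega)."""
    rng = random.Random(seed)
    height, width = len(sigma_s), len(sigma_n)
    omega = [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]
    return matmul(matmul(sigma_s, omega), sigma_n), omega


def parameter_counts(height, width):
    """Entries of a dense pixel covariance versus entries of the two factor matrices."""
    pixels = height * width
    return {"dense_covariance": pixels * pixels,
            "two_factors": height * height + width * width}
```

With identity matrices for both factors the result is just ω, that is, uncorrelated noise; off-diagonal entries in either factor mix neighbouring rows or columns of ω and thereby introduce correlation between pixels. For a 64×64 image, `parameter_counts(64, 64)` compares 16,777,216 dense covariance entries with 8,192 entries for the two factors.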
After the training of the model is completed, the noise parameter of random noise can be predicted by utilizing the trained model, and then a target image is generated through the predicted noise parameter; the generation process of the target image is as follows in step S2 and step S3.
S2, acquiring a random noise image, and inputting the random noise image into the target neural network model to obtain noise parameters of the random noise image; when the method is applied specifically, the noise parameters of the random noise image comprise a noise matrix, a first correlation matrix and a second correlation matrix of the random noise image; then, the three matrices may be used to continuously perform noise removal on the random noise image, so as to restore the target image, as shown in step S3 below.
S3, generating a target image by utilizing the target neural network model, the random noise image and noise parameters of the random noise image, wherein the image content of the target image and the image content of any sample image are of the same type; in specific application, step S3 is equivalent to denoising the random noise image continuously by using the noise parameter, namely, denoising every time to obtain a new noise image, and then inputting the new noise image into the target neural network to obtain a new noise parameter; then, denoising the new noise image by using the new noise parameters to obtain a brand new noise image, and repeating the process until the denoising times are equal to the total times of noise superposition; the image obtained by the last denoising is the target image; alternatively, the specific procedure of the denoising process may be, but is not limited to, as shown in steps S31 to S33 described below.
S31, carrying out t-th denoising treatment on the random noise image based on the noise parameter of the random noise image to obtain a t-th preprocessing noise image, wherein the initial value of t is 1; in a specific application, the method may, but is not limited to, perform the t-th denoising process on the random noise image according to the following formula (1) to obtain a t-th preprocessed noise image.
In the above formula (1), x_t represents the t-th preprocessed noise image, x_t' represents the random noise image, β_t represents the step weight of the arbitrary sample image at the t-th noise superposition, Σ_s represents the first correlation matrix in the noise parameters corresponding to the random noise image, Σ_n represents the second correlation matrix in the noise parameters corresponding to the random noise image, and ω represents the noise matrix in the noise parameters corresponding to the random noise image.
After the t-th preprocessing noise image is calculated based on the formula (1), the t-th preprocessing noise image can be input into the target neural network model to obtain the corresponding noise parameters, then the t-th preprocessing noise image is taken as a random noise image, the noise parameters of the t-th preprocessing noise image are taken as the noise parameters of the random noise image, and the step S31 is repeatedly executed, so that the second denoising process can be completed, and the principle is repeated continuously, so that the multiple denoising processes can be completed, wherein the cyclic process is as shown in the following steps S32 and S33.
S32, inputting the t-th preprocessed noise image into the target neural network model to obtain the noise parameters of the t-th preprocessed noise image.
S33, updating the random noise image to the t-th preprocessed noise image, updating the noise parameters of the random noise image to the noise parameters of the t-th preprocessed noise image, incrementing t by 1, and again performing the t-th denoising processing on the random noise image based on the noise parameters of the random noise image, until t is equal to T; when t is equal to T, the T-th preprocessed noise image is taken as the target image, where T is the total number of noise superpositions of the arbitrary sample image.
The foregoing steps S31 to S33 are described below as an example; firstly, carrying out first denoising treatment on a random noise image by utilizing the formula (1) to obtain a first preprocessing noise image; then, inputting the first preprocessing noise image into a target neural network to obtain noise parameters of the first preprocessing noise image; at this time, updating the random noise image to the first preprocessing noise image, updating the noise parameter of the random noise image to the noise parameter of the first preprocessing noise image, and then performing a second denoising process on the new random noise image (i.e., the first preprocessing noise image) with the new noise parameter (i.e., the noise parameter of the first preprocessing noise image) to obtain a second preprocessing noise image; similarly, the random noise image and the noise parameters are updated (i.e. the random noise image is the second preprocessing noise image, and the corresponding noise parameters are updated to the noise parameters of the second preprocessing noise image), and the updated random noise image is subjected to the third denoising processing by utilizing the updated noise parameters, so that the principle is continuously circulated until the denoising times are equal to the total times of noise superposition, and the target image can be obtained when the denoising times are equal to the total times of noise superposition.
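The alternation between prediction and denoising in steps S2 and S31 to S33 can be sketched as a loop. Formula (1) is not reproduced in this text, so `denoise_once` below is only an assumed stand-in that subtracts sqrt(β_t) · Σ_s ω Σ_n and rescales by sqrt(1 - β_t); `predict_noise_params` is a hypothetical callable standing for the trained target neural network model, and the order in which the step weights are consumed is left to the caller.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def denoise_once(image, params, beta_t):
    """Assumed stand-in for formula (1): remove sqrt(beta_t) * sigma_s @ omega @ sigma_n."""
    sigma_s, omega, sigma_n = params
    noise = matmul(matmul(sigma_s, omega), sigma_n)
    keep, mix = math.sqrt(1.0 - beta_t), math.sqrt(beta_t)
    return [[(p - mix * e) / keep for p, e in zip(row, n_row)]
            for row, n_row in zip(image, noise)]


def generate_image(random_noise_image, predict_noise_params, betas):
    """Steps S2 and S31-S33: alternate parameter prediction and denoising T times."""
    current = random_noise_image
    params = predict_noise_params(current)                # step S2
    for t, beta_t in enumerate(betas, start=1):           # t = 1, ..., T
        current = denoise_once(current, params, beta_t)   # step S31
        if t < len(betas):
            params = predict_noise_params(current)        # step S32; S33 updates image and params
    return current                                        # the T-th preprocessed noise image
```

In this sketch the model is queried once per denoising pass, so generating one image with T passes costs T forward evaluations of the network.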
In this embodiment, since the number of denoising times is equal to the total number of times of noise superposition of the input noise image during model training, the total number of times of noise superposition is determined according to the complexity of generating the image as required; therefore, the method can also adjust the iteration times according to the complexity of the generated image when the image is generated, so that the process can still ensure the reality of the generated image when facing the complex image, and the reality and the size of the generated image are not limited by the complexity of the model.
Therefore, with the image generation method based on random noise described in detail in steps S1 to S3, the neural network model only needs to learn the correlation matrices of the noise when generating an image, so that training of the neural network is more stable and the defects of mode collapse and difficult convergence are less likely to occur. Meanwhile, the neural network model is used to learn the correlation matrices among the pixels of an image, these pixel correlation matrices are applied in the subsequent image generation process, and the total number of noise superpositions during model training and the number of denoising passes during image generation can both be adjusted according to the complexity of the image to be generated.
As shown in fig. 2, a second aspect of the present embodiment provides a hardware device for implementing the random noise-based image generating method according to the first aspect of the present embodiment, including:
the model acquisition unit is used for acquiring a target neural network model, wherein the target neural network model is obtained by taking noise images of a large number of sample images as input and noise parameters of each sample image as output, the noise parameters of any sample image in the large number of sample images comprise a noise matrix corresponding to the any sample image, a first correlation matrix and a second correlation matrix, the first correlation matrix is used for representing correlation between transverse adjacent pixels in the any sample image, and the second correlation matrix is used for representing correlation between longitudinal adjacent pixels in the any sample image.
The noise parameter generation unit is used for acquiring a random noise image, inputting the random noise image into the target neural network model and obtaining the noise parameter of the random noise image.
And the image generation unit is used for generating a target image by utilizing the target neural network model, the random noise image and the noise parameters of the random noise image, wherein the image content of the target image and the image content of any sample image are of the same type.
The working process, working details and technical effects of the device provided in this embodiment may refer to the first aspect of the embodiment, and are not described herein again.
As shown in fig. 3, a third aspect of the present embodiment provides another image generating apparatus based on random noise, taking the apparatus as an electronic device as an example, including: the device comprises a memory, a processor and a transceiver which are connected in sequence in communication, wherein the memory is used for storing a computer program, the transceiver is used for receiving and transmitting messages, and the processor is used for reading the computer program and executing the image generation method based on random noise according to the first aspect of the embodiment.
By way of specific example, the memory may include, but is not limited to, random access memory (RAM), read-only memory (ROM), flash memory, first-in-first-out memory (FIFO), and/or first-in-last-out memory (FILO). The processor may include one or more processing cores, such as a 4-core or 8-core processor. The processor may be implemented in at least one of the hardware forms DSP (Digital Signal Processor), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array), and may also include a main processor and a coprocessor, where the main processor, also called the CPU (Central Processing Unit), processes data in the awake state, and the coprocessor is a low-power processor that processes data in the standby state.
In some embodiments, the processor may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be shown on the display screen. For example, the processor may be, but is not limited to, a microprocessor of the STM32F105 family, a reduced instruction set computer (RISC) microprocessor, a processor of X86 or another architecture, or a processor integrating an embedded neural-network processing unit (NPU). The transceiver may be, but is not limited to, a wireless fidelity (WiFi) wireless transceiver, a Bluetooth wireless transceiver, a General Packet Radio Service (GPRS) wireless transceiver, a ZigBee (a low-power local area network protocol based on the IEEE 802.15.4 standard) wireless transceiver, a 3G transceiver, a 4G transceiver, and/or a 5G transceiver. In addition, the device may include, but is not limited to, a power module, a display screen, and other necessary components.
The working process, working details and technical effects of the electronic device provided in this embodiment may refer to the first aspect of the embodiment, and are not described herein again.
A fourth aspect of the present embodiment provides a storage medium storing instructions containing the random noise-based image generation method according to the first aspect of the present embodiment, i.e. the storage medium has instructions stored thereon, which when executed on a computer, perform the random noise-based image generation method according to the first aspect.
The storage medium refers to a carrier for storing data, and may include, but is not limited to, a floppy disk, an optical disk, a hard disk, flash memory, a USB flash drive, and/or a memory stick, where the computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable device.
The working process, working details and technical effects of the storage medium provided in this embodiment may refer to the first aspect of the embodiment, and are not described herein again.
A fifth aspect of the present embodiment provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the random noise based image generation method of the first aspect of the embodiment, wherein the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus.
Finally, it should be noted that: the foregoing description is only of the preferred embodiments of the invention and is not intended to limit the scope of the invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An image generation method based on random noise, comprising:
The method comprises the steps of obtaining a target neural network model, wherein the target neural network model is obtained by taking noise images of massive sample images as input and noise parameters of each sample image as output and training, the noise parameters of any sample image in the massive sample images comprise a noise matrix corresponding to any sample image, a first correlation matrix and a second correlation matrix, the first correlation matrix is used for representing correlation between transverse adjacent pixels in any sample image, and the second correlation matrix is used for representing correlation between longitudinal adjacent pixels in any sample image;
acquiring a random noise image, and inputting the random noise image into the target neural network model to obtain noise parameters of the random noise image;
and generating a target image by using the target neural network model, the random noise image and noise parameters of the random noise image, wherein the image content of the target image and the image content of any sample image are of the same type.
2. The method according to claim 1, wherein the noise image of the arbitrary sample image is obtained by performing noise superposition on the arbitrary sample image multiple times, and the generating the target image by using the target neural network model, the random noise image, and the noise parameters of the random noise image includes:
Based on the noise parameters of the random noise image, carrying out t-th denoising processing on the random noise image to obtain a t-th preprocessing noise image, wherein the initial value of t is 1;
inputting the t-th preprocessed noise image into the target neural network model to obtain noise parameters of the t-th preprocessed noise image;
updating the random noise image to the t-th preprocessed noise image, updating the noise parameters of the random noise image to the noise parameters of the t-th preprocessed noise image, incrementing t by 1, and again performing the t-th denoising processing on the random noise image based on the noise parameters of the random noise image, until t is equal to T, wherein when t is equal to T, the T-th preprocessed noise image is taken as the target image, and T is the total number of noise superpositions of the arbitrary sample image.
3. The method of claim 2, wherein the performing a t-th denoising process on the random noise image based on the noise parameter of the random noise image to obtain a t-th preprocessed noise image comprises:
based on the noise parameters of the random noise image, carrying out t-th denoising processing on the random noise image according to the following formula (1) to obtain a t-th preprocessing noise image;
In the above formula (1), x_t represents the t-th preprocessed noise image, x_t' represents the random noise image, β_t represents the step weight of the arbitrary sample image at the t-th noise superposition, Σ_s represents the first correlation matrix in the noise parameters corresponding to the random noise image, Σ_n represents the second correlation matrix in the noise parameters corresponding to the random noise image, and ω represents the noise matrix in the noise parameters corresponding to the random noise image.
4. The method of claim 1, wherein the loss function of the target neural network model is:
L = ||ε_T - Σ'_s ω' Σ'_n||    (2)
in the above formula (2), L represents the loss function, Σ'_s represents the first correlation matrix in the noise parameters corresponding to the arbitrary sample image, Σ'_n represents the second correlation matrix in the noise parameters corresponding to the arbitrary sample image, ω' represents the noise matrix in the noise parameters corresponding to the arbitrary sample image, and ε_T represents the actual image random noise of the arbitrary sample image, which is derived from the noise image of the arbitrary sample image.
5. The method of claim 1, wherein prior to acquiring the target neural network model, the method comprises:
Acquiring a mass of sample images, and carrying out noise superposition on each sample image in the mass of sample images for a plurality of times to obtain a noise image of each sample image, wherein the total number of times of noise superposition of any sample image is obtained according to the complexity of a target image;
and training a neural network model by taking the noise image of each sample image as input and the noise parameter of each sample image as output so as to obtain the target neural network model after training is completed.
6. The method of claim 5, wherein performing noise superposition multiple times on each sample image in the plurality of sample images to obtain a noise image for each sample image comprises:
for any sample image, step size weight in the t-th noise superposition is obtained, and t-th noise superposition processing is carried out on any sample image based on the step size weight in the t-th noise superposition, so that a t-th noise superposition image is obtained;
incrementing t by 1, updating the arbitrary sample image to the t-th noise superposition image, and again performing the t-th noise superposition processing on the arbitrary sample image based on the step weight at the t-th noise superposition, until t is equal to T, so that when t is equal to T, the T-th noise superposition image is taken as the noise image of the arbitrary sample image, wherein T represents the total number of noise superpositions.
7. The method according to claim 6, wherein performing the t-th noise superimposition processing on the arbitrary sample image based on the step weight at the time of the t-th noise superimposition to obtain the t-th noise superimposition image includes:
according to the step weight during the t-th noise superposition, adopting the following formula (3), and performing t-th noise superposition processing on any sample image to obtain a t-th noise superposition image;
in the above formula (3), x_t'' represents the t-th noise superposition image, x_{t-1} represents the arbitrary sample image, β_t represents the step weight of the arbitrary sample image at the t-th noise superposition, ε_t represents the image random noise of the arbitrary sample image at the t-th noise superposition, and t = 1, 2, ..., T.
8. An image generation apparatus based on random noise, comprising:
the model acquisition unit is used for acquiring a target neural network model, wherein the target neural network model is obtained by taking noise images of a large number of sample images as input and noise parameters of each sample image as output, the noise parameters of any sample image in the large number of sample images comprise a noise matrix corresponding to the any sample image, a first correlation matrix and a second correlation matrix, the first correlation matrix is used for representing correlation between transverse adjacent pixels in the any sample image, and the second correlation matrix is used for representing correlation between longitudinal adjacent pixels in the any sample image;
The noise parameter generation unit is used for acquiring a random noise image, inputting the random noise image into the target neural network model and obtaining the noise parameter of the random noise image;
and the image generation unit is used for generating a target image by utilizing the target neural network model, the random noise image and the noise parameters of the random noise image, wherein the image content of the target image and the image content of any sample image are of the same type.
9. An electronic device, comprising: a memory, a processor and a transceiver in communication with each other in sequence, wherein the memory is configured to store a computer program, the transceiver is configured to receive and transmit messages, and the processor is configured to read the computer program and perform the random noise based image generation method according to any one of claims 1 to 7.
10. A storage medium having instructions stored thereon which, when executed on a computer, perform the random noise based image generation method of any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310082757.9A CN116777732A (en) 2023-02-03 2023-02-03 Image generation method, device, equipment and storage medium based on random noise

Publications (1)

Publication Number Publication Date
CN116777732A true CN116777732A (en) 2023-09-19

Family

ID=88012214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310082757.9A Pending CN116777732A (en) 2023-02-03 2023-02-03 Image generation method, device, equipment and storage medium based on random noise

Country Status (1)

Country Link
CN (1) CN116777732A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117575894A (en) * 2024-01-16 2024-02-20 腾讯科技(深圳)有限公司 Image generation method, device, electronic equipment and computer readable storage medium
CN117575894B (en) * 2024-01-16 2024-04-30 腾讯科技(深圳)有限公司 Image generation method, device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination