WO2022227886A1 - Method for generating a super-resolution restoration network model, and method and apparatus for super-resolution restoration of images - Google Patents


Info

Publication number
WO2022227886A1
WO2022227886A1 · PCT/CN2022/080294 · CN2022080294W
Authority
WO
WIPO (PCT)
Prior art keywords
network model
image
loss function
super
trained
Prior art date
Application number
PCT/CN2022/080294
Other languages
English (en)
Chinese (zh)
Inventor
孙佳
袁泽寰
王长虎
Original Assignee
北京有竹居网络技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京有竹居网络技术有限公司
Publication of WO2022227886A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/77Retouching; Inpainting; Scratch removal
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the present application relates to the technical field of image processing, and in particular to a method for generating a super-resolution restoration network model, and a method and apparatus for super-resolution restoration of images.
  • the embodiments of the present application provide a method for generating a network model for super-resolution restoration, a method and apparatus for super-resolution restoration of images, which can improve the restoration effect and reduce the restoration delay.
  • a method for generating a super-resolution restoration network model is provided, and the method may include:
  • the image to be trained is a low-resolution image;
  • the student network model is an enhanced super-resolution generative adversarial network (ESRGAN).
  • the ESRGAN network includes a basic module, an upsampling module and a convolution module.
  • the basic module includes one or more residual-in-residual dense blocks (RRDB), and the RRDB includes a plurality of processing modules, where the input of each processing module also serves as an input to the subsequent processing modules; each processing module includes a first convolution layer, a second convolution layer and an activation layer connected in sequence, and the convolution kernel of the first convolution layer is smaller than the convolution kernel of the second convolution layer.
  • the convolution kernel of the first convolution layer is 1*1.
  • the basic module includes an RRDB module.
  • the convolution module includes a third convolution layer, and the convolution kernel of the third convolution layer is equal to the convolution kernel of the first convolution layer.
  • the image to be trained is input into the student network model and the teacher network model respectively, and the loss function corresponding to the student network model is obtained, including:
  • the to-be-trained images are respectively input to the student network model and the teacher network model to obtain the first loss function corresponding to pixel distillation and the second loss function corresponding to overall distillation;
  • according to the first loss function and the second loss function, the loss function corresponding to the student network model is obtained.
  • the image to be trained is input into the student network model and the teacher network model respectively, and the loss function corresponding to the student network model is obtained, including:
  • the to-be-trained images are respectively input into the student network model and the teacher network model to obtain the first loss function corresponding to the pixel distillation, the second loss function corresponding to the overall distillation, and the third loss function corresponding to the discriminator;
  • Weighted summation is performed on the first loss function, the second loss function and the third loss function to obtain a loss function corresponding to the student network model.
  • the first loss function represents the loss function between the output result of the student network model for the first pixel in the image to be trained and the output result of the teacher network model for the first pixel; the second loss function represents the difference between the output result of the student network model for the image to be trained and the output result of the teacher network model for the image to be trained.
  • the third loss function represents the loss function between the discriminator corresponding to the student network model and the discriminator corresponding to the teacher network model, and the first pixel is any pixel in the image to be trained.
  • before the images to be trained are respectively input into the student network model, the method further includes:
  • the to-be-trained images are respectively input into the initial network model and the teacher network model to obtain the fourth loss function corresponding to pixel distillation;
  • the parameters of the initial network model are updated according to the fourth loss function to obtain a student network model.
  • the fourth loss function represents the loss function between the output result of the initial network model for the second pixel in the image to be trained and the output result of the teacher network model for the second pixel, where the second pixel is any pixel in the image to be trained.
  • an image super-resolution restoration method is provided, comprising:
  • the image to be processed is a low-resolution image
  • an apparatus for generating a super-resolution restoration network model includes:
  • a first acquisition unit, used to acquire an image to be trained, where the image to be trained is a low-resolution image;
  • the second obtaining unit is used to input the to-be-trained images into the student network model and the teacher network model respectively, and obtain the loss function corresponding to the student network model;
  • a generating unit, configured to update the parameters of the student network model according to the loss function so that the loss function of the student network model satisfies a preset condition, and to generate a super-resolution restoration network model;
  • the student network model is an enhanced super-resolution generative adversarial network (ESRGAN).
  • the ESRGAN network includes a basic module, an upsampling module, and a convolution module.
  • the basic module includes one or more RRDB modules, and the RRDB module includes a plurality of processing modules; the output of each processing module is used as an input of the subsequent processing modules, and each processing module includes a first convolutional layer, a second convolutional layer and an activation layer connected in sequence, with the convolution kernel of the first convolutional layer smaller than the convolution kernel of the second convolutional layer.
  • an image super-resolution restoration device includes:
  • a first acquiring unit configured to acquire a to-be-processed image, where the to-be-processed image is a low-resolution image
  • the second obtaining unit is configured to input the to-be-processed image into a super-resolution restoration network model to obtain a target image, where the target image is a super-resolution image corresponding to the to-be-processed image, and the super-resolution restoration network model is generated by training according to the method described in the first aspect.
  • an electronic device, in a fifth aspect of the embodiments of the present application, includes a processor and a memory;
  • the memory is used for storing instructions or computer programs;
  • the processor is configured to execute the instructions or the computer program in the memory, so that the electronic device executes the method for generating a super-resolution restoration network model described in the first aspect or the image super-resolution restoration method described in the second aspect.
  • a computer-readable storage medium includes instructions which, when run on a computer, cause the computer to execute the method for generating a super-resolution restoration network model described in the first aspect or the image super-resolution restoration method described in the second aspect.
  • a computer program product, in a seventh aspect of the embodiments of the present application, includes a computer program carried on a non-transitory computer-readable medium, and the computer program includes instructions for executing the method described in the first aspect above.
  • an image to be trained (a low-resolution image) is obtained, and the image to be trained is input into the student network model and the teacher network model respectively, so as to obtain a loss function corresponding to the student network model.
  • the teacher network model is a large network that has been trained.
  • the parameters of the student network model are updated according to the loss function so that the loss function of the student network model satisfies the preset conditions, and the super-resolution restoration network model is then generated.
  • the student network model is an enhanced super-resolution generative adversarial ESRGAN network, which includes a basic module, an upsampling module and a convolution module.
  • the basic module includes one or more RRDB modules
  • the RRDB module includes multiple processing modules
  • the input of each processing module is used as an input of the subsequent processing modules. That is, when training the student network model, the input of each processing module is also fed to each subsequent processing module to enhance the transmission of features, so that the subsequent processing modules can use more image features for training, improving the restoration effect.
  • the processing module includes a first convolution layer, a second convolution layer and an activation layer connected in sequence, and the convolution kernel of the first convolution layer is smaller than the convolution kernel of the second convolution layer. That is, a first convolution layer with a smaller convolution kernel is added to the processing module, thereby reducing the dimension of the image features, reducing the amount of calculation and improving the processing speed.
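The processing-module ordering described above (a small-kernel convolution, then a larger-kernel convolution, then an activation) can be sketched as a toy 1-D example. The weights, the 1-D setting and the LeakyReLU slope are illustrative assumptions, not the patent's actual parameters.

```python
# Toy 1-D sketch of one processing module: a pointwise (kernel-size-1)
# "first convolution", then a kernel-size-3 "second convolution", then
# a LeakyReLU activation. All weights are illustrative assumptions.

def conv1x1(x, w):
    """Pointwise convolution: scales every element (in the real 2-D case
    this is the cheap layer that reduces feature dimensions)."""
    return [w * v for v in x]

def conv3(x, w):
    """Kernel-size-3 convolution with zero padding and stride 1."""
    padded = [0.0] + list(x) + [0.0]
    return [sum(w[k] * padded[i + k] for k in range(3)) for i in range(len(x))]

def leaky_relu(x, slope=0.2):
    return [v if v > 0 else slope * v for v in x]

def processing_module(x):
    # first conv (smaller kernel) -> second conv (larger kernel) -> activation
    return leaky_relu(conv3(conv1x1(x, 0.5), [1.0, 1.0, 1.0]))

features = [1.0, -2.0, 3.0]
out = processing_module(features)
```

The pointwise layer runs before the 3-wide layer, mirroring the claim that the first convolution layer's kernel is smaller than the second's.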
  • a low-resolution image to be processed is acquired and input into a super-resolution restoration network model to obtain a high-resolution target image, that is, a super-resolution restored image.
  • FIG. 1a is a structural diagram of a traditional RRDB module;
  • FIG. 1b is a structural diagram of an RRDB module provided by an embodiment of this application;
  • FIG. 1c is a structural diagram of an ESRGAN network provided by an embodiment of this application;
  • FIG. 2 is a flowchart of a method for generating a super-resolution restoration network model provided by an embodiment of the present application;
  • FIG. 3 is a flowchart of an image super-resolution restoration method provided by an embodiment of the present application.
  • FIG. 4 is a structural diagram of an apparatus for generating a super-resolution restoration network model provided by an embodiment of the present application;
  • FIG. 5 is a structural diagram of an apparatus for image super-resolution restoration provided by an embodiment of the present application.
  • FIG. 6 is a structural diagram of an electronic device according to an embodiment of the present application.
  • PSNR: Peak Signal-to-Noise Ratio.
  • an embodiment of the present application provides a super-resolution restoration network model.
  • the super-resolution restoration network model is an Enhanced Super-Resolution Generative Adversarial Network (ESRGAN);
  • the ESRGAN network includes a basic module, an upsampling module and a convolution module, wherein the basic module includes one or more Residual-in-Residual Dense Blocks (RRDB); the RRDB includes multiple processing modules, and each processing module includes a first convolution layer, a second convolution layer and an activation layer connected in sequence.
  • the input of each processing module is used as an input of each subsequent processing module, which increases the image features available during training and improves the restoration effect.
  • the convolution kernel of the first convolutional layer is smaller than that of the second convolutional layer, which reduces the amount of calculation and the processing delay.
  • the ESRGAN network is obtained by improving the Super-Resolution Generative Adversarial Network (SRGAN). Specifically, three key parts of the SRGAN network are improved: 1) the network structure: the basic unit of the SRGAN network is changed from the basic residual unit to the RRDB; 2) the adversarial loss: the GAN is improved to a Relativistic average GAN (RaGAN); 3) the perceptual loss function: Visual Geometry Group (VGG) features before activation are used.
  • SRGAN: Super-Resolution Generative Adversarial Network.
  • the structure of the RRDB module is shown in Figure 1a, including multiple convolutional layers conv and activation layers LRelu connected in sequence.
  • the input of each convolutional layer is also used as an input of the subsequent convolutional layers, thereby increasing the number of image features available during training.
  • although this method can improve the model training effect, it increases the amount of calculation, reduces the training rate and increases the delay.
  • the embodiment of the present application therefore adds a convolution layer, namely the first convolution layer, whose convolution kernel is smaller than that of the original convolution layer (the second convolution layer).
  • the original convolutional layer is Conv2, and the new first convolutional layer is Conv1.
  • the convolution kernel of Conv2 is 3*3, and the convolution kernel of Conv1 can be 1*1.
  • the convolution kernel of Conv2 is 5*5, and the convolution kernel of Conv1 can be 3*3.
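The computational saving from giving Conv1 a smaller kernel can be checked with a back-of-envelope multiply count per output position. The channel sizes below (64 in/out, a 32-channel bottleneck) are illustrative assumptions; the text does not specify channel counts.

```python
# Multiplies per output spatial position of a convolution layer:
# multiplies = in_channels * kernel_h * kernel_w * out_channels.
# Channel sizes are illustrative assumptions, not the patent's values.

def conv_mults(c_in, k, c_out):
    return c_in * k * k * c_out

# Direct 3*3 convolution, 64 -> 64 channels:
direct = conv_mults(64, 3, 64)

# 1*1 bottleneck down to 32 channels, then 3*3 convolution back to 64:
bottleneck = conv_mults(64, 1, 32) + conv_mults(32, 3, 64)
```

Under these assumptions the bottlenecked pair needs 20480 multiplies versus 36864 for the direct layer, illustrating why the small first kernel reduces calculation.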
  • the ESRGAN network itself includes a basic block composed of 16 to 23 RRDB modules connected in series.
  • the basic modules in the embodiments of the present application can be configured according to actual needs.
  • the basic module only includes one RRDB module.
  • the convolution module in the ESRGAN network in the embodiment of the present application may further include a third convolution layer.
  • the ESRGAN network in the embodiment of the present application includes a basic module, an upsampling module (Upsampling) and a convolution module, wherein the convolution module includes two third convolution layers conv1 and two fourth convolution layers conv2.
  • the convolution kernel of the third convolution layer may be equal to the convolution kernel of the first convolution layer
  • the convolution kernel of the fourth convolution layer may be equal to the convolution kernel of the second convolution layer.
  • FIG. 2 is a flowchart of a method for generating a super-resolution restoration network model provided by an embodiment of the present application. As shown in FIG. 2, the method may include:
  • S201 Obtain an image to be trained, where the image to be trained is a low-resolution image.
  • the images to be trained are first obtained to train the student network model with the images to be trained. Specifically, in order to improve the training effect and the generalization of the network, a large number of diverse images to be trained can be obtained to enhance the learning ability of the network.
  • S202 Input the images to be trained into the student network model and the teacher network model respectively, and obtain a loss function corresponding to the student network model.
  • this embodiment uses the knowledge distillation algorithm to train the student network model, so that the processing effect of the trained student network model can approach the processing effect of the teacher network model. Therefore, the images to be trained are input into the student network model and the teacher network model respectively, and the loss function corresponding to the student network model is obtained.
  • knowledge distillation draws on the idea of model compression: a larger trained network (the teacher network model) is used to guide a smaller network (the student network model) to learn the ability or behavior of the teacher network model.
  • obtaining the loss function corresponding to the student network model can be implemented in the following ways:
  • pixel-wise distillation regards each pixel as a classification unit and performs distillation independently; holistic distillation uses an adversarial training strategy, for example taking the output of the teacher network model as true and the output of the student network model as false, and conducts adversarial learning so that the outputs of the student network model and the teacher network model cannot be distinguished, achieving image-level knowledge distillation.
  • the overall distillation mainly trains the texture features of the images.
  • the first loss function represents the loss function between the output result of the student network model for the first pixel in the image to be trained and the output result of the teacher network model for the first pixel in the image to be trained;
  • the second loss function represents the difference between the output result of the student network model for the image to be trained and the output result of the teacher network model for the image to be trained.
  • the first pixel is any pixel in the image to be trained.
  • the first loss function corresponding to pixel distillation may be an L1 loss function or an L2 loss function.
  • the L1 loss function is also called the minimum absolute value deviation or the minimum absolute value error, and is calculated as follows: L1 = (1/n) Σᵢ |yᵢ − f(xᵢ)|
  • yᵢ represents the i-th pixel value output by the teacher network model;
  • f(xᵢ) represents the i-th pixel value output by the student network model;
  • n represents the number of pixels in the image to be trained.
  • the L2 loss function can also be called the minimum mean square deviation or the minimum mean square error, and is calculated as follows: L2 = (1/n) Σᵢ (yᵢ − f(xᵢ))²
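A minimal runnable sketch of the two pixel-distillation losses defined above; the teacher and student pixel values are illustrative assumptions.

```python
# L1 and L2 losses between teacher outputs y_i and student outputs f(x_i),
# averaged over n pixels. Pixel values below are illustrative only.

def l1_loss(y, f):
    """Minimum absolute error: mean of |y_i - f(x_i)|."""
    return sum(abs(a - b) for a, b in zip(y, f)) / len(y)

def l2_loss(y, f):
    """Minimum mean square error: mean of (y_i - f(x_i))^2."""
    return sum((a - b) ** 2 for a, b in zip(y, f)) / len(y)

teacher = [0.9, 0.5, 0.1, 0.3]   # y_i: teacher pixel values (illustrative)
student = [0.8, 0.5, 0.3, 0.1]   # f(x_i): student pixel values (illustrative)
```

Either loss can serve as the first loss function loss1; L2 penalizes large per-pixel deviations more strongly than L1.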
  • the first loss function loss1 is calculated at the output of the basic module of the ESRGAN network.
  • the second loss function loss2 can be the cross-entropy loss function commonly used by GAN networks, or another function that can measure the distance between two distributions, such as a loss function based on the Wasserstein distance.
  • the position where the second loss function is calculated is the final output of the student network model and the final output of the teacher network model.
  • the cross-entropy loss function is calculated as follows: Loss = log(D(d)) + log(1 − D(G(p)))
  • G represents the generator, D the discriminator, and d the real data;
  • D(d) represents the discriminator's result on the real data;
  • G(p) represents the generated fake data;
  • D(G(p)) represents the discriminator's result on the fake data;
  • on the interval (0, 1], the log function takes values in (−∞, 0] and is an increasing function.
  • this formula reflects the game between the generator and the discriminator: the generator hopes the loss is as small as possible, while the discriminator hopes it is as large as possible.
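The generator/discriminator game described above can be illustrated numerically; the discriminator probability values below are illustrative assumptions.

```python
import math

def gan_loss(d_real, d_fake):
    """Cross-entropy (minimax) GAN loss: log D(d) + log(1 - D(G(p))).
    The discriminator tries to make it large; the generator, small."""
    return math.log(d_real) + math.log(1.0 - d_fake)

# A well-fooled discriminator outputs ~0.5 for both real and fake data:
balanced = gan_loss(0.5, 0.5)

# A confident discriminator (real ~0.99, fake ~0.01) obtains a larger loss:
confident = gan_loss(0.99, 0.01)
```

Both values are negative because log on (0, 1] is non-positive; the confident discriminator's loss is closer to 0 (larger), matching the game described in the text.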
  • the overall loss function corresponding to the student network model can be obtained according to loss1 and loss2.
  • the third loss function loss3 corresponding to the discriminator can also be obtained, that is, the loss function between the discriminator corresponding to the student network model and the discriminator corresponding to the teacher network model.
  • the third loss function may be an L1 loss function or an L2 loss function. Weighted summation can then be performed on loss1, loss2 and loss3 to obtain the overall loss function corresponding to the student network model.
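The weighted summation of loss1, loss2 and loss3 might look as follows; the weight values are illustrative assumptions, since the text does not specify them.

```python
def overall_loss(loss1, loss2, loss3, w1=1.0, w2=0.1, w3=0.1):
    """Weighted summation of the pixel-distillation, holistic-distillation
    and discriminator losses. Weights are illustrative assumptions."""
    return w1 * loss1 + w2 * loss2 + w3 * loss3

# e.g. loss1 = 0.2, loss2 = 1.5, loss3 = 0.8 (illustrative values):
total = overall_loss(0.2, 1.5, 0.8)
```

In the two-loss variant of the claims, the same pattern applies with w3 = 0.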
  • the student network model may be the initial network model, or a network model obtained by briefly pre-training the initial network model, which improves the efficiency of subsequent training.
  • the images to be trained are respectively input into the initial network model and the teacher network model to obtain the fourth loss function corresponding to pixel distillation; the parameters of the initial network model are updated according to the fourth loss function to obtain the student network model, and the student network model is then trained.
  • the fourth loss function may be an L1 loss function or an L2 loss function.
  • the student network model or the initial network model is an enhanced super-resolution generative adversarial network ESRGAN, which includes a basic module, an upsampling module and a convolution module.
  • the basic module includes one or more residual-in-residual dense blocks (RRDB);
  • the RRDB includes multiple processing modules
  • the input of each processing module is used as the input of the subsequent processing module
  • the processing module includes the first convolutional layer connected in sequence, The second convolutional layer and activation layer.
  • the convolution kernel of the first convolution layer is smaller than the convolution kernel of the second convolution layer.
  • the structure of the RRDB module can be seen in Figure 1b
  • the structure of the ESRGAN network can be seen in Figure 1c.
  • the Learned Perceptual Image Patch Similarity (LPIPS) metric can be used as an indicator during training.
  • S203 Update the parameters of the student network model according to the loss function, so that the loss function of the student network model satisfies a preset condition, and train to generate a super-resolution restoration network model.
  • after the loss function corresponding to the student network model is obtained, the parameters of the student network model can be continuously updated using the back-propagation algorithm so that the loss function of the student network model satisfies the preset condition, and the training then generates a super-resolution restoration network model.
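Steps S201 to S203 can be sketched with a toy one-parameter student and a fixed teacher mapping; the teacher function, student parameterization, learning rate and stopping threshold are all illustrative assumptions, not the patent's networks.

```python
# Toy sketch of the training loop: update the student's parameter by
# gradient descent on an L2 distillation loss until the loss satisfies
# a preset condition (here: falling below a threshold).

def teacher(x):
    return 2.0 * x          # trained "teacher network": a fixed mapping

def student(x, w):
    return w * x            # "student network" with one learnable parameter

def distill_loss(x, w):
    return (teacher(x) - student(x, w)) ** 2   # L2 distillation loss

x, w, lr, threshold = 1.0, 0.0, 0.1, 1e-6
while distill_loss(x, w) > threshold:          # preset condition (S203)
    grad = -2.0 * x * (teacher(x) - student(x, w))  # d(loss)/dw
    w -= lr * grad                             # back-propagation-style update
```

The loop drives the student's output toward the teacher's, which is the essence of the distillation training the text describes.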
  • the preset condition may be that the loss function corresponding to the student network model reaches its minimum.
  • the super-resolution restoration network model generated by training in the above embodiment can, in practical applications, be used to restore low-resolution images.
  • FIG. 3 is a flowchart of an image super-resolution restoration method provided by an embodiment of the present application. As shown in FIG. 3, the method may include:
  • S301 Acquire a to-be-processed image, where the to-be-processed image is a low-resolution image.
  • S302 Input the to-be-processed image into a super-resolution restoration network model to obtain a target image, where the target image is a super-resolution image corresponding to the to-be-processed image.
  • the low-resolution image can be input into the super-resolution restoration network model, so that super-resolution restoration is performed on the low-resolution image to obtain a super-resolution image.
  • the super-resolution restoration network model can be trained and generated by the method shown in FIG. 2.
  • the super-resolution restoration network model generated by the training in the embodiment of the present application is an enhanced super-resolution generative adversarial network (ESRGAN);
  • the ESRGAN network includes a basic module, an upsampling module and a convolution module.
  • the basic module includes one or more RRDB modules
  • the RRDB module includes multiple processing modules
  • the input of each processing module is used as an input of the subsequent processing modules. That is, during training, the student network model also feeds the input of each processing module to each subsequent processing module to enhance the transfer of features, so that the subsequent processing modules can use more image features for training, improving the restoration effect.
  • a first convolution layer with a smaller convolution kernel is also added, thereby reducing the processing delay. It can be seen that the super-resolution restoration network model balances the restoration effect against the processing delay.
  • the embodiments of the present application further provide an apparatus for generating a network model for super-resolution restoration and an apparatus for super-resolution restoration of images, which will be described below with reference to the accompanying drawings.
  • FIG. 4 is a structural diagram of an apparatus for generating a super-resolution restoration network model provided by an embodiment of the present application.
  • the apparatus 400 may include:
  • the first acquisition unit 401 is used to acquire an image to be trained, where the image to be trained is a low-resolution image;
  • the second obtaining unit 402 is configured to input the to-be-trained images into the student network model and the teacher network model, respectively, to obtain a loss function corresponding to the student network model;
  • the generation unit 403 is used to update the parameters of the student network model according to the loss function, so that the loss function of the student network model satisfies preset conditions, and to generate a super-resolution restoration network model;
  • the student network model is an enhanced super-resolution generative adversarial network (ESRGAN).
  • the ESRGAN network includes a basic module, an upsampling module, and a convolution module.
  • the basic module includes one or more RRDB modules, and the RRDB module includes a plurality of processing modules; the output of each processing module is used as an input of the subsequent processing modules, and each processing module includes a first convolutional layer, a second convolutional layer and an activation layer connected in sequence, with the convolution kernel of the first convolutional layer smaller than the convolution kernel of the second convolutional layer.
  • the convolution kernel of the first convolution layer is 1*1.
  • the basic module includes an RRDB module.
  • the convolution module includes a third convolution layer, and the convolution kernel of the third convolution layer is equal to the convolution kernel of the first convolution layer.
  • the second obtaining unit is specifically configured to input the to-be-trained images into the student network model and the teacher network model respectively, so as to obtain the first loss function corresponding to pixel distillation and the second loss function corresponding to overall distillation; according to the first loss function and the second loss function, the loss function corresponding to the student network model is obtained.
  • the second obtaining unit is specifically configured to input the to-be-trained images into the student network model and the teacher network model respectively, so as to obtain the first loss function corresponding to pixel distillation, the second loss function corresponding to overall distillation and the third loss function corresponding to the discriminator; weighted summation is performed on the first loss function, the second loss function and the third loss function to obtain the loss function corresponding to the student network model.
  • the first loss function represents the loss function between the output result of the student network model for the first pixel in the image to be trained and the output result of the teacher network model for the first pixel in the image to be trained.
  • the second loss function represents the difference between the output result of the student network model for the image to be trained and the output result of the teacher network model for the image to be trained.
  • the third loss function represents the loss between the output of the discriminator corresponding to the student network model and the output of the discriminator corresponding to the teacher network model, and the first pixel is any pixel in the image to be trained.
  • the device further includes:
  • the third obtaining unit is configured to input the to-be-trained image into the initial network model and the teacher network model respectively, so as to obtain a fourth loss function corresponding to pixel distillation, and to update the parameters of the initial network model according to the fourth loss function to obtain the student network model.
  • the fourth loss function represents the loss between the output result of the initial network model for the second pixel in the image to be trained and the output result of the teacher network model for the second pixel in the image to be trained, where the second pixel is any pixel in the image to be trained.
  • this figure is a structural diagram of an image super-resolution restoration apparatus provided by an embodiment of the present application, and the apparatus 500 includes:
  • a first obtaining unit 501 configured to obtain a to-be-processed image, where the to-be-processed image is a low-resolution image;
  • the second obtaining unit 502 is configured to input the to-be-processed image into a super-resolution restoration network model to obtain a target image, where the target image is a super-resolution image corresponding to the to-be-processed image, and the super-resolution restoration network model is generated by training according to the super-resolution restoration network model generation method described above.
  • referring to FIG. 6, there is shown a schematic structural diagram of an electronic device (eg, the terminal device or server in FIG. 6 ) 700 suitable for implementing an embodiment of the present disclosure.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players) and vehicle-mounted terminals (eg, in-vehicle navigation terminals), as well as stationary terminals such as digital TVs and desktop computers.
  • the electronic device shown in FIG. 6 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • an electronic device 700 may include a processing device (eg, a central processing unit, a graphics processor, etc.) 701, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data necessary for the operation of the electronic device 700.
  • the processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704.
  • An input/output (I/O) interface 705 is also connected to bus 704 .
  • the following devices can be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 707 including, for example, a liquid crystal display (LCD), speakers, vibrators, etc.; storage devices 708 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 6 shows an electronic device 700 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 709 , or from the storage device 708 , or from the ROM 702 .
  • when the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the electronic device provided by the embodiments of the present disclosure belongs to the same inventive concept as the super-resolution restoration network model generation method or the image super-resolution restoration method provided by the above-mentioned embodiments.
  • This embodiment has the same beneficial effects as the above-mentioned embodiments.
  • Embodiments of the present disclosure provide a computer storage medium on which a computer program is stored; when the program is executed by a processor, it implements the super-resolution restoration network model generation method or the image super-resolution restoration method provided by the foregoing embodiments.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • the client and server can communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication (eg, a communication network) in any form or medium.
  • Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (eg, the Internet), and peer-to-peer networks (eg, ad hoc peer-to-peer networks), as well as any network currently known or developed in the future.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device can execute:
  • the image to be trained is a low-resolution image
  • the student network model is an enhanced super-resolution generative adversarial network ESRGAN.
  • the ESRGAN network includes a basic module, an upsampling module and a convolution module.
  • the basic module includes one or more residual-in-residual dense blocks (RRDB), and each RRDB includes a plurality of processing modules, the output of each processing module being used as the input of the subsequent processing module; each processing module includes a first convolution layer, a second convolution layer and an activation layer connected in sequence, and the convolution kernel of the first convolution layer is smaller than the convolution kernel of the second convolution layer.
  • the image to be processed is a low-resolution image
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (eg, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by combinations of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in software or in hardware, and the name of a unit/module does not, under certain circumstances, constitute a limitation on the unit itself.
  • exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logical Devices (CPLDs) and more.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • "At least one (item)" refers to one or more, and "a plurality" refers to two or more.
  • "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can be singular or plural.
  • the character "/" generally indicates that the associated objects are in an "or" relationship.
  • "At least one of the following items" or similar expressions refer to any combination of these items, including any combination of a single item or plural items.
  • for example, "at least one of a, b or c" can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c can be single or multiple.
  • the steps of a method or algorithm described in connection with the embodiments disclosed herein may be directly implemented in hardware, a software module executed by a processor, or a combination of the two.
  • the software module can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
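The claims above repeatedly contrast a first convolution layer whose kernel (stated as 1*1) is smaller than that of the second convolution layer. As a rough sketch of why this shrinks the student model, the parameter counts of the two layers can be compared directly; the channel widths below are illustrative assumptions, not values from the application:

```python
def conv_params(in_ch, out_ch, k):
    """Number of learnable parameters in a k*k convolution (weights + biases)."""
    return in_ch * out_ch * k * k + out_ch

# Hypothetical channel widths for one processing module of an RRDB.
first = conv_params(64, 32, 1)   # first convolution layer, 1*1 kernel
second = conv_params(64, 32, 3)  # second convolution layer, 3*3 kernel

print(first, second)  # 2080 18464 -- the 1*1 layer is roughly 9x smaller
```

Under these assumed widths the 1*1 layer carries about one ninth of the weights of a 3*3 layer, which is consistent with using it to reduce the size of the student network.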
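The first and fourth loss functions above compare the student (or initial) network output with the teacher network output pixel by pixel. A minimal sketch, assuming a mean absolute difference as the per-pixel distance (the application does not fix the exact distance measure):

```python
def pixel_distillation_loss(student_out, teacher_out):
    """Mean absolute difference between flattened student and teacher outputs,
    computed pixel by pixel."""
    assert len(student_out) == len(teacher_out)
    return sum(abs(s - t) for s, t in zip(student_out, teacher_out)) / len(student_out)

# Hypothetical flattened model outputs for the same image to be trained.
student = [0.1, 0.5, 0.9]
teacher = [0.2, 0.5, 0.7]
print(pixel_distillation_loss(student, teacher))  # ~0.1
```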
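Where a discriminator is used, the claims describe a weighted summation of the first (pixel distillation), second (overall distillation) and third (discriminator) loss functions. A sketch of that combination; the weight values are placeholders, since the application does not specify them:

```python
def combined_loss(pixel_loss, overall_loss, disc_loss, weights=(1.0, 1.0, 1.0)):
    """Weighted summation of the three distillation loss terms."""
    w1, w2, w3 = weights
    return w1 * pixel_loss + w2 * overall_loss + w3 * disc_loss

print(combined_loss(1.0, 2.0, 3.0))                         # 6.0 with unit weights
print(combined_loss(1.0, 2.0, 3.0, weights=(0.5, 0.5, 0.5)))  # 3.0
```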
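The restoration apparatus above takes a to-be-processed low-resolution image and returns a super-resolution target image. The stub below only illustrates that input/output contract; the real model is the trained ESRGAN student, for which nearest-neighbour upsampling stands in here as an assumption:

```python
def super_resolve(image, scale=2):
    """Placeholder for the super-resolution restoration network model:
    nearest-neighbour upsampling by `scale` in both dimensions."""
    return [[px for px in row for _ in range(scale)]
            for row in image
            for _ in range(scale)]

low_res = [[1, 2], [3, 4]]        # to-be-processed (low-resolution) image
target = super_resolve(low_res)   # target (super-resolution) image
print(len(target), len(target[0]))  # 4 4
```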

Abstract

A method for generating a super-resolution restoration network model, comprising: obtaining an image to be trained; inputting said image into a student network model and a teacher network model, and obtaining a loss function corresponding to the student network model; and updating the parameters of the student network model according to the loss function so that the loss function of the student network model satisfies a preset condition, thereby generating a super-resolution restoration network model. The student network model is an ESRGAN network. The ESRGAN network includes a basic module, an upsampling module and a convolution module. The basic module includes one or more RRDB modules. Each RRDB module includes a plurality of processing modules, and the output of each processing module serves as the input of the subsequent processing module. Each processing module includes a first convolution layer, a second convolution layer and an activation layer, which are connected in sequence. The convolution kernel of the first convolution layer is smaller than the convolution kernel of the second convolution layer. Also disclosed are an image super-resolution restoration method and apparatus.
PCT/CN2022/080294 2021-04-27 2022-03-11 Procédé de génération d'un modèle de réseau de réparation à super-résolution, et procédé et appareil de réparation à super-résolution d'image WO2022227886A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110459814.1 2021-04-27
CN202110459814.1A CN113177888A (zh) 2021-04-27 2021-04-27 超分修复网络模型生成方法、图像超分修复方法及装置

Publications (1)

Publication Number Publication Date
WO2022227886A1 true WO2022227886A1 (fr) 2022-11-03

Family

ID=76926642

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/080294 WO2022227886A1 (fr) 2021-04-27 2022-03-11 Procédé de génération d'un modèle de réseau de réparation à super-résolution, et procédé et appareil de réparation à super-résolution d'image

Country Status (2)

Country Link
CN (1) CN113177888A (fr)
WO (1) WO2022227886A1 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177888A (zh) * 2021-04-27 2021-07-27 北京有竹居网络技术有限公司 超分修复网络模型生成方法、图像超分修复方法及装置
CN113850367B (zh) * 2021-08-31 2022-08-26 荣耀终端有限公司 网络模型的训练方法、图像处理方法及其相关设备
CN113922985B (zh) * 2021-09-03 2023-10-31 西南科技大学 一种基于集成学习的网络入侵检测方法及***
CN116071275B (zh) * 2023-03-29 2023-06-09 天津大学 基于在线知识蒸馏和预训练先验的人脸图像修复方法
CN116681594B (zh) * 2023-07-26 2023-11-21 摩尔线程智能科技(北京)有限责任公司 图像处理方法及装置、设备、介质
CN117857842B (zh) * 2024-03-07 2024-05-28 淘宝(中国)软件有限公司 直播场景中的画质处理方法及电子设备

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268292A1 (en) * 2017-03-17 2018-09-20 Nec Laboratories America, Inc. Learning efficient object detection models with knowledge distillation
CN108830813A (zh) * 2018-06-12 2018-11-16 福建帝视信息科技有限公司 一种基于知识蒸馏的图像超分辨率增强方法
CN110418139A (zh) * 2019-08-01 2019-11-05 广东工业大学 一种基于esrgan的视频超分辨修复技术
CN111062872A (zh) * 2019-12-17 2020-04-24 暨南大学 一种基于边缘检测的图像超分辨率重建方法及***
CN112200722A (zh) * 2020-10-16 2021-01-08 鹏城实验室 图像超分辨重构模型的生成方法、重构方法及电子设备
CN112288632A (zh) * 2020-10-29 2021-01-29 福州大学 基于精简esrgan的单图像超分辨率方法及***
CN113177888A (zh) * 2021-04-27 2021-07-27 北京有竹居网络技术有限公司 超分修复网络模型生成方法、图像超分修复方法及装置

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796619B (zh) * 2019-10-28 2022-08-30 腾讯科技(深圳)有限公司 一种图像处理模型训练方法、装置、电子设备及存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116994309A (zh) * 2023-05-06 2023-11-03 浙江大学 一种公平性感知的人脸识别模型剪枝方法
CN116994309B (zh) * 2023-05-06 2024-04-09 浙江大学 一种公平性感知的人脸识别模型剪枝方法

Also Published As

Publication number Publication date
CN113177888A (zh) 2021-07-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22794365

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22794365

Country of ref document: EP

Kind code of ref document: A1