WO2023029418A1 - Image super-resolution model training method and apparatus, and computer-readable storage medium - Google Patents


Info

Publication number
WO2023029418A1
WO2023029418A1 (PCT/CN2022/078890; CN2022078890W)
Authority
WO
WIPO (PCT)
Prior art keywords
resolution
image
super
model
loss function
Prior art date
Application number
PCT/CN2022/078890
Other languages
English (en)
French (fr)
Inventor
易自尧
徐科
杨维
孔德辉
宋剑军
Original Assignee
深圳市中兴微电子技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市中兴微电子技术有限公司
Publication of WO2023029418A1 publication Critical patent/WO2023029418A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting

Definitions

  • the embodiments of the present application relate to but are not limited to the technical field of computer vision, and in particular relate to an image super-resolution model training method, device and computer-readable storage medium.
  • Embodiments of the present application provide an image super-resolution model training method, device, and computer-readable storage medium.
  • In a first aspect, an embodiment of the present application provides an image super-resolution model training method, including: obtaining low-resolution images and their corresponding real high-resolution images to form a training set; inputting the low-resolution images in the training set and their corresponding real high-resolution images into a preset encoding-decoding model to determine model parameters; initializing an image super-resolution model according to the model parameters; inputting the low-resolution images in the training set into the image super-resolution model to obtain super-resolution images; determining the information entropy loss between the real high-resolution image and the super-resolution image; constructing a total loss function from the information entropy loss and a preset loss function combined with loss coefficients, where each loss coefficient characterizes the weight of its term in the total loss function; and, based on the total loss function, training the image super-resolution model to obtain the trained image super-resolution model.
  • In a second aspect, an embodiment of the present application provides an image super-resolution model training device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the image super-resolution model training method of the first aspect.
  • In a third aspect, an embodiment of the present application provides an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the image super-resolution model training method of the first aspect.
  • In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer-executable program, the computer-executable program being used to cause a computer to perform the image super-resolution model training method of the first aspect.
  • Fig. 1 is the main flowchart of an image super-resolution model training method provided by an embodiment of the present application;
  • Fig. 2 is a schematic structural diagram of initializing an image super-resolution model with an encoding-decoding model, provided by an embodiment of the present application;
  • Fig. 3 is a sub-flowchart of an image super-resolution model training method provided by an embodiment of the present application;
  • Fig. 4 is a sub-flowchart of an image super-resolution model training method provided by an embodiment of the present application;
  • Fig. 5 is a schematic structural diagram of an image super-resolution model training device provided by an embodiment of the present application;
  • Fig. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • In the description of the embodiments of the present application, "multiple" means two or more; "greater than", "less than", "exceeding", etc. are understood as excluding the stated number, while "above", "below", "within", etc. are understood as including it. Terms such as "first" and "second" are used only to distinguish technical features and cannot be understood as indicating or implying relative importance, the number of the indicated technical features, or their order.
  • Embodiments of the present application provide an image super-resolution model training method, device, and computer-readable storage medium.
  • A training set is formed by acquiring low-resolution images and their corresponding real high-resolution images. The low-resolution images in the training set and their corresponding real high-resolution images are input into a preset encoding-decoding model to determine model parameters, the image super-resolution model is initialized according to those model parameters, and the low-resolution images in the training set are input into the image super-resolution model to obtain super-resolution images. The information entropy loss between the real high-resolution image and the super-resolution image is determined, and a total loss function is constructed from the information entropy loss and a preset loss function combined with loss coefficients, where each loss coefficient characterizes the weight of its term in the total loss function. Based on the total loss function, the image super-resolution model is trained to obtain the trained model.
  • On this basis, inputting the training-set image pairs into the preset encoding-decoding model determines model parameters associated with the training data, and using these parameters to initialize the image super-resolution model provides a good starting point for training.
  • In addition, the information entropy loss between the real high-resolution image and the super-resolution image is introduced, and the total loss function is constructed by superimposing the information entropy loss onto the preset loss function combined with loss coefficients. This adds the ability to adjust the loss coefficients during training to reach an optimal combination, which can improve the training effect of the image super-resolution model.
  • FIG. 1 is a flowchart of an image super-resolution model training method provided by an embodiment of the present application.
  • the image super-resolution model training method includes but is not limited to the following steps:
  • Step 101: obtain low-resolution images and their corresponding real high-resolution images to form a training set;
  • Step 102: input the low-resolution images in the training set and the corresponding real high-resolution images into the preset encoding-decoding model to determine model parameters;
  • Step 103: initialize the image super-resolution model according to the model parameters;
  • Step 104: input the low-resolution images in the training set into the image super-resolution model to obtain super-resolution images;
  • Step 105: determine the information entropy loss between the real high-resolution image and the super-resolution image;
  • Step 106: construct the total loss function from the information entropy loss and the preset loss function combined with loss coefficients, where each loss coefficient characterizes the weight of its term in the total loss function;
  • Step 107: based on the total loss function, train the image super-resolution model to obtain the trained image super-resolution model.
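Steps 101 to 107 can be illustrated end to end with a toy sketch. Everything below is a stand-in: the "model" is nearest-neighbour upsampling, the preset loss is assumed to be an L1 pixel loss, and the loss coefficients are illustrative values, since the application leaves all of these unspecified.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 101: toy training set of (LR, HR) pairs; HR is 8x8, LR is 4x4.
hr_images = [rng.integers(0, 256, size=(8, 8)).astype(np.uint8) for _ in range(4)]
lr_images = [hr[::2, ::2] for hr in hr_images]

# Steps 103-104: a stand-in "model" -- nearest-neighbour upsampling.
def model(lr):
    return np.repeat(np.repeat(lr, 2, axis=0), 2, axis=1)

def entropy(img):
    """Shannon entropy over the 256-bin histogram of an 8-bit image."""
    p = np.bincount(img.ravel(), minlength=256) / img.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Steps 105-107: one evaluation of the total loss over the training set.
lam1, lam2 = 1.0, 0.1                       # illustrative loss coefficients
losses = []
for lr, hr in zip(lr_images, hr_images):
    sr = model(lr)                          # Step 104: super-resolution image
    loss_pre = np.abs(sr.astype(float) - hr.astype(float)).mean()
    loss_info = abs(entropy(hr) - entropy(sr))          # Step 105
    losses.append(lam1 * loss_pre + lam2 * loss_info)   # Step 106
total = sum(losses) / len(losses)           # Step 107 would minimize this
```

In a real implementation, `total` would be backpropagated through the network each iteration; here it is only evaluated once to show how the pieces fit together.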
  • The low-resolution image LR and its corresponding real high-resolution image (ground truth) HR are input into the encoder of the encoding-decoding model. The encoder produces an intermediate low-resolution image LR_I; to keep the distribution of LR_I consistent with that of LR, the value of LR_I must be constrained, for example with a simple difference or an L1-norm constraint. A decoder then produces a super-resolution picture SR_I from LR_I.
  • The model parameters of the decoder are then used to initialize the current network (i.e., the image super-resolution model), providing a good starting point for its training, so that inputting the low-resolution image LR into the image super-resolution model yields the super-resolution image SR.
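The initialization step above amounts to copying the pre-trained decoder's parameters into the super-resolution network wherever layer names and shapes line up. A minimal sketch with NumPy, treating each network as a name-to-array mapping (the layer names are hypothetical; the actual architectures are not specified in this application):

```python
import numpy as np

def init_from_decoder(sr_params, decoder_params):
    """Copy pre-trained decoder weights into the SR model for every
    layer whose name and shape match; other layers keep their values."""
    copied = []
    for name, w in decoder_params.items():
        if name in sr_params and sr_params[name].shape == w.shape:
            sr_params[name] = w.copy()
            copied.append(name)
    return copied

# Hypothetical layers: two conv layers shared with the decoder,
# one upsampling layer unique to the SR model.
rng = np.random.default_rng(0)
decoder = {"conv1": rng.normal(size=(3, 3)), "conv2": rng.normal(size=(3, 3))}
sr_model = {"conv1": np.zeros((3, 3)), "conv2": np.zeros((3, 3)),
            "upsample": np.zeros((2, 2))}
copied = init_from_decoder(sr_model, decoder)
```

In a deep-learning framework the same idea is a partial state-dict load: only the decoder-shaped subset of the SR network starts from the pre-trained values, and the rest keeps its default initialization.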
  • In summary, the model parameters are used to initialize the image super-resolution model, the low-resolution images in the training set are input into the image super-resolution model to obtain super-resolution images, and the information entropy loss between the real high-resolution image and the super-resolution image is determined.
  • The total loss function is constructed from the information entropy loss and the preset loss function combined with loss coefficients, where each loss coefficient characterizes the weight of its term in the total loss function. Based on the total loss function, the image super-resolution model is trained to obtain the trained image super-resolution model.
  • Inputting the low-resolution images in the training set and their corresponding real high-resolution images into the preset encoding-decoding model determines model parameters associated with the training-set data, and using these parameters to initialize the image super-resolution model provides a good starting point for training.
  • The information entropy loss between the real high-resolution image and the super-resolution image is introduced, and the total loss function is constructed by superimposing the information entropy loss onto the preset loss function combined with loss coefficients. This adds the ability to adjust the loss coefficients during training to reach an optimal combination, which can improve the training effect of the image super-resolution model.
  • Information entropy represents the amount of information contained in a picture.
  • Using the information entropy difference as a loss term describes the difference in information content between the output picture and the real ground-truth picture, which has a positive effect on training the image super-resolution model.
  • GPU servers or workstations can be used for model training, and multiple graphics cards can be used in parallel during training.
  • The system environment of the GPU server can be Ubuntu or Windows. Model inference can be performed on GPU servers or on AI chips.
  • Step 105 may include, but is not limited to, the following sub-steps:
  • Step 1051: calculate the first information entropy from the real high-resolution image;
  • Step 1052: calculate the second information entropy from the super-resolution image;
  • Step 1053: take the absolute value of the difference between the first information entropy and the second information entropy to obtain the information entropy difference.
  • the first information entropy is calculated according to the real high-resolution image
  • the second information entropy is calculated according to the super-resolution image
  • The low-resolution images LR in the training set are input into the image super-resolution model to obtain the super-resolution image SR, and the histograms of SR and HR are computed to obtain the probability P(x_i) of each pixel value x_i, where x_i ranges over 0…255.
  • The information entropy H(x) of an image is then computed as H(x) = -∑_{i=0}^{255} P(x_i) log₂ P(x_i), and the information entropy difference is the absolute value of the difference between the entropies of HR and SR.
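The histogram-and-entropy computation described above can be sketched as follows, assuming 8-bit grayscale images; base-2 logarithms are an assumption, since the application does not state the logarithm base:

```python
import numpy as np

def image_entropy(img):
    """Shannon entropy H(x) = -sum P(x_i) * log2 P(x_i) over the
    256-bin histogram of an 8-bit image; zero-probability bins are skipped."""
    hist = np.bincount(img.ravel(), minlength=256)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def entropy_loss(hr, sr):
    """Information entropy difference |H(HR) - H(SR)| used as the loss term."""
    return abs(image_entropy(hr) - image_entropy(sr))

flat = np.zeros((8, 8), dtype=np.uint8)                         # H = 0 (one value)
two_tone = np.tile([0, 255], 32).reshape(8, 8).astype(np.uint8) # H = 1 (two equal bins)
```

A constant image carries no information (entropy 0), while an image split evenly between two pixel values has entropy 1 bit, so `entropy_loss(flat, two_tone)` is 1.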
  • Step 106 may include, but is not limited to, the following sub-steps:
  • Step 1061: determine, based on multi-task learning, the first weight coefficient corresponding to the preset loss function and the second weight coefficient corresponding to the information entropy difference;
  • Step 1062: construct the total loss function from the preset loss function, the first weight coefficient, the information entropy difference, and the second weight coefficient.
  • The total loss function Loss is constructed from the preset loss function, the first weight coefficient, the information entropy difference, and the second weight coefficient. Specifically, a first product is obtained from the preset loss function and the first weight coefficient, a second product is obtained from the information entropy difference and the second weight coefficient, and the total loss function is the sum of the first and second products.
  • the calculation formula of the total loss function Loss is as follows:
  • Loss = λ1 · Loss_pre + λ2 · Loss_Info
  • where Loss_pre represents the preset loss function originally designed in the image super-resolution model and Loss_Info represents the information entropy difference;
  • λ1 is the first weight coefficient;
  • λ2 is the second weight coefficient.
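The weighted sum above can be sketched directly. An L1 pixel loss stands in for the preset loss Loss_pre, and the λ values are illustrative, since the application specifies neither:

```python
import numpy as np

def total_loss(sr, hr, loss_info, lam1=1.0, lam2=0.1):
    """Loss = lam1 * Loss_pre + lam2 * Loss_Info, with an L1 pixel loss
    standing in for the preset loss; lam1/lam2 values are illustrative."""
    loss_pre = np.abs(sr.astype(float) - hr.astype(float)).mean()
    return lam1 * loss_pre + lam2 * loss_info

sr = np.full((4, 4), 10.0)
hr = np.full((4, 4), 12.0)
loss = total_loss(sr, hr, loss_info=0.5)   # 1.0 * 2.0 + 0.1 * 0.5 = 2.05
```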
  • The first weight coefficient λ1 corresponding to the preset loss function and the second weight coefficient λ2 corresponding to the information entropy difference may be determined based on multi-task learning.
  • The multi-task optimization objective is as follows:
  • where c is the task parameter;
  • θ_sh is the shared task parameter;
  • θ_Info and θ_pre are the parameters exclusive to each respective task.
  • the total loss function constructed by using the optimized loss coefficient can train the image super-resolution model to achieve better results, so that the super-resolution image can be obtained by inputting the low-resolution image into the image super-resolution model.
  • The task parameters can be determined by finding a Pareto stationary point of the multi-task learning problem, and the first and second weight coefficients can then be determined from the task parameters. Finding the minimum of the above multi-task optimization objective can therefore be realized by differentiation, i.e., by finding a Pareto stationary point, which must satisfy the following condition:
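For the two-task case, the min-norm combination of task gradients used in multiple-gradient-descent-style Pareto methods (e.g., Sener and Koltun's MGDA) has a known closed form: minimizing ‖α·g1 + (1−α)·g2‖² over α ∈ [0, 1] gives α = ((g2 − g1)·g2) / ‖g1 − g2‖², clipped to [0, 1]. Whether the application uses exactly this solver is not stated; the sketch below assumes the two task gradients are available as flat vectors:

```python
import numpy as np

def two_task_min_norm(g1, g2):
    """Closed-form solution of min_a ||a*g1 + (1-a)*g2||^2 over a in [0, 1],
    used to weight two task gradients toward a Pareto stationary point."""
    d = g1 - g2
    denom = float(d @ d)
    if denom == 0.0:                 # identical gradients: any split works
        return 0.5
    a = float((g2 - g1) @ g2) / denom
    return min(1.0, max(0.0, a))

alpha = two_task_min_norm(np.array([1.0, 0.0]), np.array([0.0, 1.0]))  # 0.5
```

The resulting α and 1−α play the role of the two loss weights: symmetric, conflicting gradients get an even split, while a dominant gradient pushes its weight toward the boundary of [0, 1].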
  • The first weight coefficient and the second weight coefficient may also be dynamically updated using multi-task gradients.
  • The coefficients of the total loss function can also be determined using other multi-task coefficient-determination methods.
  • The gradient can be used to dynamically update the weight coefficients, addressing the problem of gradient imbalance in multi-task learning, from which the weight coefficients can be obtained.
  • Just as when updating the parameters of a neural network, a gradient-update formula can be used to dynamically update the first weight coefficient W_pre and the second weight coefficient W_Info, where the update follows the global neural network learning rate and t denotes the t-th training iteration; w is then updated using the derivative obtained in the previous step.
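The dynamic update described above treats the two loss weights like ordinary network parameters: at each training step t, W_pre and W_Info move against their own gradients at the global learning rate. The source does not give the exact gradient of the weights (it comes from the multi-task balancing scheme), so the sketch below simply takes the gradients as inputs:

```python
def update_weights(w_pre, w_info, grad_pre, grad_info, lr=0.01):
    """One gradient-descent step on the two loss weights, mirroring the
    usual parameter update w_{t+1} = w_t - lr * grad at the global
    learning rate; the weight gradients are assumed to be supplied by
    the multi-task balancing scheme."""
    return w_pre - lr * grad_pre, w_info - lr * grad_info

w_pre, w_info = 1.0, 1.0
for _ in range(10):                  # ten illustrative iterations with fixed gradients
    w_pre, w_info = update_weights(w_pre, w_info, 0.5, -0.5)
# after 10 steps: w_pre = 1 - 10*0.01*0.5 = 0.95, w_info = 1.05
```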
  • Step 101 may include, but is not limited to, the following sub-steps:
  • The images in the training set are preprocessed; the preprocessing includes reducing, rotating, and flipping the images.
  • The training set is expanded by preprocessing its images.
  • The preprocessing includes, but is not limited to, shrinking, rotating, and flipping the images in the training set.
  • Training set preparation: the training set can be downloaded from public datasets on the Internet such as DIV2K, URBAN100, BSD100, and SET14, or produced using shooting tools such as cameras, mobile phones, or computers.
  • Preprocessing of images in the training set: the images are reduced by different factors to expand the training set, and are rotated counterclockwise and flipped.
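The preprocessing above (downscaling by different factors, counterclockwise rotation, and flipping) can be sketched with NumPy. The nearest-neighbour downscale is an assumption, since the application does not specify the resampling method:

```python
import numpy as np

def downscale(img, factor):
    """Nearest-neighbour reduction by an integer factor (assumed method)."""
    return img[::factor, ::factor]

def augment(img):
    """Expand one training image into shrunken, rotated, and flipped copies."""
    return [
        downscale(img, 2),            # reduce by a factor of 2
        np.rot90(img, 1),             # 90 degrees counterclockwise
        np.fliplr(img),               # horizontal flip
        np.flipud(img),               # vertical flip
    ]

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
small, rot, lr_flip, ud_flip = augment(img)
```

Each augmented copy is paired with the correspondingly transformed high-resolution image, so one labeled pair yields several training samples.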
  • The parameters to determine include the network optimizer, the learning rate, the loss function, etc.; the loss function is set according to formula (1).
  • H and W are the height and width of the high-resolution image;
  • H' and W' are the height and width of the low-resolution image.
  • Steps 1 and 2 are the same as in training and network initialization.
  • The embodiment of the present application also provides an image super-resolution model training device.
  • The image super-resolution model training device includes one or more processors and memories; one processor and one memory are taken as an example in Fig. 5.
  • The processor and the memory may be connected through a bus or in other ways; connection through a bus is taken as an example in Fig. 5.
  • The memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as those implementing the image super-resolution model training method in the above embodiments of the present application.
  • The processor implements the image super-resolution model training method in the above embodiments by running the non-transitory software programs and instructions stored in the memory.
  • The memory may include a program storage area and a data storage area; the program storage area may store the operating system and the application required by at least one function, and the data storage area may store the data required to execute the image super-resolution model training method in the above embodiments.
  • The memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device.
  • The memory may include memory located remotely relative to the processor, and these remote memories may be connected to the image super-resolution model training device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The non-transitory software programs and instructions required to implement the image super-resolution model training method in the above embodiments are stored in the memory; when executed by one or more processors, they perform the image super-resolution model training method of the above embodiments, for example, method steps 101 to 107 in Fig. 1, method steps 1051 to 1053 in Fig. 3, and method steps 1061 to 1062 in Fig. 4.
  • The image super-resolution model is trained to obtain the trained image super-resolution model.
  • The low-resolution images in the training set and their corresponding real high-resolution images are input into the preset encoding-decoding model to determine model parameters associated with the training-set data, and these parameters are used to initialize the image super-resolution model, providing a good starting point for training.
  • The information entropy loss between the real high-resolution image and the super-resolution image is introduced, and the total loss function is constructed by superimposing the information entropy loss onto the preset loss function combined with loss coefficients.
  • This adds the ability to adjust the loss coefficients during training to reach an optimal combination, which can improve the training effect of the image super-resolution model.
  • the embodiment of the present application also provides an electronic device.
  • The electronic device includes one or more processors and memories; one processor and one memory are taken as an example in Fig. 6.
  • The processor and the memory may be connected through a bus or in other ways; connection through a bus is taken as an example in Fig. 6.
  • The memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as those implementing the image super-resolution model training method in the above embodiments of the present application.
  • The processor implements the image super-resolution model training method in the above embodiments by running the non-transitory software programs and instructions stored in the memory.
  • The memory may include a program storage area and a data storage area; the program storage area may store the operating system and the application required by at least one function, and the data storage area may store the data required to execute the image super-resolution model training method in the above embodiments.
  • The memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device.
  • The memory may include memory located remotely relative to the processor, and these remote memories may be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The non-transitory software programs and instructions required to implement the image super-resolution model training method in the above embodiments are stored in the memory; when executed by one or more processors, they perform the image super-resolution model training method of the above embodiments, for example, method steps 101 to 107 in Fig. 1, method steps 1051 to 1053 in Fig. 3, and method steps 1061 to 1062 in Fig. 4.
  • The image super-resolution model is trained to obtain the trained image super-resolution model.
  • The low-resolution images in the training set and their corresponding real high-resolution images are input into the preset encoding-decoding model to determine model parameters associated with the training-set data, and these parameters are used to initialize the image super-resolution model, providing a good starting point for training.
  • The information entropy loss between the real high-resolution image and the super-resolution image is introduced, and the total loss function is constructed by superimposing the information entropy loss onto the preset loss function combined with loss coefficients.
  • This adds the ability to adjust the loss coefficients during training to reach an optimal combination, which can improve the training effect of the image super-resolution model.
  • The embodiment of the present application also provides a computer-readable storage medium storing a computer-executable program. When the computer-executable program is executed by one or more control processors, for example by one of the processors in Fig. 5, it can cause the one or more processors to execute the image super-resolution model training method of the above embodiments, for example, method steps 101 to 107 in Fig. 1, method steps 1051 to 1053 in Fig. 3, and method steps 1061 to 1062 in Fig. 4.
  • A training set is formed by obtaining low-resolution images and their corresponding real high-resolution images. The low-resolution images in the training set and their corresponding real high-resolution images are input into the preset encoding-decoding model to determine model parameters, the image super-resolution model is initialized according to the model parameters, and the low-resolution images in the training set are input into the image super-resolution model to obtain super-resolution images. The information entropy loss between the real high-resolution image and the super-resolution image is determined, the total loss function is constructed from the information entropy loss and the preset loss function combined with loss coefficients (each loss coefficient characterizing the weight of its term in the total loss function), and, based on the total loss function, the image super-resolution model is trained to obtain the trained model.
  • The low-resolution images in the training set and their corresponding real high-resolution images are input into the preset encoding-decoding model to determine model parameters associated with the training-set data, and these parameters are used to initialize the image super-resolution model, providing a good starting point for training.
  • The information entropy loss between the real high-resolution image and the super-resolution image is introduced, and the total loss function is constructed by superimposing the information entropy loss onto the preset loss function combined with loss coefficients.
  • This adds the ability to adjust the loss coefficients during training to reach an optimal combination, which can improve the training effect of the image super-resolution model.
  • The embodiment of the present application includes: obtaining low-resolution images and their corresponding real high-resolution images to form a training set; inputting the low-resolution images in the training set and their corresponding real high-resolution images into a preset encoding-decoding model to determine model parameters; initializing the image super-resolution model according to the model parameters; inputting the low-resolution images in the training set into the image super-resolution model to obtain super-resolution images; determining the information entropy loss between the real high-resolution image and the super-resolution image; constructing the total loss function from the information entropy loss and the preset loss function combined with loss coefficients, each loss coefficient characterizing the weight of its term in the total loss function; and, based on the total loss function, training the image super-resolution model to obtain the trained image super-resolution model.
  • The low-resolution images in the training set and their corresponding real high-resolution images are input into the preset encoding-decoding model to determine model parameters associated with the training-set data, and these parameters are used to initialize the image super-resolution model, providing a good starting point for training.
  • The information entropy loss between the real high-resolution image and the super-resolution image is introduced, and the total loss function is constructed by superimposing the information entropy loss onto the preset loss function combined with loss coefficients.
  • This adds the ability to adjust the loss coefficients during training to reach an optimal combination, which can improve the training effect of the image super-resolution model.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • Communication media typically embody computer-readable programs, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.


Abstract

An image super-resolution model training method and apparatus, and a computer-readable storage medium. The method includes: obtaining low-resolution images and their corresponding real high-resolution images to form a training set (101); inputting the low-resolution images in the training set and their corresponding real high-resolution images into a preset encoding-decoding model to determine model parameters (102); initializing an image super-resolution model according to the model parameters (103); inputting the low-resolution images in the training set into the image super-resolution model to obtain super-resolution images (104); determining the information entropy loss between the real high-resolution image and the super-resolution image (105); constructing a total loss function from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients characterizing the weights of the terms in the total loss function (106); and, based on the total loss function, training the image super-resolution model to obtain the trained image super-resolution model (107).

Description

Image super-resolution model training method and device, and computer-readable storage medium
Cross-reference to related applications
This application is based on and claims priority to Chinese patent application No. 202111027822.5, filed on September 2, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of this application relate to, but are not limited to, the technical field of computer vision, and in particular to an image super-resolution model training method and device, and a computer-readable storage medium.
Background
There is still considerable room for optimization in image super-resolution, for example in the initialization method. Most existing algorithms initialize network parameters with uniform distributions, orthogonal initialization, and the like. However, these initialization methods are mostly designed around properties of the activation function or special network structures, such as the convergence problem of recurrent neural networks, and have no connection to the training data. They therefore cannot provide the image super-resolution model with good model parameters as a starting point, which results in poor training performance. At the same time, the design of the loss function also directly determines the performance of the image super-resolution model. However, existing loss functions are linear superpositions of multiple loss functions, and the linear coefficients are generally determined by brute-force trial of values, which cannot reach an optimal combination, so the resulting image super-resolution model performs poorly.
Summary
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
The embodiments of this application provide an image super-resolution model training method and device, and a computer-readable storage medium.
In a first aspect, an embodiment of this application provides an image super-resolution model training method, comprising: acquiring low-resolution images and their corresponding real high-resolution images to form a training set; inputting the low-resolution images in the training set and their corresponding real high-resolution images into a preset encoding-decoding model to determine model parameters; initializing an image super-resolution model according to the model parameters; inputting the low-resolution images in the training set into the image super-resolution model to obtain super-resolution images; determining an information entropy loss between the real high-resolution images and the super-resolution images; constructing a total loss function from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients representing the weights of the respective terms in the total loss function; and performing, based on the total loss function, model training on the image super-resolution model to obtain a trained image super-resolution model.
In a second aspect, an embodiment of this application provides an image super-resolution model training device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the image super-resolution model training method of the first aspect.
In a third aspect, an embodiment of this application provides an electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the image super-resolution model training method of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer-executable program, the computer-executable program being configured to cause a computer to execute the image super-resolution model training method of the first aspect.
Other features and advantages of this application will be set forth in the following description, and will in part become apparent from the description or be understood by practicing this application. The objectives and other advantages of this application can be realized and obtained through the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Brief description of the drawings
The accompanying drawings are provided for a further understanding of the technical solution of this application and constitute a part of the description. Together with the embodiments of this application, they serve to explain the technical solution of this application and do not limit it.
Fig. 1 is a main flowchart of an image super-resolution model training method provided by an embodiment of this application;
Fig. 2 is a schematic diagram of initializing an image super-resolution model with an encoding-decoding model, provided by an embodiment of this application;
Fig. 3 is a sub-flowchart of an image super-resolution model training method provided by an embodiment of this application;
Fig. 4 is a sub-flowchart of an image super-resolution model training method provided by an embodiment of this application;
Fig. 5 is a schematic structural diagram of an image super-resolution model training device provided by an embodiment of this application;
Fig. 6 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Detailed description
In order to make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.
It should be understood that, in the description of the embodiments of this application, "multiple" (or "a plurality of") means two or more; "greater than", "less than", "exceeding", and the like are understood to exclude the stated number; and "above", "below", "within", and the like are understood to include the stated number. Where "first", "second", and so on are used, they serve only to distinguish technical features and cannot be understood as indicating or implying relative importance, implicitly indicating the number of the indicated technical features, or implicitly indicating the order of the indicated technical features.
There is still considerable room for optimization in image super-resolution, for example in the initialization method. Most existing algorithms initialize network parameters with uniform distributions, orthogonal initialization, and the like. However, these initialization methods are mostly designed around properties of the activation function or special network structures, such as the convergence problem of recurrent neural networks, and have no connection to the training data. They therefore cannot provide the image super-resolution model with good model parameters as a starting point, which results in poor training performance. At the same time, the design of the loss function also directly determines the performance of the image super-resolution model. However, existing loss functions are linear superpositions of multiple loss functions, and the linear coefficients are generally determined by brute-force trial of values, which cannot reach an optimal combination, so the resulting image super-resolution model performs poorly.
The embodiments of this application provide an image super-resolution model training method and device, and a computer-readable storage medium. Low-resolution images and their corresponding real high-resolution images are acquired to form a training set; the low-resolution images in the training set and their corresponding real high-resolution images are input into a preset encoding-decoding model to determine model parameters; an image super-resolution model is initialized according to the model parameters; the low-resolution images in the training set are input into the image super-resolution model to obtain super-resolution images; an information entropy loss between the real high-resolution images and the super-resolution images is determined; a total loss function is constructed from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients representing the weights of the respective terms in the total loss function; and, based on the total loss function, model training is performed on the image super-resolution model to obtain a trained image super-resolution model. On this basis, inputting the low-resolution images in the training set and their corresponding real high-resolution images into the preset encoding-decoding model determines model parameters that are correlated with the training data, and initializing the image super-resolution model with these parameters provides it with a good starting point. At the same time, the information entropy loss between the real high-resolution images and the super-resolution images is introduced, and the total loss function is constructed by superimposing the information entropy loss on the preset loss function together with loss coefficients, which increases the ability to adjust the loss coefficients during training so that an optimal combination can be reached, thereby improving the training effect of the image super-resolution model.
As shown in Fig. 1, Fig. 1 is a flowchart of an image super-resolution model training method provided by an embodiment of this application. The image super-resolution model training method includes, but is not limited to, the following steps:
Step 101: acquire low-resolution images and their corresponding real high-resolution images to form a training set;
Step 102: input the low-resolution images in the training set and their corresponding real high-resolution images into a preset encoding-decoding model to determine model parameters;
Step 103: initialize an image super-resolution model according to the model parameters;
Step 104: input the low-resolution images in the training set into the image super-resolution model to obtain super-resolution images;
Step 105: determine an information entropy loss between the real high-resolution images and the super-resolution images;
Step 106: construct a total loss function from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients representing the weights of the respective terms in the total loss function;
Step 107: based on the total loss function, perform model training on the image super-resolution model to obtain a trained image super-resolution model.
It can be understood that, in the encoding-decoding model shown in Fig. 2, the low-resolution image LR and its corresponding real high-resolution image (Ground Truth HR) are fed into the encoder of the encoding-decoding model, and the encoder produces an intermediate low-resolution image LR_I. To prevent the distributions of LR_I and LR from becoming inconsistent, the values of LR_I need to be constrained, for example with a simple difference or an L1-norm constraint. A decoder then produces a super-resolution image SR_I. Once the encoding-decoding model has been trained and stable model parameters obtained, the model parameters of the decoder are used to initialize the current network (i.e. the image super-resolution model), providing a good starting point for training the image super-resolution model, so that feeding the low-resolution image LR into the image super-resolution model yields the super-resolution image SR.
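The decoder-to-model parameter hand-off described above can be sketched as follows. This is a minimal illustration under our own assumptions: parameters are held as name → array dictionaries (real frameworks expose the same idea through state_dict-style APIs), and the function name init_from_decoder is ours, not the patent's.

```python
import numpy as np

def init_from_decoder(sr_params, decoder_params):
    """Initialize the super-resolution model from trained decoder weights.

    Copies every decoder parameter whose name and shape match into the
    super-resolution model's parameter dict; parameters with no match keep
    their original (e.g. randomly initialized) values.
    """
    out = dict(sr_params)
    for name, value in decoder_params.items():
        if name in out and out[name].shape == value.shape:
            out[name] = value.copy()
    return out
```

In a framework such as PyTorch the same effect is usually achieved by loading the decoder's state dict into the model with strict matching disabled.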
It can be understood that low-resolution images and their corresponding real high-resolution images are acquired to form a training set; the low-resolution images in the training set and their corresponding real high-resolution images are input into a preset encoding-decoding model to determine model parameters; an image super-resolution model is initialized according to the model parameters; the low-resolution images in the training set are input into the image super-resolution model to obtain super-resolution images; an information entropy loss between the real high-resolution images and the super-resolution images is determined; a total loss function is constructed from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients representing the weights of the respective terms in the total loss function; and, based on the total loss function, model training is performed on the image super-resolution model to obtain a trained image super-resolution model. On this basis, inputting the low-resolution images in the training set and their corresponding real high-resolution images into the preset encoding-decoding model determines model parameters that are correlated with the training data, and initializing the image super-resolution model with these parameters provides it with a good starting point. At the same time, the information entropy loss between the real high-resolution images and the super-resolution images is introduced, and the total loss function is constructed by superimposing the information entropy loss on the preset loss function together with loss coefficients, which increases the ability to adjust the loss coefficients during training so that an optimal combination can be reached, thereby improving the training effect of the image super-resolution model.
It should be noted that information entropy represents how much information an image contains. Introducing an information entropy loss, for example using the entropy difference as a loss function that describes the difference in information content between the output image and the real ground-truth image, has a positive effect on the training of the image super-resolution model.
It should be noted that super-resolution model training can be performed on a GPU server or a workstation, and multiple graphics cards can be used in parallel during training. The operating system of the GPU server can be Ubuntu or Windows. Model inference can be performed on the GPU server or on an AI chip.
As shown in Fig. 3, step 105 may include, but is not limited to, the following sub-steps:
Step 1051: compute a first information entropy from the real high-resolution image;
Step 1052: compute a second information entropy from the super-resolution image;
Step 1053: take the absolute value of the difference between the first information entropy and the second information entropy to obtain an information entropy difference.
It can be understood that a first information entropy is computed from the real high-resolution image and a second information entropy is computed from the super-resolution image, and taking the absolute value of their difference gives the information entropy difference Loss_Info = |HR − SR|, where HR and SR here denote the two entropy values.
It can be understood that the low-resolution images LR in the training set are input into the image super-resolution model to obtain super-resolution images SR. The histograms of SR and HR are computed to obtain the probability P(x_i) of each pixel value in 0-255, where x_i ranges over 0...255; the information entropies of the SR and HR images are computed; and subtracting one from the other and taking the absolute value gives the information entropy difference H(x). The information entropy of each image is computed as follows:
H(x) = −∑_{i=0}^{255} P(x_i) log P(x_i)
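The histogram-and-entropy computation described above can be written out concretely. The sketch below assumes 8-bit images and base-2 logarithms (the patent does not fix the log base), and the function names are our own:

```python
import numpy as np

def image_entropy(img):
    # Histogram over the 256 possible pixel values 0-255 -> probabilities P(x_i)
    hist = np.bincount(np.asarray(img, dtype=np.uint8).ravel(), minlength=256)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def entropy_loss(hr, sr):
    # Loss_Info: absolute difference of the two images' entropies
    return abs(image_entropy(hr) - image_entropy(sr))
```

A constant image has entropy 0, and an image split evenly between two pixel values has entropy 1 bit, so the loss measures how far the super-resolved output's information content drifts from the ground truth's.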
As shown in Fig. 4, step 106 may include, but is not limited to, the following sub-steps:
Step 1061: determine, based on multi-task learning, a first weight coefficient corresponding to the preset loss function and a second weight coefficient corresponding to the information entropy difference;
Step 1062: construct the total loss function from the preset loss function, the first weight coefficient, the information entropy difference, and the second weight coefficient.
It can be understood that the total loss function Loss is constructed from the preset loss function, the first weight coefficient, the information entropy difference, and the second weight coefficient. Specifically, a first product can be obtained by multiplying the preset loss function by the first weight coefficient; a second product can be obtained by multiplying the information entropy difference by the second weight coefficient; and the total loss function is the sum of the first product and the second product. On this basis, the total loss function Loss is computed as follows:
Loss = λ1·Loss_pre + λ2·Loss_Info
where Loss_pre is the preset loss function previously designed for the image super-resolution model, the latter term is the information entropy difference Loss_Info = |HR − SR|, λ1 is the first weight coefficient, and λ2 is the second weight coefficient.
It can be understood that the first weight coefficient λ1 corresponding to the preset loss function and the second weight coefficient λ2 corresponding to the information entropy difference can be determined based on multi-task learning. Specifically, the multi-task optimization problem is:
min_{θ_sh, θ_pre, θ_Info} ∑_{t∈{pre, Info}} c_t·L̂_t(θ_sh, θ_t)
where c_t is the task parameter and L̂_t is the empirical loss; θ_sh is the shared task parameter, and θ_Info and θ_pre are the parameters exclusive to each task. By minimizing the above expression, the values of θ_sh, θ_Info, and θ_pre can be determined. Since θ_sh + θ_pre = λ1 and θ_sh + θ_Info = λ2, the first weight coefficient λ1 and the second weight coefficient λ2 can then be determined. It follows that, compared with existing techniques in which the loss coefficients are determined by brute-force trial of values, this method determines the loss coefficients through multi-task learning, giving them a principled basis and yielding optimal values. A total loss function constructed with the optimized loss coefficients can train the image super-resolution model to better effect, so that feeding a low-resolution image into the image super-resolution model yields a super-resolution image.
It can be understood that the task parameters can be determined by finding a Pareto stationary point of the multi-task learning formulation, and the first weight coefficient and the second weight coefficient are then determined from the task parameters. The minimization of the multi-task optimization problem above can therefore be carried out by differentiating it, that is, by finding a Pareto stationary point. A Pareto stationary point must satisfy the following condition:
∑_{t∈{pre, Info}} α_t·∇_{θ_sh} L̂_t(θ_sh, θ_t) = 0, with α_pre, α_Info ≥ 0 and α_pre + α_Info = 1
The Frank-Wolfe algorithm can be used here to find the Pareto stationary point.
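For the two-task case here (Loss_pre and Loss_Info), the Frank-Wolfe search for the Pareto stationary point reduces to a closed-form line search over a single coefficient, as in the common multi-task gradient-descent formulation. The sketch below is our illustration rather than the patent's algorithm; g1 and g2 stand for the two tasks' gradients with respect to the shared parameters, flattened to vectors:

```python
import numpy as np

def two_task_weights(g1, g2):
    """Closed-form minimizer over a in [0, 1] of ||a*g1 + (1-a)*g2||^2.

    The returned pair (a, 1-a) gives the Pareto-stationary task weights
    for the two-task case.
    """
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:           # identical gradients: any split is stationary
        return 0.5, 0.5
    a = float((g2 - g1) @ g2) / denom  # unconstrained minimizer
    a = min(1.0, max(0.0, a))          # clip to the simplex [0, 1]
    return a, 1.0 - a
```

With orthogonal unit gradients the result is the even split (0.5, 0.5); with opposing gradients the weights are chosen so that the combined gradient vanishes, which is exactly the Pareto stationarity condition above.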
It can be understood that the first weight coefficient and the second weight coefficient can also be updated dynamically using multi-task gradients.
The coefficients of the total loss function can also be determined with other multi-task coefficient-determination methods. For example, the parameters can be updated dynamically using the gradient (grad): since multi-task learning must balance the individual losses, if the update gradient of a weight coefficient w is known, w can be updated dynamically with a gradient update formula, just like the parameters of a neural network. The first weight coefficient W_pre and the second weight coefficient W_Info are updated with the following formula:
w(t+1) = w(t) + λβ(t)
where λ follows the global neural-network learning rate and t denotes the t-th training iteration.
Specifically, the update steps within one batch are as follows:
1. Initialize the weights w;
2. Forward-propagate and compute the total loss: Loss = w_pre·l_pre + w_Info·l_Info;
3. Compute the gradient G_w(t) of each weight for this iteration, the average Ḡ_w(t) of these gradients, and the back-propagation speed r_i(t);
4. Compute the derivative of the gradient loss (grad Loss) with respect to w;
5. Back-propagate the Loss computed in step 2 to update the neural-network parameters;
6. Update w with the derivative computed in step 4;
7. Renormalize w.
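A toy version of the per-batch weight update above can be sketched for scalar per-task losses. The choice of β(t) as the negative gradient of the weighted loss with respect to w, and renormalizing so that the weights keep a constant sum, are our assumptions where the patent leaves the details open:

```python
def update_weights(w, losses, lr):
    """One dynamic step for the loss weights: w(t+1) = w(t) + lr * beta(t).

    Since Loss = sum_i w_i * l_i, the derivative d(Loss)/d(w_i) is simply
    l_i; descending that gradient shrinks the weight of the currently
    larger loss, and the final renormalization keeps sum(w) equal to the
    number of tasks.
    """
    grad = list(losses)                            # dLoss/dw_i = l_i
    w = [wi - lr * gi for wi, gi in zip(w, grad)]  # gradient step
    w = [max(wi, 0.0) for wi in w]                 # keep weights non-negative
    s = sum(w) or 1.0
    n = len(w)
    return [n * wi / s for wi in w]                # renormalize: sum(w) == n
```

Starting from equal weights, the task whose loss is currently larger ends up with the smaller weight after one step, while the total weight budget is preserved.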
Before step 101, the method may include, but is not limited to, the following sub-step:
preprocessing the images of the training set, the preprocessing including shrinking, rotating, and flipping the images of the training set.
It can be understood that the training set is expanded by preprocessing its images, where the preprocessing includes, but is not limited to, shrinking, rotating, and flipping the images of the training set.
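The shrink/rotate/flip expansion can be sketched as follows. Strided subsampling stands in for proper downscaling (the patent does not fix the resampling filter), and only one rotation and one flip are shown for brevity; the scale factors and function name are our choices:

```python
import numpy as np

def augment(img, scales=(2, 3, 4)):
    """Expand one training image into shrunk, rotated, and flipped variants."""
    variants = [img]
    for s in scales:
        variants.append(img[::s, ::s])    # s-fold shrink by subsampling
    variants.append(np.rot90(img))        # 90-degree counter-clockwise turn
    variants.append(np.flip(img, axis=1)) # horizontal flip
    return variants
```

In practice each rotation angle and flip axis would be enumerated, and an interpolating resampler would replace the strided slice.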
The image super-resolution model training method provided by this application is further described below with reference to a specific embodiment.
1. Training set preparation: the training set can be obtained by downloading publicly available datasets such as DIV2K, URBAN100, BSD100, and SET14, or images can be produced with capture devices such as cameras and mobile phones, or with a computer.
2. Preprocessing of the training set images: the images in the training set are shrunk by different factors to expand the training set, and the images in the training set are additionally rotated counter-clockwise, flipped, and so on.
3. Determination of the relevant training parameters: the parameters to be determined include the network optimizer, the learning rate, the loss function, and so on. The loss function is set according to formula (1):
Figure PCTCN2022078890-appb-000006
where H and W are the height and width of the high-resolution image, H' and W' are the height and width of the low-resolution image, and
Figure PCTCN2022078890-appb-000007
Figure PCTCN2022078890-appb-000008
are weighting coefficients.
Next, the current network is trained:
Steps 1 and 2 are the same as for training the initialization network.
3. The parameters of the decoder in the encoding-decoding model are used to initialize the image super-resolution model. The low-resolution images LR in the training set are input into the image super-resolution model to obtain super-resolution images SR. The histograms of SR and HR are computed to obtain the probability P(x_i) of each pixel value in 0-255, where x_i ranges over 0...255; the information entropies of the SR and HR images are computed; and subtracting one from the other and taking the absolute value gives the information entropy difference H(x), computed as in formula (2). The current network is then trained with the loss function of formula (3) to obtain the trained network.
H(x) = −∑_{i=0}^{255} P(x_i) log P(x_i)  (2)
Loss = λ1·Loss_pre + λ2·Loss_Info  (3)
where Loss_pre is the preset loss function previously designed for the image super-resolution model, the latter term is the information entropy difference Loss_Info = |HR − SR|, λ1 is the first weight coefficient, and λ2 is the second weight coefficient.
The first weight coefficient λ1 corresponding to the preset loss function and the second weight coefficient λ2 corresponding to the information entropy difference can be determined based on multi-task learning. Specifically, the multi-task optimization problem is:
min_{θ_sh, θ_pre, θ_Info} ∑_{t∈{pre, Info}} c_t·L̂_t(θ_sh, θ_t)
where c_t is the task parameter and L̂_t is the empirical loss; θ_sh is the shared task parameter, and θ_Info and θ_pre are the parameters exclusive to each task. By minimizing the above expression, the values of θ_sh, θ_Info, and θ_pre can be determined. Since θ_sh + θ_pre = λ1 and θ_sh + θ_Info = λ2, the first weight coefficient λ1 and the second weight coefficient λ2 can then be determined. It follows that, compared with existing techniques in which the loss coefficients are determined by brute-force trial of values, this method determines the loss coefficients through multi-task learning, giving them a principled basis and yielding optimal values. A total loss function constructed with the optimized loss coefficients can train the image super-resolution model to better effect, so that feeding a low-resolution image into the image super-resolution model yields a super-resolution image.
4. The task parameters are determined by finding a Pareto stationary point of the multi-task learning formulation, and the first weight coefficient and the second weight coefficient are determined from the task parameters. The minimization of the multi-task optimization problem above can therefore be carried out by differentiating it, that is, by finding a Pareto stationary point. A Pareto stationary point must satisfy the following condition:
∑_{t∈{pre, Info}} α_t·∇_{θ_sh} L̂_t(θ_sh, θ_t) = 0, with α_pre, α_Info ≥ 0 and α_pre + α_Info = 1
The Frank-Wolfe algorithm can be used here to find the Pareto stationary point.
It should be noted that the first weight coefficient and the second weight coefficient can also be updated dynamically using multi-task gradients.
The coefficients of the total loss function can also be determined with other multi-task coefficient-determination methods. For example, the parameters can be updated dynamically using the gradient (grad): since multi-task learning must balance the individual losses, if the update gradient of a weight coefficient w is known, w can be updated dynamically with a gradient update formula, just like the parameters of a neural network. The first weight coefficient W_pre and the second weight coefficient W_Info are updated with the following formula:
w(t+1) = w(t) + λβ(t)
where λ follows the global neural-network learning rate and t denotes the t-th training iteration.
Specifically, the update steps within one batch are as follows:
1. Initialize the weights w;
2. Forward-propagate and compute the total loss: Loss = w_pre·l_pre + w_Info·l_Info;
3. Compute the gradient G_w(t) of each weight for this iteration, the average Ḡ_w(t) of these gradients, and the back-propagation speed r_i(t);
4. Compute the derivative of the gradient loss (grad Loss) with respect to w;
5. Back-propagate the Loss computed in step 2 to update the neural-network parameters;
6. Update w with the derivative computed in step 4;
7. Renormalize w.
As shown in Fig. 5, an embodiment of this application further provides an image super-resolution model training device.
Specifically, the image super-resolution model training device includes one or more processors and a memory; in Fig. 5, one processor and one memory are taken as an example. The processor and the memory can be connected by a bus or in other ways; in Fig. 5, a bus connection is taken as an example.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as the image super-resolution model training method in the above embodiments of this application. The processor implements the image super-resolution model training method in the above embodiments of this application by running the non-transitory software programs and programs stored in the memory.
The memory may include a program storage area and a data storage area, where the program storage area can store the operating system and the application program required by at least one function, and the data storage area can store the data required to execute the image super-resolution model training method in the above embodiments of this application. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory may include memory located remotely from the processor, and these remote memories may be connected to the image super-resolution model training device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and programs required to implement the image super-resolution model training method in the above embodiments of this application are stored in the memory and, when executed by one or more processors, perform the image super-resolution model training method in the above embodiments of this application, for example, the method steps 101 to 107 in Fig. 1, the method steps 1051 to 1053 in Fig. 3, and the method steps 1061 to 1062 in Fig. 4 described above: low-resolution images and their corresponding real high-resolution images are acquired to form a training set; the low-resolution images in the training set and their corresponding real high-resolution images are input into a preset encoding-decoding model to determine model parameters; an image super-resolution model is initialized according to the model parameters; the low-resolution images in the training set are input into the image super-resolution model to obtain super-resolution images; an information entropy loss between the real high-resolution images and the super-resolution images is determined; a total loss function is constructed from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients representing the weights of the respective terms in the total loss function; and, based on the total loss function, model training is performed on the image super-resolution model to obtain a trained image super-resolution model. On this basis, inputting the low-resolution images in the training set and their corresponding real high-resolution images into the preset encoding-decoding model determines model parameters that are correlated with the training data, and initializing the image super-resolution model with these parameters provides it with a good starting point. At the same time, the information entropy loss between the real high-resolution images and the super-resolution images is introduced, and the total loss function is constructed by superimposing the information entropy loss on the preset loss function together with loss coefficients, which increases the ability to adjust the loss coefficients during training so that an optimal combination can be reached, thereby improving the training effect of the image super-resolution model.
As shown in Fig. 6, an embodiment of this application further provides an electronic device.
Specifically, the electronic device includes one or more processors and a memory; in Fig. 6, one processor and one memory are taken as an example. The processor and the memory can be connected by a bus or in other ways; in Fig. 6, a bus connection is taken as an example.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as the image super-resolution model training method in the above embodiments of this application. The processor implements the image super-resolution model training method in the above embodiments of this application by running the non-transitory software programs and programs stored in the memory.
The memory may include a program storage area and a data storage area, where the program storage area can store the operating system and the application program required by at least one function, and the data storage area can store the data required to execute the image super-resolution model training method in the above embodiments of this application. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory may include memory located remotely from the processor, and these remote memories may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and programs required to implement the image super-resolution model training method in the above embodiments of this application are stored in the memory and, when executed by one or more processors, perform the image super-resolution model training method in the above embodiments of this application, for example, the method steps 101 to 107 in Fig. 1, the method steps 1051 to 1053 in Fig. 3, and the method steps 1061 to 1062 in Fig. 4 described above: low-resolution images and their corresponding real high-resolution images are acquired to form a training set; the low-resolution images in the training set and their corresponding real high-resolution images are input into a preset encoding-decoding model to determine model parameters; an image super-resolution model is initialized according to the model parameters; the low-resolution images in the training set are input into the image super-resolution model to obtain super-resolution images; an information entropy loss between the real high-resolution images and the super-resolution images is determined; a total loss function is constructed from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients representing the weights of the respective terms in the total loss function; and, based on the total loss function, model training is performed on the image super-resolution model to obtain a trained image super-resolution model. On this basis, inputting the low-resolution images in the training set and their corresponding real high-resolution images into the preset encoding-decoding model determines model parameters that are correlated with the training data, and initializing the image super-resolution model with these parameters provides it with a good starting point. At the same time, the information entropy loss between the real high-resolution images and the super-resolution images is introduced, and the total loss function is constructed by superimposing the information entropy loss on the preset loss function together with loss coefficients, which increases the ability to adjust the loss coefficients during training so that an optimal combination can be reached, thereby improving the training effect of the image super-resolution model.
In addition, an embodiment of this application further provides a computer-readable storage medium storing a computer-executable program. When the computer-executable program is executed by one or more control processors, for example by a processor in Fig. 5, it can cause the one or more processors to perform the image super-resolution model training method in the above embodiments of this application, for example, the method steps 101 to 107 in Fig. 1, the method steps 1051 to 1053 in Fig. 3, and the method steps 1061 to 1062 in Fig. 4 described above: low-resolution images and their corresponding real high-resolution images are acquired to form a training set; the low-resolution images in the training set and their corresponding real high-resolution images are input into a preset encoding-decoding model to determine model parameters; an image super-resolution model is initialized according to the model parameters; the low-resolution images in the training set are input into the image super-resolution model to obtain super-resolution images; an information entropy loss between the real high-resolution images and the super-resolution images is determined; a total loss function is constructed from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients representing the weights of the respective terms in the total loss function; and, based on the total loss function, model training is performed on the image super-resolution model to obtain a trained image super-resolution model. On this basis, inputting the low-resolution images in the training set and their corresponding real high-resolution images into the preset encoding-decoding model determines model parameters that are correlated with the training data, and initializing the image super-resolution model with these parameters provides it with a good starting point. At the same time, the information entropy loss between the real high-resolution images and the super-resolution images is introduced, and the total loss function is constructed by superimposing the information entropy loss on the preset loss function together with loss coefficients, which increases the ability to adjust the loss coefficients during training so that an optimal combination can be reached, thereby improving the training effect of the image super-resolution model.
The embodiments of this application include: acquiring low-resolution images and their corresponding real high-resolution images to form a training set; inputting the low-resolution images in the training set and their corresponding real high-resolution images into a preset encoding-decoding model to determine model parameters; initializing an image super-resolution model according to the model parameters; inputting the low-resolution images in the training set into the image super-resolution model to obtain super-resolution images; determining an information entropy loss between the real high-resolution images and the super-resolution images; constructing a total loss function from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients representing the weights of the respective terms in the total loss function; and performing, based on the total loss function, model training on the image super-resolution model to obtain a trained image super-resolution model. On this basis, inputting the low-resolution images in the training set and their corresponding real high-resolution images into the preset encoding-decoding model determines model parameters that are correlated with the training data, and initializing the image super-resolution model with these parameters provides it with a good starting point. At the same time, the information entropy loss between the real high-resolution images and the super-resolution images is introduced, and the total loss function is constructed by superimposing the information entropy loss on the preset loss function together with loss coefficients, which increases the ability to adjust the loss coefficients during training so that an optimal combination can be reached, thereby improving the training effect of the image super-resolution model.
Those of ordinary skill in the art will understand that all or some of the steps and systems in the methods disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof. Some or all of the physical components can be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software can be distributed on computer-readable media, which can include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable programs, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable programs, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Several embodiments of this application have been described above in detail, but this application is not limited to the above embodiments. Those skilled in the art can also make various equivalent modifications or substitutions without departing from the spirit of this application, and these equivalent modifications or substitutions are all included within the scope defined by the claims of this application.

Claims (10)

  1. An image super-resolution model training method, comprising:
    acquiring low-resolution images and their corresponding real high-resolution images to form a training set;
    inputting the low-resolution images in the training set and their corresponding real high-resolution images into a preset encoding-decoding model to determine model parameters;
    initializing an image super-resolution model according to the model parameters;
    inputting the low-resolution images in the training set into the image super-resolution model to obtain super-resolution images;
    determining an information entropy loss between the real high-resolution images and the super-resolution images;
    constructing a total loss function from the information entropy loss and a preset loss function combined with loss coefficients, the loss coefficients representing the weights of the respective terms in the total loss function;
    performing, based on the total loss function, model training on the image super-resolution model to obtain a trained image super-resolution model.
  2. The method according to claim 1, wherein said determining the information entropy loss between the real high-resolution image and the super-resolution image comprises:
    computing a first information entropy from the real high-resolution image;
    computing a second information entropy from the super-resolution image;
    taking the absolute value of the difference between the first information entropy and the second information entropy to obtain an information entropy difference.
  3. The method according to claim 2, wherein the loss coefficients comprise a first weight coefficient and a second weight coefficient, and said constructing the total loss function from the information entropy loss and the preset loss function combined with the loss coefficients comprises:
    determining, based on multi-task learning, the first weight coefficient corresponding to the preset loss function and the second weight coefficient corresponding to the information entropy difference;
    constructing the total loss function from the preset loss function, the first weight coefficient, the information entropy difference, and the second weight coefficient.
  4. The method according to claim 3, wherein said constructing the total loss function from the preset loss function, the first weight coefficient, the information entropy difference, and the second weight coefficient comprises:
    obtaining a first product by multiplying the preset loss function by the first weight coefficient;
    obtaining a second product by multiplying the information entropy difference by the second weight coefficient;
    obtaining the total loss function as the sum of the first product and the second product.
  5. The method according to claim 3, wherein said determining, based on multi-task learning, the first weight coefficient corresponding to the preset loss function and the second weight coefficient corresponding to the information entropy difference comprises:
    determining task parameters by finding a Pareto stationary point of the multi-task learning formulation;
    determining the first weight coefficient and the second weight coefficient according to the task parameters.
  6. The method according to claim 3, wherein said determining, based on multi-task learning, the first weight coefficient corresponding to the preset loss function and the second weight coefficient corresponding to the information entropy difference comprises:
    dynamically updating the first weight coefficient and the second weight coefficient using multi-task gradients.
  7. The method according to claim 6, wherein, after said acquiring the low-resolution images and their corresponding real high-resolution images to form the training set, the method further comprises:
    preprocessing the images of the training set, the preprocessing including shrinking, rotating, and flipping the images of the training set.
  8. An image super-resolution model training device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the image super-resolution model training method according to any one of claims 1 to 7.
  9. An electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the image super-resolution model training method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer-executable program, the computer-executable program being configured to cause a computer to execute the image super-resolution model training method according to any one of claims 1 to 7.
PCT/CN2022/078890 2021-09-02 2022-03-02 Image super-resolution model training method and device, and computer-readable storage medium WO2023029418A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111027822.5A 2021-09-02 2021-09-02 Image super-resolution model training method, device and computer-readable storage medium
CN202111027822.5 2021-09-02

Publications (1)

Publication Number Publication Date
WO2023029418A1 true WO2023029418A1 (zh) 2023-03-09

Family

ID=85332217

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/078890 WO2023029418A1 (zh) Image super-resolution model training method and device, and computer-readable storage medium 2021-09-02 2022-03-02

Country Status (2)

Country Link
CN (1) CN115760563A (zh)
WO (1) WO2023029418A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717857A (zh) * 2019-09-29 2020-01-21 Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences — Super-resolution image reconstruction method and device
CN112215119A (zh) * 2020-10-08 2021-01-12 Huazhong University of Science and Technology — Small-object recognition method, device and medium based on super-resolution reconstruction
CN112488924A (zh) * 2020-12-21 2021-03-12 Shenzhen University — Image super-resolution model training method, reconstruction method and device
CN113066018A (zh) * 2021-02-27 2021-07-02 Huawei Technologies Co., Ltd. — Image enhancement method and related device
WO2021164731A1 (zh) * 2020-02-19 2021-08-26 Huawei Technologies Co., Ltd. — Image enhancement method and image enhancement device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717857A (zh) * 2019-09-29 2020-01-21 Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences — Super-resolution image reconstruction method and device
WO2021164731A1 (zh) * 2020-02-19 2021-08-26 Huawei Technologies Co., Ltd. — Image enhancement method and image enhancement device
CN112215119A (zh) * 2020-10-08 2021-01-12 Huazhong University of Science and Technology — Small-object recognition method, device and medium based on super-resolution reconstruction
CN112488924A (zh) * 2020-12-21 2021-03-12 Shenzhen University — Image super-resolution model training method, reconstruction method and device
CN113066018A (zh) * 2021-02-27 2021-07-02 Huawei Technologies Co., Ltd. — Image enhancement method and related device

Also Published As

Publication number Publication date
CN115760563A (zh) 2023-03-07

Similar Documents

Publication Publication Date Title
US11538229B2 (en) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN109753971B (zh) 扭曲文字行的矫正方法及装置、字符识别方法及装置
CN108229479B (zh) 语义分割模型的训练方法和装置、电子设备、存储介质
WO2020098708A1 (zh) 车道线的检测及驾驶控制方法、装置和电子设备
US10455152B2 (en) Panoramic video processing method and device and non-transitory computer-readable medium
TWI721510B (zh) 雙目圖像的深度估計方法、設備及儲存介質
CN106846467B (zh) 基于每个相机位置优化的实体场景建模方法和***
WO2020056903A1 (zh) 用于生成信息的方法和装置
WO2020087564A1 (zh) 三维物体重建方法、计算机设备及存储介质
WO2018210318A1 (zh) 图像虚化处理方法、装置、存储介质及电子设备
CN109583509B (zh) 数据生成方法、装置及电子设备
CN112733820B (zh) 障碍物信息生成方法、装置、电子设备和计算机可读介质
JP7264310B2 (ja) 画像処理方法、機器、非一時的コンピュータ可読媒体
CN114511041B (zh) 模型训练方法、图像处理方法、装置、设备和存储介质
CN109766896B (zh) 一种相似度度量方法、装置、设备和存储介质
US20220383630A1 (en) Training large-scale vision transformer neural networks
US20230100427A1 (en) Face image processing method, face image processing model training method, apparatus, device, storage medium, and program product
CN113947768A (zh) 一种基于单目3d目标检测的数据增强方法和装置
CN115564639A (zh) 背景虚化方法、装置、计算机设备和存储介质
CN113592706B (zh) 调整单应性矩阵参数的方法和装置
CN113223137B (zh) 透视投影人脸点云图的生成方法、装置及电子设备
WO2024093763A1 (zh) 全景图像处理方法、装置、计算机设备、介质和程序产品
US20230401737A1 (en) Method for training depth estimation model, training apparatus, and electronic device applying the method
US10074033B2 (en) Using labels to track high-frequency offsets for patch-matching algorithms
WO2023029418A1 (zh) 图像超分辨率模型训练方法、装置和计算机可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22862567

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE