CN114926342A - Image super-resolution reconstruction model construction method, device, equipment and storage medium


Info

Publication number
CN114926342A
CN114926342A (Application CN202210612479.9A)
Authority
CN
China
Prior art keywords
feature map, image, resolution, deep layer feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210612479.9A
Other languages
Chinese (zh)
Inventor
陈实
张乐飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202210612479.9A
Publication of CN114926342A
Legal status: Pending

Classifications

    • G06T 3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06T 3/4046: Scaling of whole images or parts thereof using neural networks
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a method, a device, equipment and a storage medium for constructing an image super-resolution reconstruction model, in the technical field of computer vision. The method comprises: acquiring a training data set, wherein the training data set comprises high-resolution images and the corresponding low-resolution images; constructing an image super-resolution network, wherein the super-resolution network comprises a shallow feature extraction module, a deep feature extraction module and an image reconstruction module, the deep feature extraction module comprises 4 iteratively applied residual feature distillation sub-modules based on hybrid dilated convolution, and these sub-modules are used for iteratively extracting deep feature information and fine-grained information of the image; and training the image super-resolution network on the training data set to obtain an image super-resolution reconstruction model. The method reduces model capacity and computational cost and improves inference speed while preserving the quality of the image super-resolution reconstruction result.

Description

Image super-resolution reconstruction model construction method, device, equipment and storage medium
Technical Field
The application relates to the technical field of computer vision, in particular to a method, a device, equipment and a storage medium for constructing an image super-resolution reconstruction model.
Background
Super-resolution is an important task in computer vision and image processing, with wide application in target detection, smart mobile devices, large smart-screen display devices, medical imaging, target identification in security monitoring, remote sensing and other fields. Image super-resolution reconstruction aims to reconstruct a high-resolution image with rich texture details and good visual quality from a low-resolution image that has been degraded for various reasons. However, super-resolution reconstruction is an ill-posed task: for a given low-resolution input, many different high-resolution images may correspond to it, forming a one-to-many mapping relationship.
In recent years, with the rise of deep learning in computer vision, tasks such as image processing, image understanding and pattern recognition have developed rapidly, achieving results far beyond those of conventional methods. Deep learning has likewise had a profound impact on image super-resolution, and increasingly complex deep networks have achieved state-of-the-art performance in the field. However, most existing methods sacrifice model capacity and computational efficiency as they increase network depth and complexity. Such deep models are therefore difficult to deploy on mobile devices and slow to run, making them unsuitable for resource-constrained devices.
To address these problems, many lightweight convolutional neural network models have been proposed. However, current lightweight super-resolution algorithms use large convolution kernels, which introduce redundant parameters and lower operating efficiency; moreover, the receptive field of the ordinary convolutions they adopt is small and cannot capture long-range information dependencies, resulting in poor accuracy of the super-resolution reconstruction result.
Disclosure of Invention
The application provides a method, a device, equipment and a storage medium for constructing an image super-resolution reconstruction model, aiming to solve the problems of low model operating efficiency and poor super-resolution reconstruction accuracy in the related art.
In a first aspect, a method for constructing an image super-resolution reconstruction model is provided, which comprises the following steps:
acquiring a training data set, wherein the training data set comprises a high-resolution image and a low-resolution image corresponding to the high-resolution image;
constructing an image super-resolution network, wherein the super-resolution network comprises a shallow feature extraction module, a deep feature extraction module and an image reconstruction module, the deep feature extraction module comprises 4 iteratively applied residual feature distillation sub-modules based on hybrid dilated convolution, and these sub-modules are used for iteratively extracting deep feature information and fine-grained image information;
and training the image super-resolution network based on the training data set to obtain an image super-resolution reconstruction model.
In some embodiments, each residual feature distillation sub-module based on hybrid dilated convolution is specifically configured to:
carry out feature distillation separation on the shallow feature map output by the shallow feature extraction module to obtain a first shallow feature map and a second shallow feature map, and apply hybrid dilated convolution to the second shallow feature map to extract global feature information, obtaining a first deep feature map;
perform feature distillation separation on the first deep feature map to obtain a second deep feature map and a third deep feature map, and apply hybrid dilated convolution to the third deep feature map to extract global feature information, obtaining a fourth deep feature map;
perform feature distillation separation on the fourth deep feature map to obtain a fifth deep feature map and a sixth deep feature map, and apply hybrid dilated convolution to the sixth deep feature map to extract global feature information, obtaining a seventh deep feature map;
and concatenate the first shallow feature map, the second deep feature map, the fifth deep feature map and the seventh deep feature map to obtain a concatenated feature map.
In some embodiments, each residual feature distillation sub-module based on hybrid dilated convolution is further configured to:
extract fine-grained image information from the concatenated feature map based on a channel attention mechanism to obtain a new concatenated feature map;
and add the new concatenated feature map to the shallow feature map output by the shallow feature extraction module to obtain a final deep feature map.
In some embodiments, the training of the image super-resolution network based on the training dataset comprises:
training based on a first loss function during the first N epochs, wherein N is a positive integer;
and fine-tuning the network parameters based on a second loss function during the epochs after the Nth epoch.
In some embodiments, before the step of acquiring the training data set, the method further comprises:
acquiring an initial high-resolution image and an initial low-resolution image corresponding to the initial high-resolution image;
randomly cropping the initial high-resolution image and the initial low-resolution image to obtain a cropped high-resolution image and a cropped low-resolution image of the same size;
and performing flip and rotation data enhancement on the cropped high-resolution image and the cropped low-resolution image respectively to obtain a processed high-resolution image and a processed low-resolution image.
In a second aspect, an apparatus for constructing an image super-resolution reconstruction model is provided, which includes:
an acquisition unit for acquiring a training data set, wherein the training data set comprises a high-resolution image and a corresponding low-resolution image;
a construction unit for constructing an image super-resolution network, wherein the super-resolution network comprises a shallow feature extraction module, a deep feature extraction module and an image reconstruction module, the deep feature extraction module comprises 4 iteratively applied residual feature distillation sub-modules based on hybrid dilated convolution, and these sub-modules are used for iteratively extracting deep feature information and fine-grained image information;
and a training unit for training the image super-resolution network based on the training data set to obtain an image super-resolution reconstruction model.
In some embodiments, each residual feature distillation sub-module based on hybrid dilated convolution is specifically configured to:
carry out feature distillation separation on the shallow feature map output by the shallow feature extraction module to obtain a first shallow feature map and a second shallow feature map, and apply hybrid dilated convolution to the second shallow feature map to extract global feature information, obtaining a first deep feature map;
perform feature distillation separation on the first deep feature map to obtain a second deep feature map and a third deep feature map, and apply hybrid dilated convolution to the third deep feature map to extract global feature information, obtaining a fourth deep feature map;
perform feature distillation separation on the fourth deep feature map to obtain a fifth deep feature map and a sixth deep feature map, and apply hybrid dilated convolution to the sixth deep feature map to extract global feature information, obtaining a seventh deep feature map;
and concatenate the first shallow feature map, the second deep feature map, the fifth deep feature map and the seventh deep feature map to obtain a concatenated feature map.
In some embodiments, each residual feature distillation sub-module based on hybrid dilated convolution is further configured to:
extract fine-grained image information from the concatenated feature map based on a channel attention mechanism to obtain a new concatenated feature map;
and add the new concatenated feature map to the shallow feature map output by the shallow feature extraction module to obtain a final deep feature map.
In a third aspect, an image super-resolution reconstruction model construction device is provided, comprising a memory and a processor, wherein at least one instruction is stored in the memory and is loaded and executed by the processor to implement the aforementioned image super-resolution reconstruction model construction method.
In a fourth aspect, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor, implements the aforementioned image super-resolution reconstruction model construction method.
The technical solution provided by the application brings the following beneficial effects: the method reduces model capacity and computational cost and improves inference speed while preserving the quality of the image super-resolution reconstruction result.
The application provides a method, a device, equipment and a storage medium for constructing an image super-resolution reconstruction model, comprising: acquiring a training data set, wherein the training data set comprises a high-resolution image and a corresponding low-resolution image; constructing an image super-resolution network, wherein the super-resolution network comprises a shallow feature extraction module, a deep feature extraction module and an image reconstruction module, the deep feature extraction module comprises 4 iteratively applied residual feature distillation sub-modules based on hybrid dilated convolution, and these sub-modules are used for iteratively extracting deep feature information and fine-grained image information; and training the image super-resolution network based on the training data set to obtain an image super-resolution reconstruction model.
According to the method, the residual feature distillation sub-module based on hybrid dilated convolution enlarges the receptive field of feature extraction without increasing the network parameters; this reduces redundant model parameters, improves the efficiency of feature extraction, and allows the extracted features to contain more global information, thereby improving the accuracy of the super-resolution reconstruction result. In addition, deep feature extraction is decomposed, using a recursive idea, into 4 iteratively applied residual feature distillation sub-modules based on hybrid dilated convolution, which effectively reduces model complexity and improves information processing efficiency. As a result, model capacity and computational cost are reduced and inference speed is improved while the quality of the super-resolution reconstruction result is preserved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a method for constructing an image super-resolution reconstruction model according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an image super-resolution network provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of a DRFDB provided in an embodiment of the present application;
fig. 4 is a schematic structural diagram of the CCALayer provided in the embodiment of the present application;
fig. 5 is a schematic structural diagram of an image super-resolution reconstruction model construction device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a method, a device, equipment and a storage medium for constructing an image super-resolution reconstruction model, which can solve the problems of low model operating efficiency and poor super-resolution reconstruction accuracy in the related art.
Fig. 1 shows the flow of a method for constructing an image super-resolution reconstruction model provided by an embodiment of the present application, which includes the following steps:
step S10: acquiring a training data set, wherein the training data set comprises a high-resolution image and a low-resolution image corresponding to the high-resolution image;
further, before the step of acquiring the training data set, the method further includes:
acquiring an initial high-resolution image and an initial low-resolution image corresponding to the initial high-resolution image;
respectively randomly cutting the initial high-resolution image and the initial low-resolution image to obtain a cut high-resolution image and a cut low-resolution image, wherein the sizes of the cut high-resolution image and the cut low-resolution image are the same;
and respectively carrying out data enhancement processing of turning and rotating on the cut high-resolution image and the cut low-resolution image to obtain a processed high-resolution image and a processed low-resolution image.
Illustratively, in this embodiment, the training data may be formed by downloading the data set DIV2K, provided in the NTIRE challenge ("New Trends in Image Restoration and Enhancement", the most influential event in the field of image restoration), and collecting a large number of high-/low-resolution image pairs from Flickr2K, another data set commonly used in this field. For example, high-resolution images (HR) $\{I_i^{HR}\}$ are downloaded from the DIV2K and Flickr2K data sets, together with the corresponding low-resolution images (LR) $\{I_i^{LR}\}$ at different down-sampling factors. The DIV2K and Flickr2K data sets contain 800 and 2650 training pictures respectively, and the down-sampling factors of the low-resolution images are 2, 3 and 4. The collected high- and low-resolution images are then randomly cropped to a fixed pixel size (e.g., 256 × 256 pixels); the cropped images are randomly horizontally flipped and rotated by 90 degrees for data enhancement, and the enhanced images form the final training data set.
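The cropping and augmentation steps above can be sketched as follows. This is a minimal NumPy sketch under common super-resolution conventions: the function names, the LR patch size, and the scale-aligned HR crop (the HR patch is `scale` times the LR patch, rather than both images being cropped to the same fixed size) are illustrative assumptions, not the patent's exact procedure.

```python
import numpy as np

def random_paired_crop(hr, lr, lr_patch=64, scale=4, rng=None):
    """Randomly crop an aligned (HR, LR) patch pair.

    hr: HR image of shape (H*scale, W*scale, C); lr: LR image (H, W, C).
    The HR crop is `scale` times larger and anchored at `scale` times the
    LR crop offset, so the two patches stay pixel-aligned.
    """
    rng = rng or np.random.default_rng()
    h, w = lr.shape[:2]
    y = int(rng.integers(0, h - lr_patch + 1))
    x = int(rng.integers(0, w - lr_patch + 1))
    lr_crop = lr[y:y + lr_patch, x:x + lr_patch]
    hr_crop = hr[y * scale:(y + lr_patch) * scale,
                 x * scale:(x + lr_patch) * scale]
    return hr_crop, lr_crop

def augment_pair(hr, lr, rng=None):
    """Apply the same random horizontal flip and 90-degree rotation to both."""
    rng = rng or np.random.default_rng()
    if rng.random() < 0.5:                       # horizontal flip
        hr, lr = hr[:, ::-1], lr[:, ::-1]
    k = int(rng.integers(0, 4))                  # 0/90/180/270 degree rotation
    return np.rot90(hr, k), np.rot90(lr, k)
```

Keeping the two crops anchored at scaled offsets is what lets the pixel-wise loss compare corresponding image content.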
Step S20: constructing an image super-resolution network, wherein the super-resolution network comprises a shallow feature extraction module, a deep feature extraction module and an image reconstruction module, the deep feature extraction module comprises 4 iteratively applied residual feature distillation sub-modules based on hybrid dilated convolution, and these sub-modules are used for iteratively extracting deep feature information and fine-grained information of the image;
further, each of the residual characteristic distillation submodules based on the mixed hole convolution is specifically configured to:
carrying out feature distillation separation processing on the shallow feature map output by the shallow feature extraction module to obtain a first shallow feature map and a second shallow feature map, and carrying out mixed cavity convolution processing on the second shallow feature map to extract global feature information to obtain a first deep feature map;
performing characteristic distillation separation processing on the first deep layer feature map to obtain a second deep layer feature map and a third deep layer feature map, and performing mixed cavity convolution processing on the third deep layer feature map to extract global feature information to obtain a fourth deep layer feature map;
performing characteristic distillation separation processing on the fourth deep layer feature map to obtain a fifth deep layer feature map and a sixth deep layer feature map, and performing mixed cavity convolution processing on the sixth deep layer feature map to extract global feature information to obtain a seventh deep layer feature map;
and splicing the first shallow layer feature map, the second deep layer feature map, the fifth deep layer feature map and the seventh deep layer feature map to obtain a spliced feature map.
Further, each residual feature distillation sub-module based on hybrid dilated convolution is further configured to:
extract fine-grained image information from the concatenated feature map based on a channel attention mechanism to obtain a new concatenated feature map;
and add the new concatenated feature map to the shallow feature map output by the shallow feature extraction module to obtain a final deep feature map.
Illustratively, in this embodiment a fast feature-distillation-based image super-resolution network is constructed. As shown in fig. 2, the network consists of three parts: shallow feature extraction, deep feature extraction and image reconstruction. First, shallow features of the input image are extracted by a simple convolution layer (a 3 × 3 convolution, Conv3) in the shallow feature extraction module to obtain a shallow feature map. Then, deep feature information and fine-grained information of the image are extracted iteratively by 4 dilated-convolution-based residual feature distillation sub-modules (DRFDB) to realize deep feature extraction and obtain a deep feature map. Finally, the image reconstruction module applies a sub-pixel convolution operation (Sub-pixel) to the deep and shallow feature maps to obtain the image super-resolution reconstruction result (SR).
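The depth-to-space rearrangement at the heart of the sub-pixel (pixel shuffle) operation can be illustrated in isolation. This NumPy sketch shows only the channel-to-pixel step that follows the final convolution; the function name and the channels-first array layout are assumptions.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r).

    Each group of r*r channels is interleaved into an r-by-r block of
    output pixels, upscaling the spatial resolution by factor r.
    """
    c_r2, h, w = x.shape
    assert c_r2 % (r * r) == 0
    c = c_r2 // (r * r)
    # (C, r, r, H, W) -> (C, H, r, W, r) -> (C, H*r, W*r)
    out = x.reshape(c, r, r, h, w).transpose(0, 3, 1, 4, 2)
    return out.reshape(c, h * r, w * r)
```

This is why the layer preceding Sub-pixel must emit $C \cdot r^2$ channels for an upscaling factor of $r$: the extra channels become the new spatial samples.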
Specifically, referring to fig. 3, the dilated-convolution-based residual feature distillation sub-module consists of three feature distillation operations. Each distillation operation uses a 1 × 1 convolution (Conv-1) to process the half of the channels separated from the feature map and passes them directly to the tail of the module as local information, while all channels of the feature map serve as the input of the next separation operation (a dilated convolution, Dilated Conv) to gradually enlarge the receptive field and extract global information. A hybrid dilated convolution scheme is used when extracting global information: pre-computed dilation rates are used in the 3 extraction steps, which avoids the information loss (the gridding or checkerboard problem) that naively stacking dilated convolutions can cause. Finally, the local feature information separated at each step and the final global feature information are concatenated at the tail of the module to obtain a concatenated feature map. As shown in fig. 4, fine-grained information is then extracted from the concatenated feature map by an improved channel attention mechanism and convolution operation (CCALayer). This channel attention mechanism is better suited to low-level vision tasks such as super-resolution: the global average pooling layer of ordinary channel attention is replaced by the sum of the per-channel pixel mean and standard deviation, so the network pays more attention to texture edges and structural information. The result is the final deep feature map, which serves as the input of the next module; the output of each DRFDB module and the shallow feature map from the shallow feature extraction module are concatenated and sent to the image reconstruction module at the tail of the network.
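Whether a given set of dilation rates leaves gridding gaps can be checked mechanically. The sketch below enumerates the 1-D input offsets reachable by a stack of dilated 3-tap convolutions; the rates (1, 2, 5) used in the test are an illustrative HDC-style choice, not the patent's pre-computed values.

```python
def coverage(dilations, k=3):
    """Set of 1-D input offsets reachable by stacking dilated k-tap convs."""
    taps = {0}
    for d in dilations:
        # Each layer reaches d steps per tap around every already-covered offset.
        taps = {t + d * i for t in taps for i in range(-(k // 2), k // 2 + 1)}
    return taps

def has_gridding_gaps(dilations, k=3):
    """True if the combined receptive field skips some input positions."""
    taps = coverage(dilations, k)
    lo, hi = min(taps), max(taps)
    return any(p not in taps for p in range(lo, hi + 1))
```

For example, rates (1, 2, 5) cover every offset in their receptive field, whereas repeating a single rate of 2 three times touches only even offsets, the checkerboard pattern the hybrid scheme avoids.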
Step S30: and training the image super-resolution network based on the training data set to obtain an image super-resolution reconstruction model.
Further, the training of the image super-resolution network based on the training data set includes:
training based on a first loss function during the first N epochs, wherein N is a positive integer;
and fine-tuning the network parameters based on a second loss function during the epochs after the Nth epoch.
Illustratively, in this embodiment, when the image super-resolution network is trained on the training data set, an Adam (Adaptive Moment Estimation) optimizer is used to compute gradients and update the network parameters, finally yielding the image super-resolution reconstruction model. The training is divided into 2 stages: during the first 1000 epochs (an epoch is one complete forward and backward pass of the full data set through the network), the network is trained with the loss function $L_1 = \frac{1}{N}\sum_{i=1}^{N} \lVert I_i^{SR} - I_i^{HR} \rVert_1$; during the next 1000 epochs, the network parameters are fine-tuned with the loss function $L_2 = \frac{1}{N}\sum_{i=1}^{N} \lVert I_i^{SR} - I_i^{HR} \rVert_2^2$.
Thus, the embodiment of the application enlarges the receptive field of feature extraction, without increasing the network parameters, through the residual feature distillation sub-module based on hybrid dilated convolution; this reduces redundant model parameters, improves the efficiency of feature extraction and lets the extracted features contain more global information, thereby improving the accuracy of the super-resolution reconstruction result. Deep feature extraction is further decomposed, using a recursive idea, into 4 iteratively applied residual feature distillation sub-modules based on hybrid dilated convolution to effectively reduce model complexity and improve information processing efficiency, so that model capacity and computational cost are reduced and inference speed is improved while the quality of the super-resolution reconstruction result is preserved.
The working principle of the image super-resolution reconstruction model is further explained in conjunction with fig. 2 to 4.
Inputting the high-resolution images and the low-resolution images in the training data set into the shallow feature extraction module (namely Conv3) for shallow feature extraction to obtain a shallow feature map. Next, the DRFDB modules extract deep feature information and fine-grained image information from the shallow feature map (since the deep feature extraction principle of each DRFDB module is the same, for brevity this embodiment only explains the workflow of one DRFDB module from top to bottom): a first feature distillation operation is performed on the shallow feature map to obtain a first shallow feature map and a second shallow feature map; the first shallow feature map is input directly, as local information, into a first Conv-1 for further feature extraction, while the second shallow feature map is input into a first Dilated Conv. The first Dilated Conv performs hybrid dilated convolution on the second shallow feature map to extract global feature information, yielding a first deep feature map, on which a feature distillation operation is performed to obtain a second deep feature map and a third deep feature map. The second deep feature map is then input directly, as local information, into a second Conv-1 for further feature extraction, and the third deep feature map is input into a second Dilated Conv. The second Dilated Conv performs hybrid dilated convolution on the third deep feature map to extract global feature information, yielding a fourth deep feature map, on which a feature distillation operation is performed to obtain a fifth deep feature map and a sixth deep feature map. The fifth deep feature map is input directly, as local information, into a third Conv-1 for further feature extraction, while hybrid dilated convolution is applied to the sixth deep feature map to extract global feature information, yielding a seventh deep feature map, which is input into Conv-3 for further feature extraction. The feature maps output by the three Conv-1 layers and the feature map output by Conv-3 are then spliced to obtain a spliced feature map; features are further extracted from the spliced feature map through a fourth Conv-1, and the output is input into a contrast-aware channel attention layer (namely the CCA Layer) to extract fine-grained image information.
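The repeated feature distillation step above can be pictured as a simple channel split: one part of the channels is retained as distilled local information and the rest is passed on to the next (dilated) convolution stage. A minimal NumPy sketch under the assumption that distillation is a plain channel split with a 1:1 ratio (the patent does not fix the ratio here), using NCHW layout:

```python
import numpy as np

def feature_distillation(feature_map, keep_ratio=0.5):
    """Split a feature map along the channel axis (NCHW layout).

    The first `keep_ratio` fraction of channels is 'distilled' and kept
    as local information; the remainder is passed on to the next
    (dilated) convolution stage. The 1:1 ratio is an assumption.
    """
    channels = feature_map.shape[1]
    split = int(channels * keep_ratio)
    distilled = feature_map[:, :split]   # would feed a 1x1 conv (Conv-1)
    remaining = feature_map[:, split:]   # would feed a dilated conv
    return distilled, remaining

# Example: a 64-channel shallow feature map is split into two 32-channel maps.
shallow = np.random.randn(1, 64, 48, 48)
local, passed_on = feature_distillation(shallow)
```

In practice the distilled branch would feed the Conv-1 layer and the remaining branch the Dilated Conv, as described above.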
Referring to fig. 4, the specific process of extracting fine-grained image information is as follows: the Contrast Layer in the CCA Layer computes contrast information on the spliced feature map; the number of channels of the resulting feature map is changed by two Conv-1 layers; a sigmoid assigns each channel of the feature map an importance degree (namely a weight); and the feature map output by the sigmoid is then multiplied channel-wise with the original spliced feature map to obtain a new spliced feature map. The new spliced feature map output by the CCA Layer is then added, as a residual, to the shallow feature map output by the shallow feature extraction module to obtain a final deep feature map. Referring to fig. 2, the final deep feature maps output by the four DRFDB modules are input into Conv1 and Conv3 for convolution operations to obtain a total deep feature map; the total deep feature map and the shallow feature map output by the shallow feature extraction module are input into Conv3 for a further convolution operation; finally, the convolution result is input into the Sub-pixel layer for sub-pixel convolution and a series of convolution operations, thereby obtaining the image super-resolution reconstruction result.
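The attention step of the CCA Layer can be sketched without a deep-learning framework. Per-channel contrast is taken as the spatial standard deviation plus the spatial mean, the two Conv-1 layers are stood in for by a random reduce/expand bottleneck (untrained placeholder weights, an assumption of this sketch), and the sigmoid output gates each channel of the input:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def contrast_channel_attention(x, reduction=4, rng=np.random.default_rng(0)):
    """Contrast-aware channel attention sketch (NCHW layout).

    Per-channel contrast = spatial std + spatial mean; a reduce/expand
    bottleneck stands in for the two 1x1 convs; the sigmoid output
    weights each channel of the input.
    """
    n, c, h, w = x.shape
    contrast = x.std(axis=(2, 3)) + x.mean(axis=(2, 3))     # (n, c)
    w1 = rng.standard_normal((c, c // reduction)) * 0.1     # placeholder weights
    w2 = rng.standard_normal((c // reduction, c)) * 0.1
    weights = sigmoid(np.maximum(contrast @ w1, 0) @ w2)    # (n, c), ReLU between
    return x * weights[:, :, None, None]                    # channel-wise product

spliced = np.random.randn(2, 16, 8, 8)
attended = contrast_channel_attention(spliced)
```

Because the sigmoid weights lie in (0, 1), each channel is attenuated according to its estimated importance rather than amplified.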
Therefore, the embodiment of the application provides a residual feature distillation submodule based on dilated convolution. This module enlarges the receptive field of feature extraction without increasing the network parameters, so that the extracted features contain more global information and the network extracts feature information more efficiently. In addition, the super-resolution reconstruction process in the embodiment of the application is divided into three stages, namely shallow feature extraction, deep feature extraction and image super-resolution reconstruction, and deep feature extraction is decomposed, following a recursive idea, into 4 iterative modules, which effectively reduces the complexity of the model and improves information processing efficiency. Compared with existing lightweight image super-resolution models, the network provided by the embodiment of the application preserves the quality of the image super-resolution reconstruction result while reducing the number of model parameters, the model capacity, the floating-point operation count and the computational cost, and improving the inference speed of the model, thereby improving the efficiency of the model in practical applications.
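The receptive-field claim can be checked with simple arithmetic: a stack of stride-1 convolutions with kernel size k and dilation rates d_i has receptive field 1 + Σ d_i·(k − 1), while the per-layer parameter count depends only on k and the channel counts, not on the dilation rate. The rates [1, 2, 3] below are illustrative, since the exact hybrid dilation rates are not listed in this passage:

```python
def receptive_field(kernel_size, dilation_rates):
    """Receptive field of a stack of stride-1 convolutions."""
    rf = 1
    for d in dilation_rates:
        rf += d * (kernel_size - 1)
    return rf

def params_per_layer(kernel_size, in_ch, out_ch):
    """Dilation does not appear here: parameters depend only on kernel size
    and channel counts, which is why the receptive field grows 'for free'."""
    return kernel_size * kernel_size * in_ch * out_ch

plain = receptive_field(3, [1, 1, 1])    # three ordinary 3x3 convs
hybrid = receptive_field(3, [1, 2, 3])   # illustrative hybrid dilation rates
```

With the same three 3×3 layers, the hybrid-dilated stack nearly doubles the receptive field (13 vs. 7 pixels) at an identical parameter budget.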
The embodiment of the application further provides an image super-resolution reconstruction model construction device, which includes:
an acquisition unit for acquiring a training data set including a high resolution image and a low resolution image corresponding to the high resolution image;
a construction unit for constructing an image super-resolution network, wherein the super-resolution network comprises a shallow feature extraction module, a deep feature extraction module and an image reconstruction module, the deep feature extraction module comprises 4 iterative residual feature distillation submodules based on hybrid dilated convolution, and the residual feature distillation submodules based on hybrid dilated convolution are used for iteratively extracting deep feature information and fine-grained image information;
and the training unit is used for training the image super-resolution network based on the training data set to obtain an image super-resolution reconstruction model.
According to the embodiment of the application, the residual feature distillation submodule based on hybrid dilated convolution enlarges the receptive field of feature extraction without increasing the network parameters, so that redundant model parameters can be reduced, the efficiency of feature information extraction is improved, the extracted features contain more global information, and the precision of the super-resolution reconstruction result is improved. Moreover, deep feature extraction is decomposed, following a recursive idea, into 4 iterative residual feature distillation submodules based on hybrid dilated convolution, which effectively reduces the complexity of the model and improves information processing efficiency, so that the application can reduce the capacity and computational cost of the model and improve its inference speed while preserving the quality of the image super-resolution reconstruction result.
Further, each of the residual feature distillation submodules based on hybrid dilated convolution is specifically configured to:
perform feature distillation separation processing on the shallow feature map output by the shallow feature extraction module to obtain a first shallow feature map and a second shallow feature map, and perform hybrid dilated convolution processing on the second shallow feature map to extract global feature information, obtaining a first deep feature map;
perform feature distillation separation processing on the first deep feature map to obtain a second deep feature map and a third deep feature map, and perform hybrid dilated convolution processing on the third deep feature map to extract global feature information, obtaining a fourth deep feature map;
perform feature distillation separation processing on the fourth deep feature map to obtain a fifth deep feature map and a sixth deep feature map, and perform hybrid dilated convolution processing on the sixth deep feature map to extract global feature information, obtaining a seventh deep feature map;
and splice the first shallow feature map, the second deep feature map, the fifth deep feature map and the seventh deep feature map to obtain a spliced feature map.
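The splicing operation itself is a plain channel-axis concatenation of the four retained feature maps; a short NumPy illustration with placeholder channel counts:

```python
import numpy as np

# Placeholder maps: the distilled branches keep fewer channels than the input.
n, h, w = 1, 24, 24
first_shallow = np.random.randn(n, 16, h, w)
second_deep   = np.random.randn(n, 16, h, w)
fifth_deep    = np.random.randn(n, 16, h, w)
seventh_deep  = np.random.randn(n, 16, h, w)

# Channel-axis concatenation produces the spliced feature map, which the
# fourth Conv-1 would then compress back to the working channel width.
spliced = np.concatenate(
    [first_shallow, second_deep, fifth_deep, seventh_deep], axis=1)
```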
Further, each of the residual feature distillation submodules based on hybrid dilated convolution is further configured to:
extract fine-grained image information from the spliced feature map based on a channel attention mechanism to obtain a new spliced feature map;
and add the new spliced feature map to the shallow feature map output by the shallow feature extraction module to obtain a final deep feature map.
Further, the training unit is specifically configured to:
train based on a first loss function during the iterations of the first N epochs, wherein N is a positive integer;
and adjust the network parameters based on a second loss function during the iterations after the N-th epoch.
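The two-stage training schedule can be sketched as follows; since the passage does not name the two loss functions, L1 and L2 are used here as common stand-ins for the first and second loss:

```python
import numpy as np

def l1_loss(pred, target):
    return np.mean(np.abs(pred - target))

def l2_loss(pred, target):
    return np.mean((pred - target) ** 2)

def loss_for_epoch(epoch, n_switch, pred, target):
    """First `n_switch` epochs use the first loss, later epochs the second.

    L1/L2 are illustrative stand-ins; the patent does not name the losses.
    """
    if epoch < n_switch:
        return l1_loss(pred, target)
    return l2_loss(pred, target)

pred = np.array([0.0, 2.0])
target = np.array([1.0, 0.0])
early = loss_for_epoch(epoch=3, n_switch=10, pred=pred, target=target)   # first loss
late = loss_for_epoch(epoch=12, n_switch=10, pred=pred, target=target)   # second loss
```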
Further, the obtaining unit is further configured to:
acquire an initial high-resolution image and an initial low-resolution image corresponding to the initial high-resolution image;
randomly crop the initial high-resolution image and the initial low-resolution image respectively to obtain a cropped high-resolution image and a cropped low-resolution image, wherein the cropped high-resolution image and the cropped low-resolution image have the same size;
and perform flipping and rotation data enhancement processing on the cropped high-resolution image and the cropped low-resolution image respectively to obtain a processed high-resolution image and a processed low-resolution image.
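Since the cropped high-resolution and low-resolution images are required to have the same size, the flip and rotation augmentation must apply the identical geometric transform to both. A NumPy sketch (HWC layout; the 48-pixel patch size is an illustrative choice):

```python
import numpy as np

def paired_augment(hr, lr, patch=48, rng=np.random.default_rng(0)):
    """Randomly crop same-size, same-location patches from the HR image and
    its LR counterpart (equal sizes per the requirement above), then apply
    the same flip and 90-degree rotation to both patches.
    """
    h, w = hr.shape[:2]
    y = int(rng.integers(0, h - patch + 1))
    x = int(rng.integers(0, w - patch + 1))
    hr_patch = hr[y:y + patch, x:x + patch]
    lr_patch = lr[y:y + patch, x:x + patch]
    if rng.random() < 0.5:                     # identical horizontal flip
        hr_patch, lr_patch = hr_patch[:, ::-1], lr_patch[:, ::-1]
    k = int(rng.integers(0, 4))                # identical 90-degree rotation
    return np.rot90(hr_patch, k).copy(), np.rot90(lr_patch, k).copy()

hr = np.random.rand(96, 96, 3)
lr = np.random.rand(96, 96, 3)
hr_p, lr_p = paired_augment(hr, lr)
```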
It should be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the apparatus and the units described above may refer to the corresponding processes in the foregoing embodiment of the image super-resolution reconstruction model construction method, and are not described herein again.
The apparatus provided by the above-mentioned embodiment can be implemented in the form of a computer program, which can be run on an image super-resolution reconstruction model construction device as shown in fig. 5.
The embodiment of the application further provides an image super-resolution reconstruction model construction device, which comprises: a memory, a processor and a network interface connected through a system bus, wherein at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to implement all or some of the steps of the foregoing image super-resolution reconstruction model construction method.
The network interface is used for network communication, such as sending distributed tasks. It will be appreciated by those skilled in the art that the configuration shown in fig. 5 is a block diagram of only a portion of the configuration associated with the present application and does not limit the computing device to which the present application is applied; a particular computing device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
The processor may be a CPU, another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the computer device and connects the various parts of the overall computer device through various interfaces and lines.
The memory may be used to store computer programs and/or modules, and the processor implements the various functions of the computer device by running or executing the computer programs and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function (such as a video playing function or an image playing function), and the data storage area may store data created according to the use of the device (such as video data or image data). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, all or part of the steps of the aforementioned image super-resolution reconstruction model construction method are implemented.
All or part of the processes in the foregoing embodiments of the present application may be implemented by instructing relevant hardware through a computer program, which may be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the foregoing methods. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in jurisdictions; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, server, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The previous description is only an example of the present application, and is provided to enable any person skilled in the art to understand or implement the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for constructing an image super-resolution reconstruction model is characterized by comprising the following steps:
acquiring a training data set, wherein the training data set comprises a high-resolution image and a low-resolution image corresponding to the high-resolution image;
constructing an image super-resolution network, wherein the super-resolution network comprises a shallow feature extraction module, a deep feature extraction module and an image reconstruction module, the deep feature extraction module comprises 4 iterative residual feature distillation submodules based on hybrid dilated convolution, and the residual feature distillation submodules based on hybrid dilated convolution are used for iteratively extracting deep feature information and fine-grained image information;
and training the image super-resolution network based on the training data set to obtain an image super-resolution reconstruction model.
2. The image super-resolution reconstruction model construction method according to claim 1, wherein each of the residual feature distillation submodules based on hybrid dilated convolution is specifically configured to:
perform feature distillation separation processing on the shallow feature map output by the shallow feature extraction module to obtain a first shallow feature map and a second shallow feature map, and perform hybrid dilated convolution processing on the second shallow feature map to extract global feature information, obtaining a first deep feature map;
perform feature distillation separation processing on the first deep feature map to obtain a second deep feature map and a third deep feature map, and perform hybrid dilated convolution processing on the third deep feature map to extract global feature information, obtaining a fourth deep feature map;
perform feature distillation separation processing on the fourth deep feature map to obtain a fifth deep feature map and a sixth deep feature map, and perform hybrid dilated convolution processing on the sixth deep feature map to extract global feature information, obtaining a seventh deep feature map;
and splice the first shallow feature map, the second deep feature map, the fifth deep feature map and the seventh deep feature map to obtain a spliced feature map.
3. The image super-resolution reconstruction model construction method according to claim 2, wherein each of the residual feature distillation submodules based on hybrid dilated convolution is further configured to:
extract fine-grained image information from the spliced feature map based on a channel attention mechanism to obtain a new spliced feature map;
and add the new spliced feature map to the shallow feature map output by the shallow feature extraction module to obtain a final deep feature map.
4. The method for constructing the image super-resolution reconstruction model according to claim 1, wherein the training of the image super-resolution network based on the training dataset comprises:
training based on a first loss function during the iterations of the first N epochs, wherein N is a positive integer;
and adjusting the network parameters based on a second loss function during the iterations after the N-th epoch.
5. The image super-resolution reconstruction model construction method according to claim 1, further comprising, before the step of acquiring a training data set:
acquiring an initial high-resolution image and an initial low-resolution image corresponding to the initial high-resolution image;
randomly cropping the initial high-resolution image and the initial low-resolution image respectively to obtain a cropped high-resolution image and a cropped low-resolution image, wherein the cropped high-resolution image and the cropped low-resolution image have the same size;
and performing flipping and rotation data enhancement processing on the cropped high-resolution image and the cropped low-resolution image respectively to obtain a processed high-resolution image and a processed low-resolution image.
6. An image super-resolution reconstruction model construction device is characterized by comprising:
an acquisition unit for acquiring a training data set including a high resolution image and a low resolution image corresponding to the high resolution image;
a construction unit for constructing an image super-resolution network, wherein the super-resolution network comprises a shallow feature extraction module, a deep feature extraction module and an image reconstruction module, the deep feature extraction module comprises 4 iterative residual feature distillation submodules based on hybrid dilated convolution, and the residual feature distillation submodules based on hybrid dilated convolution are used for iteratively extracting deep feature information and fine-grained image information;
and the training unit is used for training the image super-resolution network based on the training data set to obtain an image super-resolution reconstruction model.
7. The image super-resolution reconstruction model construction device according to claim 6, wherein each of the residual feature distillation submodules based on hybrid dilated convolution is specifically configured to:
perform feature distillation separation processing on the shallow feature map output by the shallow feature extraction module to obtain a first shallow feature map and a second shallow feature map, and perform hybrid dilated convolution processing on the second shallow feature map to extract global feature information, obtaining a first deep feature map;
perform feature distillation separation processing on the first deep feature map to obtain a second deep feature map and a third deep feature map, and perform hybrid dilated convolution processing on the third deep feature map to extract global feature information, obtaining a fourth deep feature map;
perform feature distillation separation processing on the fourth deep feature map to obtain a fifth deep feature map and a sixth deep feature map, and perform hybrid dilated convolution processing on the sixth deep feature map to extract global feature information, obtaining a seventh deep feature map;
and splice the first shallow feature map, the second deep feature map, the fifth deep feature map and the seventh deep feature map to obtain a spliced feature map.
8. The image super-resolution reconstruction model construction device according to claim 7, wherein each of the residual feature distillation submodules based on hybrid dilated convolution is further configured to:
extract fine-grained image information from the spliced feature map based on a channel attention mechanism to obtain a new spliced feature map;
and add the new spliced feature map to the shallow feature map output by the shallow feature extraction module to obtain a final deep feature map.
9. An image super-resolution reconstruction model construction device, characterized by comprising: a memory and a processor, the memory having stored therein at least one instruction, the at least one instruction being loaded and executed by the processor to implement the image super-resolution reconstruction model construction method of any one of claims 1 to 5.
10. A computer-readable storage medium, characterized in that: the computer storage medium stores a computer program which, when executed by a processor, implements the image super-resolution reconstruction model construction method of any one of claims 1 to 5.
CN202210612479.9A 2022-05-31 2022-05-31 Image super-resolution reconstruction model construction method, device, equipment and storage medium Pending CN114926342A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210612479.9A CN114926342A (en) 2022-05-31 2022-05-31 Image super-resolution reconstruction model construction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210612479.9A CN114926342A (en) 2022-05-31 2022-05-31 Image super-resolution reconstruction model construction method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114926342A true CN114926342A (en) 2022-08-19

Family

ID=82812481

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210612479.9A Pending CN114926342A (en) 2022-05-31 2022-05-31 Image super-resolution reconstruction model construction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114926342A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115564649A (en) * 2022-09-27 2023-01-03 苏州大学 Image super-resolution reconstruction method, device and equipment
CN115908144A (en) * 2023-03-08 2023-04-04 中国科学院自动化研究所 Image processing method, device, equipment and medium based on random wavelet attention
CN116993592A (en) * 2023-09-27 2023-11-03 城云科技(中国)有限公司 Construction method, device and application of image super-resolution reconstruction model
CN116993592B (en) * 2023-09-27 2023-12-12 城云科技(中国)有限公司 Construction method, device and application of image super-resolution reconstruction model


Similar Documents

Publication Publication Date Title
CN112348783B (en) Image-based person identification method and device and computer-readable storage medium
CN114926342A (en) Image super-resolution reconstruction model construction method, device, equipment and storage medium
CN110222598B (en) Video behavior identification method and device, storage medium and server
CN107563974B (en) Image denoising method and device, electronic equipment and storage medium
CN111340077B (en) Attention mechanism-based disparity map acquisition method and device
CN113870283B (en) Portrait matting method, device, computer equipment and readable storage medium
CN112419152B (en) Image super-resolution method, device, terminal equipment and storage medium
CN112419191B (en) Image motion blur removing method based on convolution neural network
CN113129212B (en) Image super-resolution reconstruction method and device, terminal device and storage medium
CN114399440B (en) Image processing method, image processing network training method and device and electronic equipment
US20230252605A1 (en) Method and system for a high-frequency attention network for efficient single image super-resolution
CN103632153A (en) Region-based image saliency map extracting method
CN110782406A (en) Image denoising method and device based on information distillation network
CN115512258A (en) Desensitization method and device for video image, terminal equipment and storage medium
CN108986210B (en) Method and device for reconstructing three-dimensional scene
CN111951171A (en) HDR image generation method and device, readable storage medium and terminal equipment
CN112489103B (en) High-resolution depth map acquisition method and system
Truong et al. Depth map inpainting and super-resolution with arbitrary scale factors
DE102022105545A1 (en) Low complexity deep guided filter decoder for pixel by pixel prediction task
CN114049491A (en) Fingerprint segmentation model training method, fingerprint segmentation device, fingerprint segmentation equipment and fingerprint segmentation medium
CN110399881B (en) End-to-end quality enhancement method and device based on binocular stereo image
CN114445629A (en) Model generation method, image segmentation method, model generation system, image segmentation system, electronic device and storage medium
CN115631115B (en) Dynamic image restoration method based on recursion transform
CN116612385B (en) Remote sensing image multiclass information extraction method and system based on depth high-resolution relation graph convolution
CN117952901B (en) Multi-source heterogeneous image change detection method and device based on generation countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination