CN113077385A - Video super-resolution method and system based on generative adversarial network and edge enhancement - Google Patents

Video super-resolution method and system based on generative adversarial network and edge enhancement

Info

Publication number
CN113077385A
CN113077385A
Authority
CN
China
Prior art keywords
resolution
super
video
frame
continuous frames
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110340664.2A
Other languages
Chinese (zh)
Inventor
滕国伟 (Teng Guowei)
王嘉璐 (Wang Jialu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology filed Critical University of Shanghai for Science and Technology
Priority to CN202110340664.2A
Publication of CN113077385A

Classifications

    • G06T 3/4053: Scaling of whole images or parts thereof, e.g. expanding or contracting, based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T 3/4046: Scaling of whole images or parts thereof using neural networks
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06T 5/70: Denoising; Smoothing
    • G06T 5/73: Deblurring; Sharpening
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/10024: Color image
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a video super-resolution method based on a generative adversarial network (GAN) and edge enhancement, which comprises the following steps: step S1: autonomously constructing a dataset for the generative adversarial network, and acquiring high-resolution consecutive frames and the corresponding low-resolution consecutive frames; step S2: converting the high-resolution consecutive frames and the corresponding low-resolution consecutive frames from the RGB color space to the HSV color space; step S3: establishing a generator network to obtain super-resolution consecutive frames; step S4: performing authenticity discrimination on the super-resolution consecutive frames; step S5: if the discrimination result is true, outputting the super-resolution consecutive frames; if the discrimination result is false, regenerating super-resolution consecutive frames for true/false discrimination. The invention directly uses the original consecutive frames of old films and television works as the network input, with the corresponding high-definition restored frames as the network target. The intermediate degradation process is thus omitted, and the actual degradation of natural images is better fitted.

Description

Video super-resolution method and system based on generative adversarial network and edge enhancement
Technical Field
The invention relates to the technical field of video and image processing, and in particular to a video super-resolution method and system based on a generative adversarial network and edge enhancement.
Background
Super-resolution aims at reconstructing high-resolution (HR) images or video from a low-resolution (LR) version and is a classical problem in computer vision. It pursues not only enlargement of the physical size but also restoration of high-frequency details to ensure clarity. Classical algorithms have existed for decades and can be classified into patch-based, edge-based, sparse-coding-based, and prediction- and statistics-based methods. These methods are computationally cheaper than deep-learning methods, but their restoration performance is also very limited. With the rise of deep learning, convolutional neural networks have been widely adopted and have brought a leap in super-resolution quality.
The field can be divided into two parts: single-image super-resolution (SISR) and video super-resolution (VSR). The former exploits spatial correlation within a single frame, while the latter additionally uses inter-frame temporal correlation. Owing to the shooting conditions and projection equipment of the time, films and television works from around the year 2000 have low resolution. Although their image quality is unsatisfactory by today's standards, many excellent works date from that era, and simply playing the original footage on modern devices does not meet current perceptual expectations, so super-resolution of old films has real market demand. Because temporal correlation is crucial for video super-resolution, information from adjacent low-resolution frames is usually combined; nevertheless, some video reconstruction results remain unsatisfactory.
A search of the prior art found that patent document CN105931189B discloses a video super-resolution method and device based on an improved super-resolution parameterized model. It uses the improved parameterized model as the theoretical guide of the video super-resolution method and uses a common-region mark matrix to exclude erroneous reference information introduced by non-common regions caused by occlusion and boundary overflow, so that the parameterized model can better describe real videos; stable video super-resolution is ensured by jointly estimating multiple unknown parameters. Although this prior art addresses the problem of non-common content between videos, improving parameters alone, without combining the spatial and temporal correlation between frames, cannot solve the technical problem of edge enhancement.
Patent document CN111260560B discloses a multi-frame video super-resolution method integrating an attention mechanism, which includes: acquiring video data and processing it with video-enhancement techniques to generate a training set and a test set; connecting a deformable-convolution feature-alignment module and a feature-reconstruction module to form a multi-frame super-resolution network and training it with the training set; adding a 3D-convolution feature-alignment module to the network for training; adding a feature-fusion module to the network for training; training the network with the training set; fine-tuning the network with the training set to generate a multi-frame super-resolution model; and testing the model with the test set. This prior art mainly improves super-resolution through large-scale data analysis and focuses on the attention mechanism; it does not combine the spatial and temporal correlation between frames and cannot solve the technical problem of edge enhancement.
At present, the prior art has two major problems regarding video reconstruction results. First, super-resolution datasets are usually obtained by down-sampling high-resolution data (HR) to produce low-resolution data (LR) and then forming LR/HR pairs; this sampling process idealizes natural image degradation and does not match reality. Second, GAN-based super-resolution in the prior art has made great progress in subjective perception, but it has not been applied to video.
Therefore, there is a need for a video super-resolution method that combines the spatial and temporal correlation between frames and exploits GAN-based super-resolution to restore old film and television works.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a video super-resolution method and system based on a generative adversarial network and edge enhancement.
The video super-resolution method based on the generative adversarial network and edge enhancement provided by the invention comprises the following steps:
step S1: autonomously constructing a dataset for the generative adversarial network, and acquiring high-resolution consecutive frames and the corresponding low-resolution consecutive frames;
step S2: converting the high-resolution consecutive frames and the corresponding low-resolution consecutive frames from the RGB color space to the HSV color space;
step S3: establishing a generator network to obtain super-resolution consecutive frames;
step S4: performing authenticity discrimination on the super-resolution consecutive frames;
step S5: if the discrimination result is true, outputting the super-resolution consecutive frames; if the discrimination result is false, regenerating super-resolution consecutive frames for true/false discrimination.
Preferably, step S1 comprises the following sub-steps:
step S11: reading in the original video of an old film and the corresponding high-definition restored version;
step S12: converting the original video and the high-definition restored video into consecutive frame sequences;
step S13: rotating and cropping the high-resolution consecutive frames and the low-resolution consecutive frames.
Preferably, step S12 comprises the following sub-steps:
step S121: aligning the time axes of the original video and the corresponding high-definition restored version;
step S122: selecting the low-resolution video $video_{LR}$ within the period from start time t1 to end time t2 of the original video and converting it into consecutive frames;
step S123: selecting the high-resolution video $video_{HR}$ within the same period from start time t1 to end time t2 of the high-definition restored version and converting it into consecutive frames.
Preferably, there is no scene cut within the low-resolution video $video_{LR}$ and the high-resolution video $video_{HR}$ in step S12.
Preferably, the rotation and cropping parameters applied in step S13 to the high-resolution consecutive frames and the low-resolution consecutive frames must be consistent.
Preferably, step S3 comprises the following sub-steps:
step S31: the optical-flow network outputs the motion compensation $v_t$; the motion compensation $v_t$ is upsampled fourfold by linear interpolation to obtain the upsampled motion compensation $V_t$; the upsampled motion compensation $V_t$ and the super-resolution result of the previous frame $I_{t-1}^{SR}$ undergo a non-linear image warping operation to obtain the warped frame $\tilde{I}_{t-1}^{SR}$;
step S32: obtaining a super-resolution intermediate result using the super-resolution reconstruction network;
step S33: applying the Laplacian edge-enhancement network to the super-resolution intermediate result.
Preferably, step S32 comprises the following sub-steps:
step S321: upsampling the motion compensation $v_t$ fourfold by linear interpolation to obtain $V_t$;
step S322: applying a non-linear image warping operation to $V_t$ and the previous frame's super-resolution reconstruction result $I_{t-1}^{SR}$ to obtain the warped frame $\tilde{I}_{t-1}^{SR}$, whose shape is (batchsize, 4w, 4h, channel);
step S323: applying channel recombination to $\tilde{I}_{t-1}^{SR}$ to obtain the size-downsampled frame $\tilde{I}_{t-1}^{SR\downarrow}$, whose shape is (batchsize, w, h, 4×4×channel);
step S324: merging $\tilde{I}_{t-1}^{SR\downarrow}$ and $I_t^{LR}$ on the third (channel) axis, the shape being (batchsize, w, h, 4×4×channel + channel);
step S325: obtaining the super-resolution intermediate result $I_t^{SR'} = \Delta_t^{SR} + \mathrm{Bicubic}(I_t^{LR})$, wherein $\Delta_t^{SR}$ denotes the super-resolution result of the residual between the low-resolution frame and the high-resolution frame, $I_t^{LR}$ denotes the low-resolution frame, and $\mathrm{Bicubic}(\cdot)$ denotes bicubic upsampling.
Preferably, step S33 comprises the following sub-steps:
step S331: convolving the Laplacian operator $L$ with $I_t^{SR'}$ to obtain the image of abrupt pixel-value changes $I_t^{Lap} = L \otimes I_t^{SR'}$, wherein $L$ is the Laplacian mask, $\otimes$ is the convolution operation, and $I_t^{SR'}$ is the intermediate super-resolution result;
step S332: superimposing the edge $I_t^{Lap}$, extracted from the intermediate super-resolution result by the Laplacian operator, onto the intermediate super-resolution result $I_t^{SR'}$ to produce the sharpened image $I_t^{E} = I_t^{SR'} + I_t^{Lap}$;
step S333: performing post-processing denoising to obtain the final super-resolution result $I_t^{SR}$.
Preferably, step S4 comprises:
perceptual loss: a second-order norm (L2) loss between the final super-resolution result $I_t^{SR}$ and the target frame $I_t^{HR}$ at a network layer of VGG19,
$\mathcal{L}_{perc} = \| \phi(I_t^{SR}) - \phi(I_t^{HR}) \|_2^2$,
wherein $\phi(I_t^{SR})$ is the feature map of the final super-resolution result $I_t^{SR}$ on a particular convolutional layer of VGG19, and $\phi(I_t^{HR})$ is the feature map of the target frame $I_t^{HR}$ on the same convolutional layer of VGG19;
content loss: a term on the intermediate result is added in addition to the final result,
$\mathcal{L}_{content} = \| I_t^{SR} - I_t^{HR} \|_2^2 + \| I_t^{SR'} - I_t^{HR} \|_2^2$,
wherein $I_t^{SR}$ is the final super-resolution result, $I_t^{HR}$ is the target frame, and $I_t^{SR'}$ is the intermediate super-resolution result;
sequence loss: the forward-generated final super-resolution result $I_t^{SR,f}$ and the backward-generated final super-resolution result $I_t^{SR,b}$ should in theory be identical,
$\mathcal{L}_{seq} = \| I_t^{SR,f} - I_t^{SR,b} \|_2^2$.
the invention provides a video super-resolution system based on a countermeasure generation network and edge enhancement, which comprises:
module M1: independently establishing a data set based on a countermeasure generation network, and acquiring high-resolution continuous frames and corresponding low-resolution continuous frames;
module M2: converting the RGB color space of the high-resolution continuous frames and the corresponding low-resolution continuous frames into HSV color space;
module M3: establishing a generation network to obtain super-resolution continuous frames;
module M4: performing authenticity identification on the super-resolution continuous frames;
module M5: if the discrimination result is true, outputting a super-resolution continuous frame; and if the discrimination result is false, regenerating the super-resolution continuous frame for true-false discrimination.
Compared with the prior art, the invention has the following beneficial effects:
1. By means of the generative adversarial network, the invention constructs the dataset autonomously, acquiring high-resolution consecutive frames and the corresponding low-resolution consecutive frames. The manual down-sampling step from high-resolution to low-resolution video is omitted: the original video frames of old films serve directly as input and the corresponding high-definition restored frames as the target, avoiding the idealized natural-image degradation process.
2. The invention converts the RGB color space into the HSV color space before non-linear feature mapping; HSV is closer to the visual characteristics of the human eye, which helps obtain a good visual effect.
3. The method fully exploits the fact that movie scenes are accompanied by a large number of subtitles and performs an edge-enhancement operation on the intermediate super-resolution result to obtain the final result.
4. A noise mask is placed in the edge-enhancement module to learn the noise, so that the learned noise can be removed from the extracted edges.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
Fig. 1 is a flow chart of the video super-resolution method based on the generative adversarial network and edge enhancement of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit it in any way. It should be noted that various changes and modifications can be made by a person of ordinary skill in the art without departing from the concept of the invention, all of which fall within the scope of the present invention.
The invention provides a video super-resolution method based on a generative adversarial network and edge enhancement, which comprises the following steps:
step S1: autonomously constructing a dataset for the generative adversarial network, and acquiring high-resolution consecutive frames and the corresponding low-resolution consecutive frames.
Step S2: converting the high-resolution consecutive frames and the corresponding low-resolution consecutive frames from the RGB color space to the HSV color space;
step S3: establishing a generator network to obtain super-resolution consecutive frames;
step S4: performing authenticity discrimination on the super-resolution consecutive frames;
step S5: if the discrimination result is true, outputting the super-resolution consecutive frames; if the discrimination result is false, regenerating super-resolution consecutive frames for true/false discrimination (a minimal sketch of this generate-and-discriminate loop follows).
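As a concrete illustration of steps S3 to S5, the following is a minimal adversarial training sketch. It assumes PyTorch; the toy generator and discriminator, the 4x scale, and the BCE adversarial loss are illustrative stand-ins, not the patent's disclosed architecture (the actual generator incorporates the optical-flow, warping, and edge-enhancement stages described below).

```python
# Minimal generate-and-discriminate loop (assumptions: PyTorch; toy networks).
import torch
import torch.nn as nn

upscale = 4
G = nn.Sequential(                        # generator: LR frame -> SR frame
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3 * upscale ** 2, 3, padding=1),
    nn.PixelShuffle(upscale),             # rearrange channels into a 4x larger image
)
D = nn.Sequential(                        # discriminator: frame -> real/fake logit
    nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 3, stride=2, padding=1),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

lr_frames = torch.rand(2, 3, 32, 32)      # stand-in for old-film LR frames
hr_frames = torch.rand(2, 3, 128, 128)    # stand-in for restored HR targets

for step in range(2):
    sr_frames = G(lr_frames)              # S3: generate super-resolution frames
    # S4: discriminator scores restored (real) vs generated (fake) frames
    d_loss = (bce(D(hr_frames), torch.ones(2, 1)) +
              bce(D(sr_frames.detach()), torch.zeros(2, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()
    # S5: while frames are judged false, the generator is updated and regenerates
    g_loss = bce(D(sr_frames), torch.ones(2, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```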
Further, step S1 comprises the following sub-steps:
step S11: reading in the original video of an old film and the corresponding high-definition restored version.
Step S12: converting the original video and the high-definition restored video into consecutive frame sequences.
Step S121: aligning the time axes of the original video and the corresponding high-definition restored version.
Step S122: selecting the low-resolution video $video_{LR}$ within the period from start time t1 to end time t2 of the original video and converting it into consecutive frames.
Step S123: selecting the high-resolution video $video_{HR}$ within the same period from start time t1 to end time t2 of the high-definition restored version and converting it into consecutive frames.
Step S13: rotating and cropping the high-resolution consecutive frames and the low-resolution consecutive frames.
Step S3 comprises the following sub-steps:
step S31: the motion compensation $v_t$ from the optical-flow network is upsampled fourfold by linear interpolation to obtain $V_t$; $V_t$ and the super-resolution result of the previous frame $I_{t-1}^{SR}$ undergo a non-linear image warping operation to obtain $\tilde{I}_{t-1}^{SR}$.
Step S32: obtaining a super-resolution intermediate result using the super-resolution reconstruction network.
Step S33: applying the Laplacian edge-enhancement network to the super-resolution intermediate result.
Step S331: convolving the Laplacian operator $L$ with $I_t^{SR'}$ to obtain the image of abrupt pixel-value changes $I_t^{Lap} = L \otimes I_t^{SR'}$, where $L$ is the Laplacian mask, $\otimes$ is the convolution operation, and $I_t^{SR'}$ is the intermediate super-resolution result.
Step S332: superimposing the Laplacian image $I_t^{Lap}$ on the intermediate super-resolution result $I_t^{SR'}$ to produce the sharpened image $I_t^{E} = I_t^{SR'} + I_t^{Lap}$.
Step S333: performing post-processing denoising to obtain the final result $I_t^{SR}$.
As shown in Fig. 1, the invention first creates specific high/low-resolution data pairs for a specific scene. The dataset then undergoes preprocessing, including cropping, rotation, and color-space conversion. A reconstruction result is then obtained through the generator network, in which a corresponding layer is added for edge enhancement; finally, authenticity is identified by the discriminator network.
The invention is further explained clearly and completely below with reference to the drawings.
First, the original video of an old film and the corresponding high-definition restored version are read in and converted into consecutive frames respectively. In current image or video super-resolution algorithms, the low-resolution dataset is essentially derived from the high-resolution dataset. This derivation involves down-sampling and artificial noise addition intended to mimic the degradation of natural images or video, but it cannot reproduce that degradation exactly. For a specific movie scene, the method directly uses the original consecutive frames of the old film as the network input and the corresponding high-definition restored frames as the network target; the intermediate degradation process is thus omitted, and the actual condition of natural images is fitted.
Second, the time axes of the original video and the corresponding high-definition restored version are aligned, and footage from the same period is selected from both, with no scene cut within the selected period; this footage is then converted into consecutive frames. The high-resolution and low-resolution consecutive frames are subsequently rotated and cropped (the operation parameters must correspond one-to-one) to expand the training set.
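A minimal sketch of this dataset construction is given below. It assumes OpenCV and NumPy are available, that the two videos are already time-aligned, and that the restored version is four times the original resolution; the file names, times, and patch size are hypothetical.

```python
# Paired dataset construction sketch (assumptions: OpenCV/NumPy, aligned videos).
import random
import cv2
import numpy as np

def extract_frames(path, t1, t2):
    """Decode the segment [t1, t2] (in seconds) of a video into a frame list."""
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    frames = []
    for idx in range(int(t1 * fps), int(t2 * fps)):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames

def paired_augment(lr, hr, patch=64, scale=4):
    """Rotate and crop an LR/HR pair with one-to-one corresponding parameters."""
    k = random.randint(0, 3)              # same quarter-turn for both frames
    lr, hr = np.rot90(lr, k), np.rot90(hr, k)
    y = random.randint(0, lr.shape[0] - patch)
    x = random.randint(0, lr.shape[1] - patch)
    lr_crop = lr[y:y + patch, x:x + patch]
    hr_crop = hr[y * scale:(y + patch) * scale, x * scale:(x + patch) * scale]
    return lr_crop, hr_crop

lr_frames = extract_frames("old_film_original.mp4", 10.0, 12.0)  # hypothetical paths
hr_frames = extract_frames("old_film_restored.mp4", 10.0, 12.0)
pairs = [paired_augment(lr, hr) for lr, hr in zip(lr_frames, hr_frames)]
```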
Third, the color space is converted from RGB to HSV. Besides the RGB color space widely used in computers, HSV is another popular color model. HSV is oriented toward the user's perception and emphasizes color representation; it is closer to the human experience of color than RGB and intuitively expresses the hue (H), saturation (S), and value/brightness (V) of a color as in television display. The two color spaces have a well-defined conversion relationship: non-linear mapping and super-resolution reconstruction are performed in HSV space, after which the result is converted back to RGB space.
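The color-space round trip might look as follows. This sketch assumes OpenCV's BGR frame order (as returned by cv2.VideoCapture) and uint8 frames; the super-resolution processing that would run in HSV space is elided.

```python
# Color-space round trip sketch (assumptions: OpenCV, BGR-ordered uint8 frames).
import cv2
import numpy as np

frame_bgr = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)  # stand-in frame

hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)   # hue, saturation, value planes
# ... non-linear feature mapping / super-resolution would operate on `hsv` here ...
restored_bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

# The conversion relationship is well-defined; only small quantization error remains.
diff = np.abs(restored_bgr.astype(int) - frame_bgr.astype(int)).max()
print("max round-trip error per channel:", diff)
```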
Fourth, the preprocessed low-resolution consecutive frames are input into the optical-flow network for motion compensation. Compared with image super-resolution, video super-resolution has an extra dimension of temporal information available, which also adds a consistency challenge; the invention therefore retains the optical-flow network for motion compensation. The input to the network is two low-resolution sequences: the first is a set of 10 consecutive frames,

$S^{f} = \{I_1^{LR}, I_2^{LR}, \dots, I_{10}^{LR}\}$,

and the second is the reverse sequence of the first,

$S^{b} = \{I_{10}^{LR}, I_9^{LR}, \dots, I_1^{LR}\}$.

The two sequences are then merged for processing. With this design, the network obtains not only the motion compensation $v_t^{f}$ between $I_{t-1}^{LR}$ and $I_t^{LR}$ but also the motion compensation $v_t^{b}$ between $I_{t+1}^{LR}$ and $I_t^{LR}$ for subsequent use. The forward- and backward-generated results are theoretically identical, which facilitates the later design of the loss function.
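A sketch of this bidirectional input construction follows, assuming PyTorch; `flow_net` is a one-layer placeholder for the patent's optical-flow network, whose actual architecture is not specified here.

```python
# Bidirectional sequence construction sketch (assumptions: PyTorch, toy flow net).
import torch
import torch.nn as nn

flow_net = nn.Conv2d(6, 2, 3, padding=1)      # placeholder: frame pair -> 2-ch flow

frames = torch.rand(10, 3, 32, 32)            # I_1^LR ... I_10^LR
forward_seq = frames                          # {I_1, ..., I_10}
backward_seq = torch.flip(frames, dims=[0])   # {I_10, ..., I_1}

def pairwise_flow(seq):
    """Motion compensation between each frame and its predecessor in `seq`."""
    pairs = torch.cat([seq[:-1], seq[1:]], dim=1)   # (9, 6, H, W) frame pairs
    return flow_net(pairs)                          # (9, 2, H, W) flow fields

v_forward = pairwise_flow(forward_seq)    # v_t^f: flow from I_{t-1} to I_t
v_backward = pairwise_flow(backward_seq)  # v_t^b: flow from I_{t+1} to I_t
```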
Fifth, super-resolution reconstruction. To enhance the consistency of the reconstructed video, the invention upsamples the motion compensation $v_t$ fourfold by linear interpolation to obtain $V_t$, then applies a non-linear image warping operation to $V_t$ and the previous frame's super-resolution reconstruction result $I_{t-1}^{SR}$ to obtain $\tilde{I}_{t-1}^{SR}$ of shape (batchsize, 4w, 4h, channel). Channel recombination gives $\tilde{I}_{t-1}^{SR\downarrow}$ of shape (batchsize, w, h, 4×4×channel), which is merged with $I_t^{LR}$ on the third (channel) axis to give shape (batchsize, w, h, 4×4×channel + channel); this is the final input to the super-resolution reconstruction network. The output of the super-resolution network is $\Delta_t^{SR}$. In addition, to stabilize network training, the invention learns only the residual part, so the final result of this stage is

$I_t^{SR'} = \Delta_t^{SR} + \mathrm{Bicubic}(I_t^{LR})$.
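The assembly of the reconstruction input can be sketched as follows, assuming PyTorch: `grid_sample` stands in for the non-linear warping operation, `pixel_unshuffle` performs the channel recombination, and `sr_net` is a toy stand-in for the super-resolution reconstruction network.

```python
# Reconstruction-input assembly sketch (assumptions: PyTorch, toy residual net).
import torch
import torch.nn as nn
import torch.nn.functional as F

b, c, h, w, s = 1, 3, 16, 16, 4               # batch, channels, LR size, scale

def warp(frame, flow):
    """Warp `frame` backward along `flow` (pixel offsets) with bilinear sampling."""
    _, _, H, W = frame.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).float() + flow.permute(0, 2, 3, 1)
    grid[..., 0] = 2 * grid[..., 0] / (W - 1) - 1   # normalize to [-1, 1]
    grid[..., 1] = 2 * grid[..., 1] / (H - 1) - 1
    return F.grid_sample(frame, grid, align_corners=True)

lr_t = torch.rand(b, c, h, w)                 # I_t^LR
prev_sr = torch.rand(b, c, h * s, w * s)      # previous frame's SR result
flow_lr = torch.rand(b, 2, h, w)              # motion compensation v_t

V_t = F.interpolate(flow_lr, scale_factor=s, mode="bilinear") * s  # upsampled flow
warped = warp(prev_sr, V_t)                   # (b, c, 4h, 4w) warped frame
packed = F.pixel_unshuffle(warped, s)         # channel recombination -> (b, 16c, h, w)
net_in = torch.cat([packed, lr_t], dim=1)     # (b, 16c + c, h, w)

sr_net = nn.Sequential(nn.Conv2d(16 * c + c, c * s * s, 3, padding=1),
                       nn.PixelShuffle(s))    # toy network emitting the residual
residual = sr_net(net_in)
bicubic = F.interpolate(lr_t, scale_factor=s, mode="bicubic")
sr_intermediate = residual + bicubic          # I_t^{SR'} = Delta + Bicubic(I_t^LR)
```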
Sixth, edge enhancement. The output of the previous step is the intermediate super-resolution result $I_t^{SR'}$, on which the invention performs edge enhancement. First, the Laplacian operator $L$ is convolved with $I_t^{SR'}$ to obtain the image of abrupt pixel-value changes,

$I_t^{Lap} = L \otimes I_t^{SR'}$.

The Laplacian image $I_t^{Lap}$ is then superimposed on the intermediate super-resolution result $I_t^{SR'}$ to produce the sharpened image

$I_t^{E} = I_t^{SR'} + I_t^{Lap}$.

This simple edge enhancement produces the Laplacian sharpening effect while retaining background information: superimposing the original image on the Laplacian result preserves every pixel value and enhances the contrast at abrupt pixel-value changes, so edges are highlighted while the image background is preserved. Considering that the Laplacian operator amplifies noise while enhancing edges, a post-processing step is added to obtain the final result $I_t^{SR}$. Since movie and television scenes are accompanied by a large number of subtitles, this edge-enhancement effect noticeably improves the visual quality.
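A minimal sketch of the edge-enhancement step follows, assuming NumPy and SciPy; the patent's learned noise mask is replaced here by a simple Gaussian smoothing of the edge map, which is an illustrative simplification, not the disclosed mechanism.

```python
# Laplacian edge-enhancement sketch (assumptions: NumPy/SciPy; simplified denoising).
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

laplace_mask = np.array([[0, -1, 0],
                         [-1, 4, -1],
                         [0, -1, 0]], dtype=float)   # Laplacian mask L

sr_intermediate = np.random.rand(64, 64)             # stand-in for I_t^{SR'}

# I_t^{Lap} = L (*) I_t^{SR'}: responds strongly at abrupt pixel-value changes
edges = convolve(sr_intermediate, laplace_mask)

# Post-processing against noise amplification: smooth the edge map slightly
# before superposition (a stand-in for the patent's learned noise mask).
edges = gaussian_filter(edges, sigma=0.5)

# I_t^E = I_t^{SR'} + I_t^{Lap}: sharpened image that keeps the background
sharpened = np.clip(sr_intermediate + edges, 0.0, 1.0)
```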
Seventh, design of the loss function. The method is based on a generative adversarial network, and additional loss terms are added besides the adversarial loss. (1) Perceptual loss: an L2 loss between the final result $I_t^{SR}$ and the target frame $I_t^{HR}$ at a VGG19 network layer,

$\mathcal{L}_{perc} = \| \phi(I_t^{SR}) - \phi(I_t^{HR}) \|_2^2$,

where $\phi(I_t^{SR})$ and $\phi(I_t^{HR})$ are the feature maps of $I_t^{SR}$ and $I_t^{HR}$ respectively, obtained from a VGG19 convolutional layer. (2) Content loss: in addition to the final result, a term on the intermediate result is added,

$\mathcal{L}_{content} = \| I_t^{SR} - I_t^{HR} \|_2^2 + \| I_t^{SR'} - I_t^{HR} \|_2^2$.

(3) Sequence loss: the forward-generated $I_t^{SR,f}$ and the backward-generated $I_t^{SR,b}$ should in theory be identical,

$\mathcal{L}_{seq} = \| I_t^{SR,f} - I_t^{SR,b} \|_2^2$.
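The three additional loss terms might be implemented as follows, assuming PyTorch and torchvision; the choice of VGG19 layer and the implicit equal weighting of the terms are assumptions, not values from the patent.

```python
# Loss-term sketch (assumptions: PyTorch/torchvision; layer choice is illustrative).
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

features = vgg19(weights=None).features[:9].eval()   # up to an early conv layer
for p in features.parameters():
    p.requires_grad_(False)

def perceptual_loss(sr, hr):
    """L2 distance between VGG19 feature maps of the SR result and target frame."""
    return F.mse_loss(features(sr), features(hr))

def content_loss(sr_final, sr_intermediate, hr):
    """Pixel L2 on the final result plus a term on the intermediate result."""
    return F.mse_loss(sr_final, hr) + F.mse_loss(sr_intermediate, hr)

def sequence_loss(sr_forward, sr_backward):
    """Forward- and backward-generated results should agree."""
    return F.mse_loss(sr_forward, sr_backward)

sr_f = torch.rand(1, 3, 64, 64)       # forward-generated final result
sr_b = torch.rand(1, 3, 64, 64)       # backward-generated final result
hr = torch.rand(1, 3, 64, 64)         # target frame
mid = torch.rand(1, 3, 64, 64)        # intermediate SR result

total = (perceptual_loss(sr_f, hr) + content_loss(sr_f, mid, hr)
         + sequence_loss(sr_f, sr_b))
```

In training, these terms would be combined with the adversarial loss using suitable weights.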
the main framework of the invention is based on a countermeasure generation network, and is an end-to-end video super-resolution method based on deep learning. Unlike prior art techniques that combine information from adjacent low resolution frames for video reconstruction. In addition, the method is used for the autonomous creation of the data set of the network training, and is different from the method which is commonly used at present and directly uses the high-resolution data set to degenerate to obtain the low-resolution data set, and the method directly obtains the high-resolution continuous frames and the low-resolution continuous frames of the training set.
Additionally, the present invention incorporates other image-enhancement techniques: after the intermediate super-resolution result is obtained, an edge-enhancement technique is applied to it. This technique improves on Laplacian edge enhancement; since simple edge enhancement can mistake noise for edges and amplify it, a noise mask is placed in the edge-enhancement module to learn the noise, and the learned noise is removed from the extracted edges.
Those skilled in the art will appreciate that, in addition to being implemented as pure computer-readable program code, the system and its various devices, modules, and units provided by the invention can be implemented entirely by logically programming the method steps in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Therefore, the system and its various devices, modules, and units can be regarded as hardware components; the devices, modules, and units included for realizing various functions can also be regarded as structures within those hardware components; and the means for performing the various functions can be regarded both as software modules implementing the method and as structures within hardware components.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (10)

1. A video super-resolution method based on a generative adversarial network and edge enhancement, characterized by comprising the following steps:
step S1: autonomously constructing a dataset for the generative adversarial network, and acquiring high-resolution consecutive frames and the corresponding low-resolution consecutive frames;
step S2: converting the high-resolution consecutive frames and the corresponding low-resolution consecutive frames from the RGB color space to the HSV color space;
step S3: establishing a generator network to obtain super-resolution consecutive frames;
step S4: performing authenticity discrimination on the super-resolution consecutive frames;
step S5: if the discrimination result is true, outputting the super-resolution consecutive frames; if the discrimination result is false, regenerating super-resolution consecutive frames for true/false discrimination.
2. The video super-resolution method based on the generative adversarial network and edge enhancement according to claim 1, wherein step S1 comprises the following sub-steps:
step S11: reading in the original video of an old film and the corresponding high-definition restored version;
step S12: converting the original video and the high-definition restored video into consecutive frame sequences;
step S13: rotating and cropping the high-resolution consecutive frames and the low-resolution consecutive frames.
3. The video super-resolution method based on the generative adversarial network and edge enhancement according to claim 2, wherein step S12 comprises the following sub-steps:
step S121: aligning the time axes of the original video and the corresponding high-definition restored version;
step S122: selecting the low-resolution video $video_{LR}$ within the period from start time t1 to end time t2 of the original video and converting it into consecutive frames;
step S123: selecting the high-resolution video $video_{HR}$ within the same period from start time t1 to end time t2 of the high-definition restored version and converting it into consecutive frames.
4. The video super-resolution method based on the generative adversarial network and edge enhancement according to claim 3, wherein there is no scene cut within the low-resolution video $video_{LR}$ and the high-resolution video $video_{HR}$ in step S12.
5. The video super-resolution method based on the generative adversarial network and edge enhancement according to claim 2, wherein the rotation and cropping parameters applied in step S13 to the high-resolution consecutive frames and the low-resolution consecutive frames must be consistent.
6. The video super-resolution method based on the generative adversarial network and edge enhancement according to claim 1, wherein step S3 comprises the following sub-steps:
step S31: the optical-flow network outputs the motion compensation $v_t$; the motion compensation $v_t$ is upsampled fourfold by linear interpolation to obtain the upsampled motion compensation $V_t$; the upsampled motion compensation $V_t$ and the super-resolution result of the previous frame $I_{t-1}^{SR}$ undergo a non-linear image warping operation to obtain the warped frame $\tilde{I}_{t-1}^{SR}$;
step S32: obtaining a super-resolution intermediate result using the super-resolution reconstruction network;
step S33: applying the Laplacian edge-enhancement network to the super-resolution intermediate result.
7. The video super-resolution method based on the generative adversarial network and edge enhancement according to claim 6, wherein step S32 comprises the following sub-steps:
step S321: upsampling the motion compensation $v_t$ fourfold by linear interpolation to obtain $V_t$;
step S322: applying a non-linear image warping operation to $V_t$ and the previous frame's super-resolution reconstruction result $I_{t-1}^{SR}$ to obtain the warped frame $\tilde{I}_{t-1}^{SR}$, whose shape is (batchsize, 4w, 4h, channel);
step S323: applying channel recombination to $\tilde{I}_{t-1}^{SR}$ to obtain the size-downsampled frame $\tilde{I}_{t-1}^{SR\downarrow}$, whose shape is (batchsize, w, h, 4×4×channel);
step S324: merging $\tilde{I}_{t-1}^{SR\downarrow}$ and $I_t^{LR}$ on the third (channel) axis, the shape being (batchsize, w, h, 4×4×channel + channel);
step S325: obtaining the super-resolution intermediate result $I_t^{SR'} = \Delta_t^{SR} + \mathrm{Bicubic}(I_t^{LR})$, wherein $\Delta_t^{SR}$ denotes the super-resolution result of the residual between the low-resolution frame and the high-resolution frame, $I_t^{LR}$ denotes the low-resolution frame, and $\mathrm{Bicubic}(\cdot)$ denotes bicubic upsampling.
8. The video super-resolution method based on the generative adversarial network and edge enhancement according to claim 6, wherein step S33 comprises the following sub-steps:
step S331: convolving the Laplacian operator $L$ with $I_t^{SR'}$ to obtain the image of abrupt pixel-value changes $I_t^{Lap} = L \otimes I_t^{SR'}$, wherein $L$ is the Laplacian mask, $\otimes$ is the convolution operation, and $I_t^{SR'}$ is the intermediate super-resolution result;
step S332: superimposing the edge $I_t^{Lap}$, extracted from the intermediate super-resolution result by the Laplacian operator, onto the intermediate super-resolution result $I_t^{SR'}$ to produce the sharpened image $I_t^{E} = I_t^{SR'} + I_t^{Lap}$;
step S333: performing post-processing denoising to obtain the final super-resolution result $I_t^{SR}$.
9. The video super-resolution method based on the generative adversarial network and edge enhancement according to claim 1, wherein step S4 comprises:
perceptual loss: a second-order norm (L2) loss between the final super-resolution result $I_t^{SR}$ and the target frame $I_t^{HR}$ at a network layer of VGG19,
$\mathcal{L}_{perc} = \| \phi(I_t^{SR}) - \phi(I_t^{HR}) \|_2^2$,
wherein $\phi(I_t^{SR})$ is the feature map of the final super-resolution result $I_t^{SR}$ on a particular convolutional layer of VGG19, and $\phi(I_t^{HR})$ is the feature map of the target frame $I_t^{HR}$ on the same convolutional layer of VGG19;
content loss: a term on the intermediate result is added in addition to the final result,
$\mathcal{L}_{content} = \| I_t^{SR} - I_t^{HR} \|_2^2 + \| I_t^{SR'} - I_t^{HR} \|_2^2$,
wherein $I_t^{SR}$ is the final super-resolution result, $I_t^{HR}$ is the target frame, and $I_t^{SR'}$ is the intermediate super-resolution result;
sequence loss: the forward-generated final super-resolution result $I_t^{SR,f}$ and the backward-generated final super-resolution result $I_t^{SR,b}$ should in theory be identical,
$\mathcal{L}_{seq} = \| I_t^{SR,f} - I_t^{SR,b} \|_2^2$.
10. A video super-resolution system based on a generative adversarial network and edge enhancement, characterized by comprising:
module M1: autonomously constructing a dataset for the generative adversarial network, and acquiring high-resolution consecutive frames and the corresponding low-resolution consecutive frames;
module M2: converting the high-resolution consecutive frames and the corresponding low-resolution consecutive frames from the RGB color space to the HSV color space;
module M3: establishing a generator network to obtain super-resolution consecutive frames;
module M4: performing authenticity discrimination on the super-resolution consecutive frames;
module M5: if the discrimination result is true, outputting the super-resolution consecutive frames; if the discrimination result is false, regenerating super-resolution consecutive frames for true/false discrimination.
CN202110340664.2A 2021-03-30 2021-03-30 Video super-resolution method and system based on generative adversarial network and edge enhancement Pending CN113077385A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110340664.2A CN113077385A (en) Video super-resolution method and system based on generative adversarial network and edge enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110340664.2A CN113077385A (en) Video super-resolution method and system based on generative adversarial network and edge enhancement

Publications (1)

Publication Number Publication Date
CN113077385A 2021-07-06

Family

ID=76611866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110340664.2A Pending 2021-03-30 Video super-resolution method and system based on generative adversarial network and edge enhancement

Country Status (1)

Country Link
CN (1) CN113077385A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109118431A (en) * 2018-09-05 2019-01-01 Wuhan University Video super-resolution reconstruction method based on multiple memories and mixture losses
CN111062867A (en) * 2019-11-21 2020-04-24 Zhejiang Dahua Technology Co., Ltd. Video super-resolution reconstruction method
CN111311490A (en) * 2020-01-20 2020-06-19 Shaanxi Normal University Video super-resolution reconstruction method based on multi-frame fusion optical flow
CN112001847A (en) * 2020-08-28 2020-11-27 Xuzhou Institute of Technology Method for generating high-quality images with a relativistic generative adversarial super-resolution reconstruction model
CN112215140A (en) * 2020-10-12 2021-01-12 Suzhou Tianbiyou Technology Co., Ltd. 3D signal processing method based on spatio-temporal adversarial learning

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116958203A (en) * 2023-08-01 2023-10-27 Beijing Zhicun Technology Co., Ltd. Image processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111028150B (en) Rapid space-time residual attention video super-resolution reconstruction method
CN109903228B (en) Image super-resolution reconstruction method based on convolutional neural network
CN114092330B (en) Light-weight multi-scale infrared image super-resolution reconstruction method
CN113139898B (en) Light field image super-resolution reconstruction method based on frequency domain analysis and deep learning
CN112734646B (en) Image super-resolution reconstruction method based on feature channel division
CN110136062B (en) Super-resolution reconstruction method combining semantic segmentation
CN109671023A Secondary reconstruction method for face image super-resolution
CN108269244B (en) Image defogging system based on deep learning and prior constraint
CN109785236B (en) Image super-resolution method based on super-pixel and convolutional neural network
CN110717868B (en) Video high dynamic range inverse tone mapping model construction and mapping method and device
CN112804561A Video frame interpolation method and device, computer equipment and storage medium
CN116152120B (en) Low-light image enhancement method and device integrating high-low frequency characteristic information
CN114331831A (en) Light-weight single-image super-resolution reconstruction method
CN112422870B (en) Deep learning video frame insertion method based on knowledge distillation
CN112200732B (en) Video deblurring method with clear feature fusion
CN113850718A (en) Video synchronization space-time super-resolution method based on inter-frame feature alignment
CN114972036A (en) Blind image super-resolution reconstruction method and system based on fusion degradation prior
CN113128517B (en) Tone mapping image mixed visual feature extraction model establishment and quality evaluation method
CN113077385A (en) Video super-resolution method and system based on countermeasure generation network and edge enhancement
Liu et al. Arbitrary-scale super-resolution via deep learning: A comprehensive survey
CN115496819B (en) Rapid coding spectral imaging method based on energy concentration characteristic
Wang et al. Super resolution for compressed screen content video
CN116668738A (en) Video space-time super-resolution reconstruction method, device and storage medium
CN116228550A (en) Image self-enhancement defogging algorithm based on generation of countermeasure network
CN115841523A (en) Double-branch HDR video reconstruction algorithm based on Raw domain

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 20210706)