CN108012157B - Method for constructing convolutional neural network for video coding fractional pixel interpolation - Google Patents

Method for constructing convolutional neural network for video coding fractional pixel interpolation

Info

Publication number
CN108012157B
CN108012157B · CN201711207766.7A
Authority
CN
China
Prior art keywords
neural network
convolutional neural
fractional pixel
video coding
pixel interpolation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711207766.7A
Other languages
Chinese (zh)
Other versions
CN108012157A (en)
Inventor
宋利 (Li Song)
张翰 (Han Zhang)
杨小康 (Xiaokang Yang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN201711207766.7A priority Critical patent/CN108012157B/en
Publication of CN108012157A publication Critical patent/CN108012157A/en
Application granted granted Critical
Publication of CN108012157B publication Critical patent/CN108012157B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117Filters, e.g. for pre-processing or post-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Discrete Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a method for constructing a convolutional neural network for video coding fractional pixel interpolation, which comprises the following steps: collecting images with different contents and resolutions to form an original training data set containing data of different types and coding complexities; preprocessing the original training data set to obtain training data that conform to the characteristics of video coding inter-frame prediction fractional pixel interpolation; constructing a deep convolutional neural network to obtain a network structure suited to video coding inter-frame prediction fractional pixel interpolation; and inputting the preprocessed data into the constructed convolutional neural network and training it with the original training data set as the corresponding ground truth. The invention enables the convolutional neural network to be trained smoothly, and the fractional pixels interpolated by the trained network satisfy the fractional pixel interpolation characteristics of video coding.

Description

Method for constructing convolutional neural network for video coding fractional pixel interpolation
Technical Field
The invention relates to a method in the technical field of image processing, and in particular to a convolutional neural network method suitable for fractional pixel interpolation in video coding inter-frame prediction.
Background
Inter-frame prediction is a key technology in video coding standards. By exploiting the similarity of video content between frames, it effectively removes the temporal redundancy of the video and thereby improves coding compression efficiency. At the same time, because of the discrete sampling performed during digitization, real object motion does not necessarily follow the sampling grid. To further improve the accuracy of motion prediction, object motion in video coding standards is expressed in units of fractional pixels. Pixel values at fractional pixel positions on the sampling grid do not actually exist, so in practice they must be interpolated from the pixel values at the actually existing integer positions.
However, the interpolation filters currently used in video coding to generate fractional pixels are hand-designed on the basis of a priori assumptions. Their parameters are fixed, and as video content grows richer and video resolution keeps increasing, such fixed-parameter filters are no longer suitable in all cases.
Deep learning fits massive amounts of data with a designed neural network to obtain a generally applicable model. Deep-learning-based methods have achieved major breakthroughs in semantic-level problems such as target tracking and pedestrian detection, and have markedly improved results in pixel-level problems such as image super-resolution.
Inter-frame prediction fractional pixel interpolation and image super-resolution are similar to a certain extent: both generate a large image from a genuinely existing small image at a given magnification. However, image super-resolution generates an entire high-resolution image from a low-resolution image, whereas inter-frame prediction fractional pixel interpolation generates only the remaining fractional-position pixels from the actually existing integer-position pixels and must guarantee that the integer-position pixels remain unchanged. In addition, for inter-frame prediction fractional pixel interpolation the pixels at fractional positions do not actually exist, so no real ground truth is available during the training of a convolutional neural network and the training cannot proceed normally.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a method for constructing a convolutional neural network suitable for video coding inter-frame prediction fractional pixel interpolation. It exploits the strong performance of convolutional neural networks on the image super-resolution problem while taking the characteristics of video coding inter-frame prediction fractional pixel interpolation into account, and designs both a convolutional neural network suited to this task and a preprocessing operation that allows the training to proceed smoothly, so that the objective quality of video coding reconstructed frames can be improved and the coding efficiency increased.
In order to achieve the above object, the method for constructing a convolutional neural network for video coding fractional pixel interpolation according to the present invention comprises:
collecting images with different contents and different resolutions to form an original training data set containing data with different types and different coding complexities;
preprocessing the collected original training data set to obtain training data that conform to the video coding inter-frame prediction fractional pixel interpolation characteristics and serve as the input data for training the convolutional neural network;
building a deep convolutional neural network, taking the fractional pixel interpolation characteristics of video coding into account, to obtain a convolutional neural network structure suitable for video coding inter-frame prediction fractional pixel interpolation;
and inputting the preprocessed data into the constructed convolutional neural network and training it with the original training data set as the corresponding ground truth, to obtain a convolutional neural network model suitable for video coding inter-frame prediction fractional pixel interpolation.
Preferably, the preprocessing operation is as follows:
a) down-sampling the images in the original training data set by the magnification corresponding to the fractional pixel positions to be generated by interpolation, to obtain the low-resolution training data used in step b);
b) compression-coding the low-resolution training data according to the still-image coding configuration of the video coding standard, to obtain the low-resolution coded reconstructed images used in step c);
c) up-sampling the low-resolution coded reconstructed images by the magnification used in step a), restoring them to the original image size, to obtain the input data for training the convolutional neural network.
More preferably, in c), the upsampling operation on the low-resolution encoded reconstructed image ensures that the pixel values of the integer pixel positions of the high-resolution image after the upsampling are consistent with the low-resolution encoded reconstructed image before the upsampling.
Preferably, in the building of the deep convolutional neural network, the built deep convolutional neural network comprises 20 weight layers and 1 weight masking layer; for the weight masking layer, W_I is the weight of the integer pixel positions and W_H is the weight of the fractional pixel positions, all fractional pixel positions sharing one weight.
More preferably, in the video coding inter-frame prediction fractional pixel interpolation, the pixel values at integer pixel positions remain unchanged and only the fractional pixel positions are generated.
Compared with the prior art, the invention has the beneficial effects that:
In addition to exploiting the strong capability of deep convolutional neural networks to extract features from massive data, the invention takes into account the special characteristics of video coding data and the ways in which inter-frame prediction fractional pixel interpolation differs from image super-resolution. It redesigns the deep convolutional neural network and designs a matching preprocessing operation so that the training of the convolutional neural network can proceed smoothly, thereby obtaining a convolutional neural network model suitable for video coding fractional pixel interpolation, improving the objective quality of the compression-coded reconstructed video, and increasing the video coding efficiency.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of a method of one embodiment of the present invention;
FIG. 2 is a schematic diagram of a convolutional neural network structure according to an embodiment of the present invention;
FIG. 3 is a diagram of integer pixel location, fractional-half pixel location, and fractional-quarter pixel location according to an embodiment of the invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications can be made by persons skilled in the art without departing from the spirit of the invention; all such variations and modifications fall within the scope of the present invention.
The invention provides a method for constructing a convolutional neural network for fractional pixel interpolation of video coding, which comprises the following design ideas as shown in figure 1:
collecting images with different contents and different resolutions to obtain a training data set containing data with different types and different coding complexities;
Preprocessing the collected training data set to obtain the input data for training the convolutional neural network. The preprocessing operation specifically comprises the following steps (a minimal code sketch is given after the list):
a) down-sampling the images in the original training data set by the magnification corresponding to the fractional pixel positions to be generated by interpolation, to obtain the low-resolution training data used in step b);
b) compression-coding the low-resolution training data according to the still-image coding configuration of the video coding standard, to obtain the low-resolution coded reconstructed images used in step c);
c) up-sampling the low-resolution coded reconstructed images by the magnification used in step a), restoring them to the original image size, to obtain the input data for training the convolutional neural network.
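As a concrete illustration of steps a)-c), the following Python sketch outlines the preprocessing pipeline for one-half pixel interpolation (factor 2). The function names are illustrative and not from the patent; the HEVC All-Intra compression of step b) is represented by a placeholder that would wrap an external HM encoder/decoder, and the fractional-position fill uses a crude nearest-neighbour stand-in for the DCT-based filter described later.

import numpy as np

def downsample(image: np.ndarray, factor: int = 2) -> np.ndarray:
    """Step a): keep one sample per factor x factor block (simple decimation)."""
    return image[::factor, ::factor]

def encode_decode_hevc_ai(image: np.ndarray, qp: int = 32) -> np.ndarray:
    """Step b): placeholder for HEVC All-Intra compression and reconstruction of the
    low-resolution image (in practice this would wrap an external HM encoder/decoder);
    the image is returned unchanged here so the sketch stays runnable."""
    return image

def upsample_keep_integers(low_res_rec: np.ndarray, factor: int = 2) -> np.ndarray:
    """Step c): enlarge back to the original size, copying the integer-position samples
    unchanged; fractional positions are filled by a crude nearest-neighbour stand-in
    for the DCT-based filter described later."""
    h, w = low_res_rec.shape
    high = np.zeros((h * factor, w * factor), dtype=low_res_rec.dtype)
    for dy in range(factor):
        for dx in range(factor):
            high[dy::factor, dx::factor] = low_res_rec
    return high  # high[::factor, ::factor] equals low_res_rec exactly

def preprocess(original: np.ndarray, factor: int = 2, qp: int = 32) -> np.ndarray:
    """Full preprocessing chain a) -> b) -> c) producing the CNN training input."""
    low = downsample(original, factor)
    rec = encode_decode_hevc_ai(low, qp)
    return upsample_keep_integers(rec, factor)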
Establishing a deep convolutional neural network suitable for video coding inter-frame prediction fractional pixel interpolation, taking an image obtained through preprocessing operation as the input of the network, simultaneously taking a corresponding image in an original training data set as a corresponding true value, setting training parameters and training the convolutional neural network;
and performing fractional pixel interpolation operation by using the convolutional neural network model obtained by training, and realizing video coding inter-frame prediction fractional pixel interpolation based on the convolutional neural network.
In preprocessing step b), the down-sampled low-resolution image is compression-coded according to the still-image coding configuration of the video coding standard, so that the reconstruction of the low-resolution image becomes an image carrying the characteristics of video coding data.
In preprocessing step c), the up-sampling of the compression-coded low-resolution reconstructed image must ensure that the pixel values at the integer pixel positions of the up-sampled high-resolution image are consistent with the pixel values of the low-resolution image before up-sampling; only the pixel values at the fractional pixel positions are generated.
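This constraint can be written as a simple check (illustrative helper, assuming a 2x layout where even rows and columns hold the integer positions):

import numpy as np

def integer_positions_preserved(high_res: np.ndarray,
                                low_res_rec: np.ndarray,
                                factor: int = 2) -> bool:
    """True if the up-sampled image carries the low-resolution coded reconstruction
    unchanged at every integer pixel position; only fractional positions are new."""
    return np.array_equal(high_res[::factor, ::factor], low_res_rec)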
On the basis of the image super-resolution convolutional neural network, the invention takes into account the inherent characteristic of video coding fractional pixel interpolation, namely that the integer-position pixels remain unchanged and only the fractional-position pixels are generated. The convolutional neural network is redesigned accordingly and, together with the matching preprocessing operation, can be trained smoothly; the fractional pixels obtained by interpolation with the trained convolutional neural network satisfy the fractional pixel interpolation characteristics of video coding, and using them for fractional pixel interpolation improves the video coding efficiency. In addition, the convolutional neural network obtained by the invention can generate the pixel values of all fractional pixel positions simultaneously in one operation.
The invention is applied here to the latest video coding standard, High Efficiency Video Coding (HEVC). A construction method for a convolutional neural network suitable for HEVC inter-frame prediction one-half pixel interpolation is introduced, and the concrete implementation details, such as data preprocessing and the construction of the convolutional neural network structure, are explained in detail. The invention is of course also applicable to other coding standards.
1. Data preprocessing process
In the data preprocessing step that compression-codes the low-resolution images, the down-sampled low-resolution images are coded with the HEVC All-Intra (AI) configuration.
For the up-sampling of the low-resolution compression-coded reconstructed images in the preprocessing, an interpolation filter based on the discrete cosine transform (DCT) is adopted. For the one-half pixel positions, this DCT-based interpolation filter is an 8-tap filter whose tap coefficients are shown in Table 1.
TABLE 1 Tap coefficients of the DCT-based interpolation filter

Index i       -3   -2   -1    0    1    2    3    4
hfilter[i]    -1    4  -11   40   40  -11    4   -1
The one-half pixel position pixels in FIG. 3 are generated with the DCT-based interpolation filter as follows:

b_{0,0} = ( Σ_{i=-3}^{4} A_{i,0} · hfilter[i] ) >> (B − 8)    (1)
h_{0,0} = ( Σ_{j=-3}^{4} A_{0,j} · hfilter[j] ) >> (B − 8)    (2)
j_{0,0} = ( Σ_{j=-3}^{4} b_{0,j} · hfilter[j] ) >> 6          (3)

where b_{0,0}, h_{0,0} and j_{0,0} denote the pixel values at the one-half pixel positions, A_{i,j} denotes the pixel values at the integer pixel positions, hfilter[i] denotes the tap coefficients of the DCT-based interpolation filter, and B denotes the bit depth of the pixel values.
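A small Python sketch of the horizontal half-pel filtering of equation (1) with the Table 1 coefficients is given below; border handling and the final normalisation/clipping are simplified relative to the HEVC specification, and the function name is illustrative.

import numpy as np

# Tap coefficients hfilter[i], i = -3 .. 4, of the DCT-based half-sample filter (Table 1).
HFILTER = np.array([-1, 4, -11, 40, 40, -11, 4, -1], dtype=np.int64)

def half_pel_row(A: np.ndarray, bit_depth: int = 8) -> np.ndarray:
    """Horizontal one-half pixel values b for one row of integer samples A, following
    equation (1). Borders are handled by edge replication for brevity (HEVC pads the
    reference picture); the later >>6 normalisation and clipping are omitted."""
    shift = bit_depth - 8                      # B - 8
    padded = np.pad(A.astype(np.int64), (3, 4), mode="edge")
    b = np.empty(A.shape[0], dtype=np.int64)
    for x in range(A.shape[0]):
        window = padded[x:x + 8]               # A[x-3] .. A[x+4]
        b[x] = int(np.dot(window, HFILTER)) >> shift
    return b

# Example: a constant row stays flat; each half-pel value is 100 * 64 = 6400
# before the later >>6 normalisation step.
row = np.array([100, 100, 100, 100, 100, 100, 100, 100], dtype=np.int64)
print(half_pel_row(row))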
2. Convolutional neural network structure construction
The invention adopts the network of "Accurate Image Super-Resolution Using Very Deep Convolutional Networks", published by J. Kim et al. at the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), as the basic framework, and adds a weight masking layer to the original framework, where W_I is the weight of the pixel values at integer positions and W_H is the weight of the pixel values at one-half pixel positions.
As shown in FIG. 2, the convolutional neural network structure constructed in this embodiment comprises 20 convolutional layers and 1 weight masking layer. Each convolutional layer, except the first and the last, contains 64 different filters, each of size 3 × 3 × 64. The first convolutional layer contains 64 filters of size 3 × 3 × 1, and the last convolutional layer contains 1 filter of size 3 × 3 × 64. For the weight masking layer, integer pixel positions and fractional pixel positions use different weights, where W_I is the weight of the integer pixel positions and W_H is the weight of the one-half pixel positions. The input of the convolutional neural network in this embodiment is the high-resolution image of the target size obtained by preprocessing the low-resolution image. The convolutional neural network in this embodiment predicts the residual image between the finally output high-resolution image and the initially input pre-processed image, defined as follows:

R = Y_H − X_ILR    (4)

where Y_H denotes the finally output high-resolution image and X_ILR denotes the initially input pre-processed image.
The residual image predicted by the convolutional neural network is added to the input pre-processed image to obtain the finally output high-resolution image.
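The structure can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions: the patent does not give the exact formulation of the weight masking layer, so it is modelled here as two learnable scalars, W_I applied at integer positions and W_H shared by all half-pel positions, multiplying the predicted residual; class and variable names are illustrative.

import torch
import torch.nn as nn

class WeightMask(nn.Module):
    """Weight masking layer: one learnable weight W_I shared by all integer pixel
    positions and one learnable weight W_H shared by all half-pel positions
    (assumed formulation: element-wise scaling of the residual)."""
    def __init__(self):
        super().__init__()
        self.w_int = nn.Parameter(torch.zeros(1))   # W_I
        self.w_half = nn.Parameter(torch.ones(1))   # W_H

    def forward(self, residual: torch.Tensor) -> torch.Tensor:
        _, _, h, w = residual.shape
        is_int = torch.zeros(1, 1, h, w, dtype=torch.bool, device=residual.device)
        is_int[..., ::2, ::2] = True                # even rows/cols = integer positions (2x case)
        return residual * torch.where(is_int, self.w_int, self.w_half)

class InterpolationCNN(nn.Module):
    """20 convolutional layers (VDSR-style, 64 filters of 3x3) followed by the weight
    masking layer; the network predicts the residual R = Y_H - X_ILR."""
    def __init__(self, depth: int = 20, channels: int = 64):
        super().__init__()
        layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(channels, 1, 3, padding=1))
        self.body = nn.Sequential(*layers)
        self.mask = WeightMask()

    def forward(self, x_ilr: torch.Tensor) -> torch.Tensor:
        residual = self.mask(self.body(x_ilr))      # masked residual prediction
        return x_ilr + residual                     # Y_H = X_ILR + R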
3. Training convolutional neural networks
The training of the convolutional neural network adopts the Euclidean distance as the loss function:

L(θ) = (1/N) · Σ_{i=1}^{N} || F(X_i; θ) − Y_i ||²    (5)

where θ denotes the set of parameters that the convolutional neural network needs to learn, X_i denotes a training image, Y_i denotes the corresponding ground-truth image in the original training data set, and F(X_i; θ) denotes the finally output high-resolution image. Since the convolutional neural network in this embodiment predicts the residual image, F(X_i; θ) in equation (5) is expressed as

F(X_i; θ) = X_i + R(X_i; θ)    (6)

where X_i denotes the initially input pre-processed image and R(X_i; θ) denotes the residual image predicted by the network.
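The training then reduces to a standard residual-learning step; a minimal sketch follows, in which the optimizer and learning rate are assumptions and InterpolationCNN refers to the illustrative class above.

import torch
import torch.nn.functional as F

def training_step(model, x_ilr, y_true, optimizer):
    """One optimisation step with the Euclidean loss of equation (5).
    x_ilr:  pre-processed input images X_i, shape (N, 1, H, W)
    y_true: corresponding original (ground-truth) images Y_i"""
    optimizer.zero_grad()
    y_pred = model(x_ilr)              # F(X_i; theta) = X_i + R(X_i; theta), eq. (6)
    loss = F.mse_loss(y_pred, y_true)  # mean squared Euclidean distance, eq. (5)
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative usage (hyper-parameters assumed, not from the patent):
# model = InterpolationCNN()
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# loss = training_step(model, x_batch, y_batch, optimizer)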
The convolutional neural network model suitable for the video coding inter-frame prediction fractional pixel interpolation is obtained through the training.
4. Effects of the implementation
The convolutional neural network model obtained by training in this embodiment is applied to the HEVC coding framework, and both the modified encoder and the standard HEVC encoder are used to code the test sequences. The test sequences are listed in Table 2; all of them are in YUV 4:2:0 format with a bit depth of 8.
Table 2 details of the test sequences
The HEVC encoder used in this embodiment is HM-16.7, the coding configuration is the low-delay P (LDP) common test configuration, and the quantization parameters (QP) used in coding are 22, 27, 32 and 37.
Under the above implementation conditions, the coding test results shown in Table 3 are obtained. The performance index used in Table 3 is BD-Rate, which indicates the percentage of bit rate saved, at the same peak signal-to-noise ratio (PSNR), when inter-frame prediction one-half pixel interpolation is performed with the convolutional neural network trained in this embodiment instead of the standard HEVC encoder. As shown in Table 3, under these conditions the average BD-Rate of the Y, U, V components is −0.9%, −0.1%, respectively. The gain is most significant for the sequence BasketballPass, where the Y, U, V components reach −2.4%, −0.1%, −1.6%. Table 3 shows that, compared with the standard HEVC encoder, performing one-half pixel interpolation of the luma Y component with the trained convolutional neural network brings a clear improvement in coding efficiency. In addition, because the encoder uses a technique that predicts the chroma components from the luma component, the chroma components also obtain a certain coding-performance improvement as the reconstruction quality of the luma component improves.
TABLE 3 test sequence coding Performance (BD-Rate)
To further show that the convolutional neural network constructed in the present invention is better suited to fractional pixel interpolation for video coding inter-frame prediction, Table 4 gives the test results obtained when one-half pixel interpolation is performed directly with a convolutional neural network trained for the image super-resolution problem, again compared against the standard HEVC encoder. As Table 4 shows, directly using the image super-resolution convolutional neural network for fractional pixel interpolation causes a significant coding-performance loss.
TABLE 4 convolutional neural network coding test results using image super resolution (BD-Rate)
In conclusion, the invention designs a dedicated convolutional neural network for video coding inter-frame prediction fractional pixel interpolation together with a matching data preprocessing process, so that the training of the convolutional neural network can proceed smoothly and the fractional pixels generated with the trained network satisfy the specific requirements of fractional pixel interpolation. Fractional pixel interpolation with the convolutional neural network obtained by the invention achieves a notable improvement in coding performance and is better suited to the fractional pixel interpolation part of video coding inter-frame prediction.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims (5)

1. A method for constructing a convolutional neural network for video coding fractional pixel interpolation, characterized by comprising the following steps:
collecting images with different contents and different resolutions to form an original training data set containing data with different types and different coding complexities;
preprocessing the collected original training data set to obtain training data that conform to the video coding inter-frame prediction fractional pixel interpolation characteristics and serve as the input data for training the convolutional neural network;
building a deep convolutional neural network, taking the fractional pixel interpolation characteristics of video coding into account, to obtain a convolutional neural network structure suitable for video coding inter-frame prediction fractional pixel interpolation;
inputting the preprocessed data into the built convolutional neural network and training it with the original training data set as the corresponding ground truth, to obtain a convolutional neural network model suitable for video coding inter-frame prediction fractional pixel interpolation;
the preprocessing operation comprising the following steps:
a) down-sampling the images in the original training data set by the magnification corresponding to the fractional pixel positions to be generated by interpolation, to obtain the low-resolution training data used in step b);
b) coding the low-resolution training data according to the still-image coding configuration of the video coding standard, to obtain the low-resolution coded reconstructed images used in step c);
c) up-sampling the low-resolution coded reconstructed images by the magnification used in step a) with an interpolation filter based on the discrete cosine transform, and restoring them to the original image size, to obtain the input data for training the convolutional neural network.
2. The method for constructing a convolutional neural network for video coding fractional pixel interpolation according to claim 1, characterized in that: in step c), the up-sampling of the low-resolution coded reconstructed image ensures that the pixel values at the integer pixel positions of the up-sampled high-resolution image are consistent with the low-resolution coded reconstructed image before up-sampling.
3. The method for constructing a convolutional neural network for video coding fractional pixel interpolation according to any one of claims 1-2, characterized in that: in the building of the deep convolutional neural network, the built deep convolutional neural network comprises 20 weight layers and 1 weight masking layer; for the weight masking layer, W_I is the weight of the integer pixel positions and W_H is the weight of the fractional pixel positions, all fractional pixel positions sharing one weight.
4. The method for constructing a convolutional neural network for video coding fractional pixel interpolation according to claim 3, characterized in that: in the video coding inter-frame prediction fractional pixel interpolation, the pixel values at integer pixel positions are unchanged and only the fractional pixel positions are generated.
5. Use of a convolutional neural network model constructed by the method of any one of claims 1 to 4, characterized in that: the convolutional neural network model is applied to fractional pixel interpolation to realize convolutional-neural-network-based video coding inter-frame prediction fractional pixel interpolation.
CN201711207766.7A 2017-11-27 2017-11-27 Method for constructing convolutional neural network for video coding fractional pixel interpolation Active CN108012157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711207766.7A CN108012157B (en) 2017-11-27 2017-11-27 Method for constructing convolutional neural network for video coding fractional pixel interpolation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711207766.7A CN108012157B (en) 2017-11-27 2017-11-27 Method for constructing convolutional neural network for video coding fractional pixel interpolation

Publications (2)

Publication Number Publication Date
CN108012157A CN108012157A (en) 2018-05-08
CN108012157B true CN108012157B (en) 2020-02-04

Family

ID=62054016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711207766.7A Active CN108012157B (en) 2017-11-27 2017-11-27 Method for constructing convolutional neural network for video coding fractional pixel interpolation

Country Status (1)

Country Link
CN (1) CN108012157B (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502954B (en) * 2018-05-17 2023-06-16 杭州海康威视数字技术股份有限公司 Video analysis method and device
CN110794255B (en) * 2018-08-01 2022-01-18 北京映翰通网络技术股份有限公司 Power distribution network fault prediction method and system
CN110794254B (en) * 2018-08-01 2022-04-15 北京映翰通网络技术股份有限公司 Power distribution network fault prediction method and system based on reinforcement learning
CN110933432A (en) * 2018-09-19 2020-03-27 珠海金山办公软件有限公司 Image compression method, image decompression method, image compression device, image decompression device, electronic equipment and storage medium
CN111010568B (en) * 2018-10-06 2023-09-29 华为技术有限公司 Training method and device of interpolation filter, video image coding and decoding method and coder-decoder
CN109361919A (en) * 2018-10-09 2019-02-19 四川大学 A kind of image coding efficiency method for improving combined super-resolution and remove pinch effect
CN109525859B (en) * 2018-10-10 2021-01-15 腾讯科技(深圳)有限公司 Model training method, image sending method, image processing method and related device equipment
KR102525578B1 (en) * 2018-10-19 2023-04-26 삼성전자주식회사 Method and Apparatus for video encoding and Method and Apparatus for video decoding
CN109451308B (en) * 2018-11-29 2021-03-09 北京市商汤科技开发有限公司 Video compression processing method and device, electronic equipment and storage medium
CN109785279B (en) * 2018-12-28 2023-02-10 江苏师范大学 Image fusion reconstruction method based on deep learning
CN111711817B (en) * 2019-03-18 2023-02-10 四川大学 HEVC intra-frame coding compression performance optimization method combined with convolutional neural network
CN111800630A (en) * 2019-04-09 2020-10-20 Tcl集团股份有限公司 Method and system for reconstructing video super-resolution and electronic equipment
CN110072119B (en) * 2019-04-11 2020-04-10 西安交通大学 Content-aware video self-adaptive transmission method based on deep learning network
CN110177282B (en) * 2019-05-10 2021-06-04 杭州电子科技大学 Interframe prediction method based on SRCNN
CN110099280B (en) * 2019-05-24 2020-05-08 浙江大学 Video service quality enhancement method under limitation of wireless self-organizing network bandwidth
CN110519606B (en) * 2019-08-22 2021-12-07 天津大学 Depth video intra-frame intelligent coding method
CN110493596B (en) * 2019-09-02 2021-09-17 西北工业大学 Video coding system and method based on neural network
CN110572710B (en) * 2019-09-25 2021-09-28 北京达佳互联信息技术有限公司 Video generation method, device, equipment and storage medium
US11445198B2 (en) * 2020-09-29 2022-09-13 Tencent America LLC Multi-quality video super resolution with micro-structured masks
CN112601095B (en) * 2020-11-19 2023-01-10 北京影谱科技股份有限公司 Method and system for creating fractional interpolation model of video brightness and chrominance
CN113365079B (en) * 2021-06-01 2023-05-30 闽南师范大学 Super-resolution network-based video coding sub-pixel motion compensation method
CN113822801B (en) * 2021-06-28 2023-08-18 浙江工商大学 Compressed video super-resolution reconstruction method based on multi-branch convolutional neural network
CN114677652B (en) * 2022-05-30 2022-09-16 武汉博观智能科技有限公司 Illegal behavior monitoring method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104112263A (en) * 2014-06-28 2014-10-22 南京理工大学 Method for fusing full-color image and multispectral image based on deep neural network
CN106204449A (en) * 2016-07-06 2016-12-07 安徽工业大学 A kind of single image super resolution ratio reconstruction method based on symmetrical degree of depth network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3259920A1 (en) * 2015-02-19 2017-12-27 Magic Pony Technology Limited Visual processing using temporal and spatial interpolation

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104112263A (en) * 2014-06-28 2014-10-22 南京理工大学 Method for fusing full-color image and multispectral image based on deep neural network
CN106204449A (en) * 2016-07-06 2016-12-07 安徽工业大学 A kind of single image super resolution ratio reconstruction method based on symmetrical degree of depth network

Also Published As

Publication number Publication date
CN108012157A (en) 2018-05-08

Similar Documents

Publication Publication Date Title
CN108012157B (en) Method for constructing convolutional neural network for video coding fractional pixel interpolation
Li et al. Learning convolutional networks for content-weighted image compression
CN107463989B (en) A kind of image based on deep learning goes compression artefacts method
Liu et al. Deep learning-based technology in responses to the joint call for proposals on video compression with capability beyond HEVC
Cheng et al. Performance comparison of convolutional autoencoders, generative adversarial networks and super-resolution for image compression
CN1719735A (en) Method or device for coding a sequence of source pictures
DE202012013410U1 (en) Image compression with SUB resolution images
CN110099280B (en) Video service quality enhancement method under limitation of wireless self-organizing network bandwidth
Zhang et al. Video compression artifact reduction via spatio-temporal multi-hypothesis prediction
Xiong et al. Sparse spatio-temporal representation with adaptive regularized dictionary learning for low bit-rate video coding
Wang et al. Multi-scale convolutional neural network-based intra prediction for video coding
US11962786B2 (en) Multi-stage block coding
CN111711817A (en) HEVC intra-frame coding compression performance optimization research combined with convolutional neural network
Hu et al. Fvc: An end-to-end framework towards deep video compression in feature space
CN114466192A (en) Image/video super-resolution
Lian et al. Reversing demosaicking and compression in color filter array image processing: Performance analysis and modeling
Wang et al. Neural network-based enhancement to inter prediction for video coding
CN116437102B (en) Method, system, equipment and storage medium for learning universal video coding
CN112637596B (en) Code rate control system
CN114202463A (en) Video super-resolution method and system for cloud fusion
CN101360236B (en) Wyner-ziv video encoding and decoding method
CN112584158B (en) Video quality enhancement method and system
CN112601095A (en) Method and system for creating fractional interpolation model of video brightness and chrominance
CN101389032A (en) Intra-frame predictive encoding method based on image value interposing
CN113709483B (en) Interpolation filter coefficient self-adaptive generation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant