CN106686472B - High frame-rate video generation method and system based on deep learning - Google Patents

High frame-rate video generation method and system based on deep learning

Info

Publication number
CN106686472B
CN106686472B (application CN201611241691.XA)
Authority
CN
China
Prior art keywords
frame
video
neural networks
convolutional neural
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611241691.XA
Other languages
Chinese (zh)
Other versions
CN106686472A
Inventor
王兴刚
罗浩
姜玉静
刘文予
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201611241691.XA
Publication of CN106686472A
Application granted
Publication of CN106686472B
Legal status: Active

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/01Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0127Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level by changing the field or frame frequency of the incoming video signal, e.g. frame rate converter

Abstract

The invention discloses a high frame-rate video generation method based on deep learning, comprising: generating a training sample set from one or more original high frame-rate video segments; training a dual-channel convolutional neural network model with the multiple video-frame sets in the training sample set to obtain an optimized dual-channel convolutional neural network, the model being a convolutional neural network formed by fusing two convolutional channels; and using the optimized dual-channel convolutional neural network to generate, from any two adjacent video frames of a low frame-rate video, an interpolated frame between those two frames, thereby generating a video whose frame rate is higher than that of the low frame-rate video. The whole process of the method is end to end and requires no post-processing of the video frames; the frame-rate conversion effect is good, the synthesized video is highly fluent, and the method is robust to problems such as jitter during video capture and scene changes.

Description

High frame-rate video generation method and system based on deep learning
Technical field
The invention belongs to the technical field of computer vision, and more particularly relates to a high frame-rate video generation method and system based on deep learning.
Background art
With the development of science and technology, it has become increasingly convenient for people to acquire video. Owing to hardware limitations, however, most video is captured by non-professional equipment, and its frame rate is generally only 24-30 fps. High frame-rate video is smoother and brings a better visual experience. If people upload high frame-rate video directly, traffic consumption increases and so does their cost. If they upload low frame-rate video instead, frames are inevitably lost during transmission because of network conditions, and the larger the video, the more prone it is to this phenomenon, so the quality of the video at the far end cannot be effectively guaranteed, which greatly degrades the user experience. A reasonable way of post-processing uploaded video at the far end is therefore needed, so that the quality of the video can meet people's needs and further improve their experience.
Summary of the invention
In view of the above defects or improvement requirements of the prior art, the present invention provides a high frame-rate video generation method based on deep learning. Its purpose is to convert low frame-rate video into high frame-rate video, thereby solving the technical problem that frame loss of low frame-rate video during network transmission degrades video quality and harms the user experience.
To achieve the above object, according to one aspect of the present invention, a high frame-rate video generation method based on deep learning is provided, comprising the following steps:
(1) generating a training sample set from one or more original high frame-rate video segments, the training sample set comprising multiple video-frame sets, each video-frame set containing two training frames and one control frame; the two training frames are two video frames of the high frame-rate video segment separated by one or more frames, and the control frame is any one of the frames lying between the two training frames; the frame rate of the high frame-rate video segment is higher than a set frame-rate threshold;
(2) training a dual-channel convolutional neural network model with the multiple video-frame sets of the training sample set to obtain an optimized dual-channel convolutional neural network; the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels: the two channels respectively receive the two video frames of an input video-frame set and separately convolve them, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regressing the predicted frame against the control frame of the video-frame set;
(3) using the optimized dual-channel convolutional neural network to generate, from any two adjacent video frames of a low frame-rate video, an interpolated frame between those two frames, thereby generating a video whose frame rate is higher than that of the low frame-rate video.
In one embodiment of the present invention, each convolutional channel in the dual-channel convolutional neural network model contains k convolutional layers, where k > 0, each convolutional layer being described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes the convolution operation, F_{i-1} is the output of layer i-1, Z_i(Y) is the output of the convolution at layer i, W_i is the convolution kernel of layer i, and B_i is the bias of layer i.
In one embodiment of the present invention, each of the first k-1 convolutional layers in a convolutional channel is followed by a ReLU activation layer to preserve the sparsity of the network:
F_i(Y) = max(0, Z_i(Y)).
In one embodiment of the present invention, the feature response maps of the two video frames obtained after the last convolutional layer are fused by element-wise addition.
In one embodiment of the present invention, the feature response map obtained from the fusion operation is passed through a Sigmoid activation layer that maps the pixel values of the picture into [0, 1]:
F(Y) = 1 / (1 + e^(-Z(Y))).
In one embodiment of the present invention, convolution kernels are initialized from a Gaussian distribution with mean 0 and standard deviation 1, biases are initialized to 0, and the base learning rate is initialized to 1e-6; the base learning rate is reduced by a factor of 10 after m iteration epochs, where m is a preset value.
In one embodiment of the present invention, training the dual-channel convolutional neural network model by regressing the predicted frame against the control frame of the video-frame set specifically comprises:
training the dual-channel convolutional neural network with the error backpropagation algorithm, using the error between the predicted frame and the control frame and taking the least-squares error as the optimization objective:
L = (1/n) * Σ_{i=1}^{n} ||Y_i - Ŷ_i||^2
where i denotes the i-th sample picture, n is the size of the training sample set, Y_i is the video frame predicted by the network, and Ŷ_i is the ground truth of the corresponding video frame.
In one embodiment of the present invention, k is 3; the first convolutional layer has 64 kernels of size 9*9 with stride 1 and padding 4, padding being the number of rings of zeros added around the feature map; the second convolutional layer has 32 kernels of size 1*1 with stride 1 and padding 0; the third convolutional layer has 3 kernels of size 5*5 with stride 1 and padding 2.
According to another aspect of the present invention, a high frame-rate video generation system based on deep learning is also provided, comprising a training sample set generation module, a dual-channel convolutional neural network optimization module and a high frame-rate video generation module, wherein:
The training sample set generation module is configured to generate a training sample set from one or more high frame-rate video segments; the training sample set comprises multiple video-frame sets, each video-frame set containing two training frames and one control frame; the two training frames are two video frames of the high frame-rate video segment separated by one or more frames, and the control frame is any one of the frames lying between the two training frames; the frame rate of the high frame-rate video segment is higher than a set frame-rate threshold;
The dual-channel convolutional neural network optimization module is configured to train a dual-channel convolutional neural network model with the multiple video-frame sets of the training sample set to obtain an optimized dual-channel convolutional neural network; the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two channels: the two channels respectively receive the two video frames of an input video-frame set and separately convolve them, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regressing the predicted frame against the control frame of the video-frame set;
The high frame-rate video generation module is configured to use the optimized dual-channel convolutional neural network to generate, from any two adjacent video frames of a low frame-rate video, an interpolated frame between those two frames, thereby generating a video whose frame rate is higher than that of the low frame-rate video.
In one embodiment of the present invention, each convolutional channel in the dual-channel convolutional neural network model contains k convolutional layers, where k > 0, each convolutional layer being described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes the convolution operation, F_{i-1} is the output of layer i-1, Z_i(Y) is the output of the convolution at layer i, W_i is the convolution kernel of layer i, and B_i is the bias of layer i.
In general, compared with the prior art, the above technical solutions contemplated by the present invention achieve the following technical effects:
(1) The feature extraction and the frame prediction of the invention are obtained through supervised learning on training samples, without manual intervention, and can better fit spatial-difference information in scenarios with large-scale data;
(2) The whole process of the invention is end to end; it learns the model parameters through the self-learning ability of the convolutional neural network, which is concise and efficient, and it overcomes the drawback that traditional techniques for video frame-rate conversion are time-consuming, labor-intensive and of limited effect.
Brief description of the drawings
Fig. 1 is a flowchart of the deep-learning-based video frame-rate conversion method of the invention, where F_i denotes the output of layer i, Y_{t-1}, Y_t and Y_{t+1} denote three consecutive video frames, Y_t serves as the ground truth for computing the error, and Prediction denotes the video frame predicted by the network.
Detailed description of the embodiments
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict.
The technical terms of the invention are first explained and illustrated below:
Convolutional neural network (CNN): a neural network that can be used for tasks such as image classification and regression. Its particularity lies in two aspects: on the one hand, the connections between its neurons are not fully connected; on the other hand, the weights of connections between certain neurons within the same layer are shared. The network is usually composed of convolutional layers, pooling layers and fully connected layers. The convolutional and pooling layers are responsible for extracting hierarchical features of the image, and the fully connected layers are responsible for classifying or regressing on the extracted features. The parameters of the network include the convolution kernels and the parameters and biases of the fully connected layers, and they can be learned from data by the backpropagation algorithm.
Backpropagation algorithm (BP): a common method for training artificial neural networks, used in combination with an optimization method (such as gradient descent). The method computes the gradient of the loss function with respect to all weights in the network; this gradient is fed back to the optimization method to update the weights so as to minimize the loss function. The algorithm mainly includes two phases: forward propagation of activations, and backpropagation of errors with weight updates.
With the arrival of the big-data era, video databases keep growing in scale, and solving this problem becomes ever more urgent. Deep neural networks can analyze data in a way that approximates the working of the human brain, and in recent years deep learning has been applied successfully in every field of computer vision; yet the problem of video frame-rate conversion has seen little research, and traditional frame-rate conversion methods are complicated and costly in time and labor. The present invention therefore proposes a deep-learning-based video frame-rate conversion method. The whole process of this method is end to end, simple and efficient, and it is robust to problems such as video jitter and scene changes.
As shown in Fig. 1, the deep-learning-based video frame-rate conversion method of the present invention may comprise the following steps:
(1) generating a training sample set from one or more original high frame-rate video segments, the training sample set comprising multiple video-frame sets, each video-frame set containing two training frames and one control frame; the two training frames are two video frames of the high frame-rate video segment separated by one or more frames, and the control frame is any one of the frames lying between the two training frames; the frame rate of the high frame-rate video segment is higher than a set frame-rate threshold;
Specifically, video-frame sets can be extracted from the high frame-rate video segments in a certain proportion to obtain the training sample set;
The training sample set is composed of multiple video-frame sets, each containing two training frames and one control frame. The control frame is chosen as the middle frame of the two training frames, or the frame closest to the middle. Usually, three consecutive frames are taken: the middle frame is the control frame and the other two frames are the training frames. If the frame rate is sufficiently high, two frames separated by several frames (depending on the frame rate; the gap cannot be too large) can also be taken as the training frames, and any one of the intermediate frames can be chosen as the control frame. For example, if the frame rate of the training video is 60 and the video has N frames, then, sampling in the interval-of-one-frame manner, a frame is taken at random from frame 2 to frame N-1 as the ground truth (the control frame), and its two adjacent frames are fed into the network as the training sample (the two training frames). Similarly, samples can also be built by spacing the training frames several frames apart; such samples suit videos of lower frame rates, i.e., converting a lower frame-rate video into a high frame-rate one.
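This triplet extraction can be summarized in a few lines of code. The following is a minimal sketch, assuming OpenCV; the function name and the gap parameter are illustrative and not from the patent (gap=1 reproduces the consecutive-three-frame case):

```python
# Minimal sketch of the triplet extraction described above, assuming OpenCV.
import cv2

def extract_triplets(video_path, gap=1):
    """Yield (frame_a, frame_b, control): frame_a/frame_b are the two training
    frames, 2*gap frames apart; control is the frame midway between them."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    for t in range(gap, len(frames) - gap):
        yield frames[t - gap], frames[t + gap], frames[t]
```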
(2) training a dual-channel convolutional neural network model with the multiple video-frame sets of the training sample set to obtain an optimized dual-channel convolutional neural network; the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels: the two channels respectively receive the two video frames of an input video-frame set and separately convolve them, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regressing the predicted frame against the control frame of the video-frame set;
First, a dual-channel convolutional neural network must be designed and implemented, specifically:
The dual-channel convolutional neural network model that is established is a convolutional neural network formed by fusing two convolutional channels. Each channel contains k convolutional layers (k > 0, preferably 3) and convolves one of the two input video-frame pictures (the training frames) independently. The first convolutional layer has 64 kernels of size 9*9 with stride 1 and padding 4, padding being the number of rings of zeros added around the feature map. The second convolutional layer has 32 kernels of size 1*1 with stride 1 and padding 0. The third convolutional layer has 3 kernels of size 5*5 with stride 1 and padding 2. The mathematical description of a convolutional layer is:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input image is layer 0), * denotes the convolution operation, F_{i-1} is the output of layer i-1, Z_i(Y) is the output of the convolution at layer i, W_i is the convolution kernel of layer i, and B_i is the bias of layer i;
Among the three convolutional layers, the first and the second are each followed by a ReLU activation layer to preserve the sparsity of the network:
F_i(Y) = max(0, Z_i(Y)).
The feature response maps of the two video-frame pictures obtained after the third convolutional layer are fused by element-wise addition;
After the fusion operation, the resulting feature response map is passed through a Sigmoid activation layer that maps the pixel values of the picture into [0, 1]:
F(Y) = 1 / (1 + e^(-Z(Y))).
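For concreteness, the following is a minimal PyTorch sketch of the dual-channel architecture just described; class and function names are illustrative and not from the patent:

```python
# Minimal PyTorch sketch of the dual-channel network described above.
import torch
import torch.nn as nn

def make_channel():
    # One convolutional channel: conv(64@9x9, pad 4) -> ReLU ->
    # conv(32@1x1, pad 0) -> ReLU -> conv(3@5x5, pad 2).
    return nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=9, stride=1, padding=4),
        nn.ReLU(),
        nn.Conv2d(64, 32, kernel_size=1, stride=1, padding=0),
        nn.ReLU(),
        nn.Conv2d(32, 3, kernel_size=5, stride=1, padding=2),
    )

class DualChannelCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.channel_a = make_channel()  # convolves the earlier training frame
        self.channel_b = make_channel()  # convolves the later training frame
        self.sigmoid = nn.Sigmoid()      # maps the fused response into [0, 1]

    def forward(self, frame_a, frame_b):
        # Fuse the two feature responses by element-wise addition, then Sigmoid.
        return self.sigmoid(self.channel_a(frame_a) + self.channel_b(frame_b))
```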
Before training the dual-channel convolutional neural network, every pixel value of the video frames must be divided by 255 for normalization, so that the normalized pixel values lie between 0 and 1;
Also, before training the dual-channel convolutional neural network, the parameters of the network must be initialized: convolution kernels are initialized from a Gaussian distribution with mean 0 and standard deviation 1, biases are initialized to 0, and the base learning rate is initialized to 1e-6; the base learning rate is reduced by a factor of 10 after m iteration epochs, where m is a preset value. For example, if m is 2, the learning rate is 1e-6 during the first m iteration epochs; after m epochs, the learning rate becomes 1e-7 and then stays constant.
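Continuing the sketch above, the initialization and learning-rate schedule might look as follows in PyTorch (the single tenfold reduction after m epochs is expressed with MultiStepLR; all names are illustrative):

```python
# Sketch of the initialization scheme above, continuing the DualChannelCNN
# sketch: Gaussian(0, 1) kernels, zero biases, base learning rate 1e-6
# reduced tenfold once after m epochs.
def init_weights(module):
    if isinstance(module, nn.Conv2d):
        nn.init.normal_(module.weight, mean=0.0, std=1.0)
        nn.init.zeros_(module.bias)

model = DualChannelCNN()
model.apply(init_weights)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-6)
m = 2  # preset value; the patent's example uses m = 2
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[m], gamma=0.1)
```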
Specifically, the error between the network's prediction and the control frame can be used to train the dual-channel convolutional neural network with the error backpropagation algorithm, taking the least-squares error as the optimization objective:
L = (1/n) * Σ_{i=1}^{n} ||Y_i - Ŷ_i||^2
where i denotes the i-th sample picture, n is the size of the training sample set, Y_i is the video frame predicted by the network, and Ŷ_i is the ground truth of the corresponding video frame;
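Under the same assumptions, a minimal training step pairing the least-squares error with error backpropagation might look like this (input tensors are assumed to be already divided by 255, so pixel values lie in [0, 1]):

```python
# Minimal training-step sketch: least-squares error between the predicted
# frame and the control frame, optimized by error backpropagation.
criterion = nn.MSELoss()

def train_step(frame_a, frame_b, control):
    optimizer.zero_grad()
    prediction = model(frame_a, frame_b)   # predicted middle frame
    loss = criterion(prediction, control)  # least-squares error vs. control frame
    loss.backward()                        # backpropagate the error
    optimizer.step()
    return loss.item()

# After each epoch, scheduler.step() applies the learning-rate reduction.
```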
(3) using the optimized dual-channel convolutional neural network to generate, from any two adjacent video frames of a low frame-rate video, an interpolated frame between those two frames, thereby generating a video whose frame rate is higher than that of the low frame-rate video.
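Under the same assumptions, a minimal inference sketch that inserts one predicted frame between every pair of adjacent frames, roughly doubling the frame rate (conversion of frames to and from [0, 1] tensors is elided; names are illustrative):

```python
# Minimal inference sketch: insert one predicted frame between every pair of
# adjacent frames of a low frame-rate video.
@torch.no_grad()
def upconvert(frames):
    """frames: list of [1, 3, H, W] tensors in [0, 1]; returns ~2x as many."""
    output = []
    for a, b in zip(frames[:-1], frames[1:]):
        output.append(a)
        output.append(model(a, b))  # interpolated frame between a and b
    output.append(frames[-1])
    return output
```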
As will be readily appreciated by those skilled in the art, the foregoing are merely preferred embodiments of the present invention and are not intended to limit it; any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (3)

1. A high frame-rate video generation method based on deep learning, characterized in that the method comprises the following steps:
(1) generating a training sample set from one or more original high frame-rate video segments, the training sample set comprising multiple video-frame sets, each video-frame set containing two training frames and one control frame, the two training frames being two video frames of the high frame-rate video segment separated by one or more frames, and the control frame being any one of the frames lying between the two training frames; the frame rate of the high frame-rate video segment is higher than a set frame-rate threshold;
(2) training a dual-channel convolutional neural network model with the multiple video-frame sets of the training sample set to obtain an optimized dual-channel convolutional neural network; wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, the two convolutional channels respectively receive the two video frames of an input video-frame set and separately convolve them, the dual-channel convolutional neural network model fuses the convolution results of the two convolutional channels and outputs a predicted frame, and the model is trained by regressing the predicted frame against the control frame of the video-frame set; wherein,
Each convolutional channel in the dual-channel convolutional neural network model contains k convolutional layers, where k > 0, each convolutional layer being described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes the convolution operation, F_{i-1} is the output of layer i-1, Z_i(Y) is the output of the convolution at layer i, W_i is the convolution kernel of layer i, and B_i is the bias of layer i;
In each convolutional channel, each of the first k-1 convolutional layers is followed by a ReLU activation layer to preserve the sparsity of the network:
F_i(Y) = max(0, Z_i(Y));
Convolution kernels are initialized from a Gaussian distribution with mean 0 and standard deviation 1, biases are initialized to 0, and the base learning rate is initialized to 1e-6, the base learning rate being reduced by a factor of 10 after m iteration epochs, where m is a preset value;
The value of k is 3; the first convolutional layer has 64 kernels of size 9*9 with stride 1 and padding 4, padding being the number of rings of zeros added around the feature map; the second convolutional layer has 32 kernels of size 1*1 with stride 1 and padding 0; the third convolutional layer has 3 kernels of size 5*5 with stride 1 and padding 2;
The feature response map obtained from the fusion operation is passed through a Sigmoid activation layer that maps the pixel values of the picture into [0, 1]:
F(Y) = 1 / (1 + e^(-Z(Y)));
Training the dual-channel convolutional neural network model by regressing the predicted frame against the control frame of the video-frame set specifically comprises:
training the dual-channel convolutional neural network with the error backpropagation algorithm, using the error between the predicted frame and the control frame and taking the least-squares error as the optimization objective:
L = (1/n) * Σ_{i=1}^{n} ||Y_i - Ŷ_i||^2
where i denotes the i-th sample picture, n is the size of the training sample set, Y_i is the video frame predicted by the network, and Ŷ_i is the ground truth of the corresponding video frame;
(3) using the optimized dual-channel convolutional neural network to generate, from any two adjacent video frames of a low frame-rate video, an interpolated frame between those two frames, thereby generating a video whose frame rate is higher than that of the low frame-rate video.
2. The high frame-rate video generation method based on deep learning according to claim 1, characterized in that the feature response maps of the two video frames obtained after the last convolutional layer are fused by element-wise addition.
3. A high frame-rate video generation system based on deep learning, characterized by comprising a training sample set generation module, a dual-channel convolutional neural network optimization module and a high frame-rate video generation module, wherein:
The training sample set generation module is configured to generate a training sample set from one or more high frame-rate video segments, the training sample set comprising multiple video-frame sets, each video-frame set containing two training frames and one control frame, the two training frames being two video frames of the high frame-rate video segment separated by one or more frames, and the control frame being any one of the frames lying between the two training frames; the frame rate of the high frame-rate video segment is higher than a set frame-rate threshold;
The dual-channel convolutional neural network optimization module is configured to train a dual-channel convolutional neural network model with the multiple video-frame sets of the training sample set to obtain an optimized dual-channel convolutional neural network; wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two channels, the two channels respectively receive the two video frames of an input video-frame set and separately convolve them, the dual-channel convolutional neural network model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regressing the predicted frame against the control frame of the video-frame set;
The high frame-rate video generation module is configured to use the optimized dual-channel convolutional neural network to generate, from any two adjacent video frames of a low frame-rate video, an interpolated frame between those two frames, thereby generating a video whose frame rate is higher than that of the low frame-rate video;
Each convolutional channel in the dual-channel convolutional neural network model contains k convolutional layers, where k > 0, each convolutional layer being described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes the convolution operation, F_{i-1} is the output of layer i-1, Z_i(Y) is the output of the convolution at layer i, W_i is the convolution kernel of layer i, and B_i is the bias of layer i.
CN201611241691.XA 2016-12-29 2016-12-29 High frame-rate video generation method and system based on deep learning Active CN106686472B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611241691.XA CN106686472B (en) 2016-12-29 2016-12-29 High frame-rate video generation method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611241691.XA CN106686472B (en) 2016-12-29 2016-12-29 High frame-rate video generation method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN106686472A CN106686472A (en) 2017-05-17
CN106686472B 2019-04-26

Family

ID=58872327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611241691.XA Active CN106686472B (en) 2016-12-29 2016-12-29 High frame-rate video generation method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN106686472B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9756375B2 (en) 2015-01-22 2017-09-05 Microsoft Technology Licensing, Llc Predictive server-side rendering of scenes
CN107481209B (en) * 2017-08-21 2020-04-21 北京航空航天大学 Image or video quality enhancement method based on convolutional neural network
CN107613299A (en) * 2017-09-29 2018-01-19 杭州电子科技大学 A method for improving frame-rate conversion effect using a generative network
CN107886081B (en) * 2017-11-23 2021-02-02 武汉理工大学 Intelligent graded recognition method for dangerous underground mine behavior based on a dual-path U-Net deep neural network
CN108111860B (en) * 2018-01-11 2020-04-14 安徽优思天成智能科技有限公司 Video sequence lost frame prediction recovery method based on depth residual error network
CN108322685B (en) * 2018-01-12 2020-09-25 广州华多网络科技有限公司 Video frame insertion method, storage medium and terminal
CN108600655A (en) * 2018-04-12 2018-09-28 视缘(上海)智能科技有限公司 A video image synthesis method and device
CN108600762B (en) * 2018-04-23 2020-05-15 中国科学技术大学 Progressive video frame generation method combining motion compensation and neural network algorithm
CN108830812B (en) * 2018-06-12 2021-08-31 福建帝视信息科技有限公司 Video high frame rate reproduction method based on grid structure deep learning
CN108810551B (en) * 2018-06-20 2021-01-12 Oppo(重庆)智能科技有限公司 Video frame prediction method, terminal and computer storage medium
CN108961236B (en) * 2018-06-29 2021-02-26 国信优易数据股份有限公司 Circuit board defect detection method and device
CN110780664A (en) * 2018-07-25 2020-02-11 格力电器(武汉)有限公司 Robot control method and device and sweeping robot
CN109379550B (en) * 2018-09-12 2020-04-17 上海交通大学 Convolutional neural network-based video frame rate up-conversion method and system
CN109068174B (en) * 2018-09-12 2019-12-27 上海交通大学 Video frame rate up-conversion method and system based on cyclic convolution neural network
CN109120935A (en) * 2018-09-27 2019-01-01 贺禄元 A video image coding method and device
US10924525B2 (en) 2018-10-01 2021-02-16 Microsoft Technology Licensing, Llc Inducing higher input latency in multiplayer programs
CN109360436B (en) * 2018-11-02 2021-01-08 Oppo广东移动通信有限公司 Video generation method, terminal and storage medium
CN110163061B (en) * 2018-11-14 2023-04-07 腾讯科技(深圳)有限公司 Method, apparatus, device and computer readable medium for extracting video fingerprint
CN111371983A (en) * 2018-12-26 2020-07-03 清华大学 Video online stabilization method and system
CN113766313B (en) * 2019-02-26 2024-03-05 深圳市商汤科技有限公司 Video data processing method and device, electronic equipment and storage medium
JP7201073B2 (en) * 2019-04-01 2023-01-10 株式会社デンソー Information processing equipment
CN110636221A (en) * 2019-09-23 2019-12-31 天津天地人和企业管理咨询有限公司 System and method for super frame rate of sensor based on FPGA
CN112584158B (en) * 2019-09-30 2021-10-15 复旦大学 Video quality enhancement method and system
WO2021104381A1 (en) * 2019-11-27 2021-06-03 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and device for stylizing video and storage medium
CN113630621B (en) * 2020-05-08 2022-07-19 腾讯科技(深圳)有限公司 Video processing method, related device and storage medium
RU2747965C1 (en) * 2020-10-05 2021-05-18 Самсунг Электроникс Ко., Лтд. Frc occlusion processing with deep learning
US11889227B2 (en) 2020-10-05 2024-01-30 Samsung Electronics Co., Ltd. Occlusion processing for frame rate conversion using deep learning
CN113516050A (en) * 2021-05-19 2021-10-19 江苏奥易克斯汽车电子科技股份有限公司 Scene change detection method and device based on deep learning
CN113420771B (en) * 2021-06-30 2024-04-19 扬州明晟新能源科技有限公司 Colored glass detection method based on feature fusion

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202285412U (en) * 2011-09-02 2012-06-27 深圳市华美特科技有限公司 Low frame rate transmission or motion image twinkling elimination system
CN104102919A (en) * 2014-07-14 2014-10-15 同济大学 Image classification method capable of effectively preventing convolutional neural network from being overfit
CN105787510A (en) * 2016-02-26 2016-07-20 华东理工大学 System and method for realizing subway scene classification based on deep learning
CN106022237A (en) * 2016-05-13 2016-10-12 电子科技大学 Pedestrian detection method based on end-to-end convolutional neural network
CN106228124A (en) * 2016-07-17 2016-12-14 西安电子科技大学 SAR image object detection method based on convolutional neural networks

Also Published As

Publication number Publication date
CN106686472A (en) 2017-05-17

Similar Documents

Publication Publication Date Title
CN106686472B (en) High frame-rate video generation method and system based on deep learning
CN109064507B (en) Multi-motion-stream deep convolution network model method for video prediction
Zhou et al. Photorealistic facial expression synthesis by the conditional difference adversarial autoencoder
Feng et al. SGANVO: Unsupervised deep visual odometry and depth estimation with stacked generative adversarial networks
CN112669325B (en) Video semantic segmentation method based on active learning
CN108805083A (en) Single-stage video behavior detection method
CN105095862B (en) A human motion recognition method based on deep convolutional conditional random fields
WO2022252272A1 (en) Transfer learning-based method for improved vgg16 network pig identity recognition
CN108830913B (en) Semantic level line draft coloring method based on user color guidance
CN111489372B (en) Video foreground and background separation method based on cascade convolution neural network
CN111444878A (en) Video classification method and device and computer readable storage medium
CN107590432A (en) A gesture recognition method based on recurrent three-dimensional convolutional neural networks
CN106952271A (en) An image segmentation method based on superpixel segmentation and EM/MPM processing
CN110443784B (en) Effective saliency prediction model method
CN108647599B (en) Human behavior recognition method combining 3D skip-layer connections and recurrent neural networks
CN111241963B (en) First person view video interactive behavior identification method based on interactive modeling
CN113807318B (en) Action recognition method based on double-flow convolutional neural network and bidirectional GRU
CN109345446A (en) An image style conversion algorithm based on dual learning
Simonyan et al. Two-stream convolutional networks for action recognition
CN105976379A (en) Fuzzy clustering color image segmentation method based on cuckoo optimization
Tan et al. Bidirectional long short-term memory with temporal dense sampling for human action recognition
CN109583334A (en) An action recognition method and system based on spatiotemporal correlation neural networks
Desai et al. Next frame prediction using ConvLSTM
CN116012950A (en) Skeleton action recognition method based on multi-heart space-time attention pattern convolution network
CN112365428B (en) DQN-based highway monitoring video defogging method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant