CN112884636B - Style migration method for automatically generating stylized video - Google Patents
Style migration method for automatically generating stylized video
- Publication number
- CN112884636B CN112884636B CN202110117964.4A CN202110117964A CN112884636B CN 112884636 B CN112884636 B CN 112884636B CN 202110117964 A CN202110117964 A CN 202110117964A CN 112884636 B CN112884636 B CN 112884636B
- Authority
- CN
- China
- Prior art keywords
- encoder
- migration
- style
- video
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000013508 migration Methods 0.000 title claims abstract description 80
- 230000005012 migration Effects 0.000 title claims abstract description 80
- 238000000034 method Methods 0.000 title claims abstract description 48
- 230000008569 process Effects 0.000 claims abstract description 17
- 238000013140 knowledge distillation Methods 0.000 claims abstract description 14
- 230000004927 fusion Effects 0.000 claims abstract description 6
- 238000007906 compression Methods 0.000 claims abstract description 5
- 239000011159 matrix material Substances 0.000 claims description 18
- 238000010586 diagram Methods 0.000 claims description 9
- 238000006243 chemical reaction Methods 0.000 claims description 6
- 238000004821 distillation Methods 0.000 claims description 4
- 238000005457 optimization Methods 0.000 claims description 3
- 230000006870 function Effects 0.000 description 9
- 230000003287 optical effect Effects 0.000 description 6
- 230000000694 effects Effects 0.000 description 4
- 238000012360 testing method Methods 0.000 description 4
- 238000012549 training Methods 0.000 description 4
- 230000009466 transformation Effects 0.000 description 3
- 230000009286 beneficial effect Effects 0.000 description 2
- 230000006835 compression Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000004913 activation Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 238000004883 computer application Methods 0.000 description 1
- 238000005034 decoration Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention discloses a style migration method for automatically generating stylized video. The method constructs a highly compressed self-encoder model based on knowledge distillation and a style migration model built around a semantically aligned feature migration module for automatically generating stylized video. The self-encoder is divided into an encoder and a decoder: the encoder encodes the original video content frames and the style image into feature maps; the feature migration module fuses the encoded content features and style features on the basis of semantic alignment to obtain semantically aligned fused migration features; and the decoder finally decodes these features into stylized video frames. The method guarantees the stability of the migrated video, can stylize any video in any style, runs the style migration process at very high speed, and is therefore highly practical.
Description
Technical Field
The invention belongs to the field of computer application, and particularly relates to a style migration method for automatically generating stylized videos.
Background
With the development and popularization of the internet and the mobile internet, more and more short-video platforms have emerged, and people's artistic demands on video have grown with them; yet having such content created by professional artists or professional editors is inconvenient and costly. Automatically generating video of any artistic style from ordinary video by computer technology is therefore attracting attention and favor.
Given a content map and a target style map, the purpose of style migration is to produce a stylized image that has both the structure of the content map and the texture of the style map. Style migration based on a single image has been studied extensively, and the field of video style migration is now attracting a great deal of attention because of its very wide application prospects (including artistic conversion of short videos); clearly, style migration of video is both more practical and more challenging than style migration of a single image.
Compared with traditional image style migration, video style migration is more difficult in that stylization quality, stability, and computational efficiency must all be considered simultaneously. Currently existing video style migration methods can be broadly divided into two categories depending on whether or not optical flow is used.
The first category uses optical flow: a temporal-consistency loss is proposed to achieve stability between adjacent frames through the supervised constraint of optical flow. It includes optimization-based optical-flow-constraint methods which, while producing stable migrated videos, need nearly three minutes to migrate the style of each video frame; this extremely slow migration rate is unacceptable. Video style migration methods based on feed-forward networks were proposed later, but because optical-flow constraints are still used in both the training and testing stages, real-time performance cannot be achieved in video migration tasks. To solve this problem, some methods use optical flow only during the training phase and avoid it during testing; although they are faster than the methods that also use optical flow at test time, their final migration effect is very unstable.
The second category does not use optical flow. For example, LST realizes a feature affine transformation so that stable stylized video can be obtained. Later work proposed combining an Avatar-Net-based decorator module with a component normalization method to guarantee video stability. However, the existing methods in this category all use the original VGG network to encode content and style features, and the VGG network is very bulky: a very large memory space is required to store the VGG model, which greatly limits their application on small terminal devices.
Disclosure of Invention
The invention aims to: the invention provides a style migration method for automatically generating stylized videos, which can realize real-time stable arbitrary video style migration.
The technical scheme is as follows: the invention provides a style migration method for automatically generating stylized video, which specifically comprises the following steps:
(1) Constructing a video style migration network model, wherein the model comprises a high-compression self-encoder module based on knowledge distillation and a feature migration module based on semantic alignment; the self-encoder module comprises a lightweight encoder and a lightweight decoder;
(2) The encoder encodes content video frames and style images: knowledge distillation is performed on a lightweight encoder based on a VGG network, enabling the encoder to learn the encoding capability of the teacher network's VGG encoder while being lightweight enough, and the original video content frames and the style image are encoded into feature maps;
(3) Feature migration module based on semantic alignment: fusing the content characteristics and style characteristics obtained by encoding the encoder to obtain fusion migration characteristics based on semantic alignment;
(4) Knowledge distillation is carried out on the lightweight decoder based on the VGG network: the decoder learns the decoding capability of the teacher network's VGG decoder while being light enough; the decoder decodes the fused migration features to obtain stylized video frames, and finally the video is synthesized.
Further, the implementation of step (2) requires optimizing the following loss function:

$$\mathcal{L}_{E} = \big\|I' - I\big\|_2^2 + \lambda \sum_k \big\|\tilde{E}_k(I) - E_k(I)\big\|_2^2 + \gamma \sum_k \big\|E_k(I') - E_k(I)\big\|_2^2$$

wherein I is the original image, the encoder of the VGG network is E, the lightweight encoder is Ẽ, I' is the reconstructed picture, E_k(I) is the feature map output by the k-th layer of the original VGG encoder, Ẽ_k(I) is the feature map output by the k-th layer of the lightweight encoder, and λ and γ are hyper-parameters.
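The loss above can be sketched in PyTorch. This is a hedged reading of the patent's terms: a pixel reconstruction loss, a λ-weighted feature-distillation term per layer, and a γ-weighted perceptual term comparing teacher features of the reconstruction to those of the original (the exact role of γ is an assumption; the patent only names the symbols I, I', E_k, Ẽ_k, λ, γ).

```python
import torch
import torch.nn.functional as F

def encoder_distill_loss(I, I_rec, feats_teacher, feats_student,
                         feats_teacher_rec, lam=1.0, gamma=1.0):
    # Pixel-level reconstruction of the original image by the light autoencoder
    loss = F.mse_loss(I_rec, I)
    # Feature distillation: match the student's k-th layer output to the teacher's
    for t, s in zip(feats_teacher, feats_student):
        loss = loss + lam * F.mse_loss(s, t)
    # Assumed perceptual term: teacher features of I' vs. teacher features of I
    for t, tr in zip(feats_teacher, feats_teacher_rec):
        loss = loss + gamma * F.mse_loss(tr, t)
    return loss
```

With identical inputs every term vanishes, which is a quick sanity check that each term is a pure matching loss.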
Further, the implementation process of the step (3) is as follows:
the characteristic diagram of the output of the content image obtained by the encoder is F c ∈R Cx(WxH) The output of the style image is F s ∈R Cx(WxH) Wherein C is the number of channels of the feature map, and W and H are the width and height of the feature map respectively; feature migration module based on semantic alignment aims at finding a feature migration which converts content graphs of different video frames to enable semantic alignment, and the conversion process is assumed to be parameterized into a projection matrix P epsilon R CxC The optimized objective function is:
wherein ,representing the slave F c An operation of selecting an ith position feature vector, A ij Representation-> and />K neighbor matrices of (a);
solving P is as follows:
wherein A is an affine matrix as defined above, U is a diagonal matrix, and is a matrix with feature alignment function,projection matrix P is formalized as p=g (F) c )f(F s ) T ) In the linear conversion process, g (x) =mx and f (x) =xt T The method comprises the steps of carrying out a first treatment on the surface of the The f (x) procedure chooses to fit with three convolutional layers, and the g () procedure uses a full join layer fit.
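As a concrete reference, the closed form can be computed directly. The sketch below assumes the objective min_P Σ_ij A_ij‖P F_c^(i) − F_s^(j)‖² with a binary k-NN affinity A and degree matrix U (both assumptions reconstructed from the definitions above; a small ridge term is added for invertibility):

```python
import numpy as np

def solve_projection(Fc, Fs, k=3, eps=1e-6):
    """Fc: C x N content features, Fs: C x M style features (columns = positions)."""
    C = Fc.shape[0]
    # Squared distances between every content column and every style column
    d = ((Fc[:, :, None] - Fs[:, None, :]) ** 2).sum(axis=0)   # N x M
    # Binary k-nearest-neighbour affinity A_ij
    A = np.zeros_like(d)
    for i, nn in enumerate(np.argsort(d, axis=1)[:, :k]):
        A[i, nn] = 1.0
    U = np.diag(A.sum(axis=1))                                 # diagonal degree matrix
    # Closed form from setting dL/dP = 0
    P = Fs @ A.T @ Fc.T @ np.linalg.inv(Fc @ U @ Fc.T + eps * np.eye(C))
    return P, A, U
```

The returned P satisfies the (regularized) normal equation P(F_c U F_c^T + εI) = F_s A^T F_c^T, which is exactly the stationarity condition of the objective.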
Further, the implementation of step (4) requires optimizing the following loss function:

$$\mathcal{L}_{D} = \big\|I' - I\big\|_2^2 + \lambda \sum_k \big\|E_k(I') - E_k(I)\big\|_2^2$$

wherein I is the original image, E_k(I) is the feature map output by the k-th layer of the original VGG encoder, Ẽ_k(I) is the feature map output by the k-th layer of the lightweight encoder, I' is the reconstructed picture obtained by decoding Ẽ(I) with the lightweight decoder D̃, and λ is a hyper-parameter. The distillation process aims to let D̃ preserve the information of the original E while reconstructing the image well from the output features of Ẽ.
Beneficial effects: compared with the prior art, the invention has the following beneficial effects: 1. temporal (sequence) consistency between adjacent frames is taken into account while achieving high stylization quality of the video frames, so the stability of the migrated video is guaranteed; 2. the stylization is richly diverse, and any video can be stylized in any style; 3. the style migration process is real-time, i.e. very fast, and the whole model is kept lightweight to ensure high practicality.
Drawings
FIG. 1 is a flow chart of the invention;
FIG. 2 is a schematic diagram of a high compression self-encoder module based on knowledge distillation in accordance with the present invention;
FIG. 3 is a schematic diagram of a video style migration network constructed in accordance with the present invention;
FIG. 4 is an exemplary diagram of a video style migration effect of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
The invention provides a style migration method for automatically generating stylized video. The video style migration process goes through three stages: in the first stage, an encoder encodes the content video frames and the style map; in the second stage, feature style migration fusion is performed on the encoded content and style features; in the third stage, a decoder decodes the migrated and fused features to obtain the stylized video frames, and finally the video is synthesized. The model sizes of the encoder and decoder largely determine whether the model is lightweight, while the design of the feature migration part directly determines whether the stylized video obtained by migration is stable, whether real-time style migration can be completed, and whether the model can migrate arbitrary styles. As shown in fig. 1, the method specifically comprises the following steps:
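The three stages can be sketched end to end. The encoder, FTM module, and decoder below are stand-ins for the distilled lightweight networks described later; their names and interfaces are assumptions for illustration:

```python
import torch

def stylize_video(frames, style_image, encoder, ftm, decoder):
    """Three-stage pipeline: encode each content frame and the style image,
    fuse the features with semantic alignment, decode to stylized frames."""
    with torch.no_grad():
        Fs = encoder(style_image)              # style features, computed once per style
        stylized = []
        for frame in frames:
            Fc = encoder(frame)                # stage 1: encode the content frame
            Fcs = ftm(Fc, Fs)                  # stage 2: feature style migration fusion
            stylized.append(decoder(Fcs))      # stage 3: decode to a stylized frame
    return stylized                            # frames are then assembled into a video
```

Because the style features are encoded once and reused for every frame, the per-frame cost is one encoder pass, one FTM fusion, and one decoder pass.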
step 1: a video style migration network model is constructed, as shown in fig. 3, comprising a knowledge distillation based high compression self-encoder module and a semantic alignment based feature migration module.
The self-encoder is divided into an encoder and a decoder, the encoder can encode an original video content frame and a style image into a feature map, the feature migration module can fuse the content features and the style features obtained by the encoding of the encoder based on semantic alignment, finally, fusion migration features based on the semantic alignment are obtained, and finally, the stylized video frame is obtained through the decoder.
The feature style migration module (FTM) based on semantic alignment ensures stability between adjacent frames during video style migration; the video style migration model is only 2.67 MB in size, and video style migration can run at up to 166.67 fps.
Step 2: encoder encoding of content video frames and style sheets: and performing knowledge distillation on the lightweight encoder based on the VGG network, so that the encoder learns the encoding capability of the VGG encoder of the teacher network while being lightweight enough, and encoding the original video content frames and the style images into feature maps.
As shown in fig. 2, the lightweight encoder and decoder network architecture specifically includes: four symmetric sets of down-sampling (encoder) and up-sampling (decoder) convolutional layers, max-pooling layers, and ReLU activation functions; the lightweight encoder network performs feature encoding of input video frames as well as arbitrary style images. The VGG network is a network structure widely used in style migration, and the lightweight encoder network is a student network obtained by knowledge distillation based on the VGG teacher network, so that the encoding process of images can be realized with as few parameters as possible. As shown in the encoder part (Ẽ) of the network architecture in fig. 2, the encoder network needs to be lightweight enough while learning the encoding capability of the teacher network's VGG encoder; the loss function to be optimized is:

$$\mathcal{L}_{E} = \big\|I' - I\big\|_2^2 + \lambda \sum_k \big\|\tilde{E}_k(I) - E_k(I)\big\|_2^2 + \gamma \sum_k \big\|E_k(I') - E_k(I)\big\|_2^2$$

wherein I is the original image, the encoder of the original VGG network is E, the lightweight encoder is Ẽ, I' is the reconstructed picture obtained by the decoder, E_k(I) is the feature map output by the k-th layer of the original VGG encoder, Ẽ_k(I) is the feature map output by the k-th layer of the lightweight encoder, and λ and γ are hyper-parameters.
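A minimal sketch of the lightweight encoder follows. The patent specifies four symmetric convolutional blocks with max pooling and ReLU; the channel widths and kernel size here are assumptions:

```python
import torch
import torch.nn as nn

class LightEncoder(nn.Module):
    """Four conv blocks: 3x3 convolution -> ReLU -> 2x max-pool downsampling."""
    def __init__(self, widths=(3, 16, 32, 64, 128)):
        super().__init__()
        layers = []
        for cin, cout in zip(widths[:-1], widths[1:]):
            layers += [nn.Conv2d(cin, cout, 3, padding=1),  # 3x3 convolution
                       nn.ReLU(inplace=True),               # ReLU activation
                       nn.MaxPool2d(2)]                     # down-sample by 2
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```

The decoder would mirror this structure with up-sampling in place of max pooling, per the "symmetrical" description.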
Step 3: the light-weight decoder carries out knowledge distillation based on the VGG network, so that the decoder can learn the decoding capability of the VGG decoder of the teacher network while being light enough.
As in the network architecture of figure 2As shown in part, for a lightweight decoder network that performs feature decoding on migrated features, knowledge distillation is performed using a VGG network as a teacher network, and it is necessary to enable the decoder network to learn the decoding capability of the VGG decoder of the teacher network while being lightweight enough, where the loss function to be optimized is as follows:
wherein Is implemented by a lightweight decoder->The reconstructed picture obtained by decoding, the goal of the above distillation procedure is to let +.>While the information of the original E can be preserved, < >>Can well contain->And reconstructing the image by the obtained output characteristic information.
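The decoder distillation loss can be sketched analogously to the encoder's. This hedged reading uses a pixel reconstruction term plus λ-weighted matching of the teacher VGG features of the reconstruction against those of the original (the exact terms are an assumption; the patent names I, I', E_k, Ẽ_k, and λ):

```python
import torch
import torch.nn.functional as F

def decoder_distill_loss(I, I_rec, feats_teacher_I, feats_teacher_rec, lam=1.0):
    # Reconstruction of I by the light decoder from the light encoder's features
    loss = F.mse_loss(I_rec, I)
    # Keep the teacher-VGG features of I' close to those of I, so the decoded
    # image still carries the information that E extracts
    for t, r in zip(feats_teacher_I, feats_teacher_rec):
        loss = loss + lam * F.mse_loss(r, t)
    return loss
```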
Step 4: and the feature migration module based on semantic alignment fuses the content features and style features obtained by encoding of the encoder to obtain fusion migration features based on semantic alignment.
The feature migration module based on semantic alignment is the key to realizing real-time, stable video style migration: feature semantic alignment must be performed while style feature migration is completed efficiently. To achieve this, the idea of manifold alignment is employed. Assume the feature map output by the encoder for the content image is F_c ∈ R^{C×(W×H)} and the output of the lightweight coding network for the style image is F_s ∈ R^{C×(W×H)}, where C is the number of feature map channels and W and H are respectively the width and height of the feature map. The designed FTM module outputs the semantically aligned migrated feature F_cs and passes it to the decoder to obtain the migrated result map. In practice, the goal of the FTM module design is to find a transformation that enables semantically aligned feature migration of the content maps of different video frames. Assuming the transformation process can be parameterized as a projection matrix P ∈ R^{C×C}, the optimized objective function is:

$$\min_P \sum_{i,j} A_{ij}\big\|P F_c^{(i)} - F_s^{(j)}\big\|_2^2$$

wherein F_c^{(i)} denotes the operation of selecting the i-th position feature vector from F_c, and A_{ij} is the k-nearest-neighbour affinity matrix between F_c^{(i)} and F_s^{(j)}. The objective thus makes the transformed content features similar to their k-nearest-neighbour features in the style feature space. During video style migration there may be moving objects and illumination changes, which can cause jitter after migration; but because the above transformation preserves the affinity structure, two adjacent frames remain consistent, thereby generating stable video style migration results.
Solving the above equation is in effect computing the closed-form solution for P, which can be found by taking the derivative with respect to P and setting it to 0:

$$P = F_s A^{\top} F_c^{\top}\big(F_c U F_c^{\top}\big)^{-1}$$

wherein A is the affinity matrix defined above and U is a diagonal matrix with U_{ii} = Σ_j A_{ij}. Since U is diagonal and positive semi-definite, it can be decomposed as U = T^{\top}T, so the projection matrix P can be formed as P = g(F_c) f(F_s)^{\top}, where in the above linear conversion process g(x) = Mx and f(x) = xT^{\top}. Even though P can be solved in closed form, the process of matrix inversion is very time-consuming, so an FTM network module is designed to fit the above solving process: the f(·) process is fitted with three convolutional layers, and the g(·) process is fitted with a fully connected layer.
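A sketch of such an FTM network module follows: f(·) is approximated by three convolutional layers (1×1 here), g(·) by a single fully connected layer, and the fitted projection acts as P = g(F_c)f(F_s)^⊤. Channel width, kernel size, and the normalization by HW are assumptions:

```python
import torch
import torch.nn as nn

class FTM(nn.Module):
    def __init__(self, c=64):
        super().__init__()
        # f(.) fitted with three convolutional layers
        self.f = nn.Sequential(nn.Conv2d(c, c, 1), nn.ReLU(inplace=True),
                               nn.Conv2d(c, c, 1), nn.ReLU(inplace=True),
                               nn.Conv2d(c, c, 1))
        # g(.) fitted with a fully connected layer: g(x) = Mx per feature vector
        self.g = nn.Linear(c, c, bias=False)

    def forward(self, Fc, Fs):
        b, c, h, w = Fc.shape
        gc = self.g(Fc.flatten(2).transpose(1, 2)).transpose(1, 2)  # B x C x HW
        fs = self.f(Fs).flatten(2)                                   # B x C x HW
        P = gc @ fs.transpose(1, 2) / fs.shape[-1]                   # B x C x C projection
        Fcs = (P @ Fc.flatten(2)).view(b, c, h, w)                   # project content feats
        return Fcs
```

Since only matrix products are involved (no inversion), this module avoids the costly closed-form solve at test time, which is the stated motivation for fitting it.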
The content images used for training the self-encoder are preprocessed by uniformly resizing them to 256×256 pixels. Each content image is fed into both the student self-encoder network and the teacher self-encoder network; the student self-encoder comprises an encoding part, which encodes the image, and a decoding part, which reconstructs the input image from the feature codes produced by the encoder. Through the feature perception loss and the reconstruction loss, as shown in fig. 2, the knowledge-distillation-based training ensures that the distilled lightweight self-encoder network retains the capabilities of multi-level feature extraction and feature-based image reconstruction. The content image and the style image are then fed into the style migration network with the semantic-alignment feature migration module added, as shown in fig. 3; the middle feature migration module is trained (with the distilled lightweight self-encoder kept fixed) based on the designed content loss Lc and style loss Ls.
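The only preprocessing the text specifies is the resize to 256×256; a minimal sketch, assuming a B×C×H×W tensor input and bilinear interpolation (the interpolation mode is an assumption):

```python
import torch
import torch.nn.functional as F

def preprocess(img):
    # Uniformly resize any B x C x H x W image tensor to the 256x256 training size
    return F.interpolate(img, size=(256, 256), mode='bilinear', align_corners=False)
```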
In the test stage, the video frames and the selected style image are fed directly into the trained lightweight style migration model, which automatically and efficiently outputs the stylized results; finally, a stable stylized video is synthesized in real time. Fig. 4 shows style migration results sampled every 10 frames of one video: for both foreground and background, semantically aligned style migration produces stable video frame results.
Claims (3)
1. A style migration method for automatically generating stylized video is characterized by comprising the following steps:
(1) Constructing a video style migration network model, wherein the model comprises a high-compression self-encoder module based on knowledge distillation and a feature migration module based on semantic alignment; the self-encoder module comprises a lightweight encoder and a lightweight decoder;
(2) The encoder encodes content video frames and style images: knowledge distillation is performed on a lightweight encoder based on a VGG network, enabling the encoder to learn the encoding capability of the teacher network's VGG encoder while being lightweight enough, and the original video content frames and the style image are encoded into feature maps;
(3) Feature migration module based on semantic alignment: fusing the content characteristics and style characteristics obtained by encoding the encoder to obtain fusion migration characteristics based on semantic alignment;
(4) Knowledge distillation is carried out on the lightweight decoder based on the VGG network: the decoder learns the decoding capability of the teacher network's VGG decoder while being light enough; the decoder decodes the fused migration features to obtain stylized video frames, and finally the video is synthesized;
the implementation process of the step (3) is as follows:
the feature map output by the encoder for the content image is F_c ∈ R^{C×(W×H)} and for the style image F_s ∈ R^{C×(W×H)}, wherein C is the number of channels of the feature map and W and H are respectively its width and height; the feature migration module based on semantic alignment aims at finding a transformation of the content features of different video frames that achieves semantic alignment; assuming the transformation process is parameterized as a projection matrix P ∈ R^{C×C}, the optimized objective function is:

$$\min_P \sum_{i,j} A_{ij}\big\|P F_c^{(i)} - F_s^{(j)}\big\|_2^2$$

wherein F_c^{(i)} denotes the operation of selecting the i-th position feature vector from F_c, and A_{ij} is the k-nearest-neighbour affinity matrix between F_c^{(i)} and F_s^{(j)};
solving P is as follows: taking the derivative with respect to P and setting it to 0 yields the closed-form solution

$$P = F_s A^{\top} F_c^{\top}\big(F_c U F_c^{\top}\big)^{-1}$$

wherein A is the affinity matrix defined above and U is a diagonal matrix with a feature-alignment function, U_{ii} = Σ_j A_{ij}; decomposing U = T^{\top}T, the projection matrix P is formalized as P = g(F_c) f(F_s)^{\top}, where in the linear conversion process g(x) = Mx and f(x) = xT^{\top}; the f(·) process is fitted with three convolutional layers, and the g(·) process is fitted with a fully connected layer.
2. The style migration method for automatically generating stylized video of claim 1, wherein the implementation of step (2) requires optimizing the following loss function:

$$\mathcal{L}_{E} = \big\|I' - I\big\|_2^2 + \lambda \sum_k \big\|\tilde{E}_k(I) - E_k(I)\big\|_2^2 + \gamma \sum_k \big\|E_k(I') - E_k(I)\big\|_2^2$$

wherein I is the original image, the encoder of the VGG network is E, the lightweight encoder is Ẽ, I' is the reconstructed picture, E_k(I) is the feature map output by the k-th layer of the original VGG encoder, Ẽ_k(I) is the feature map output by the k-th layer of the lightweight encoder, and λ and γ are hyper-parameters.
3. The style migration method for automatically generating stylized video of claim 1, wherein the implementation of step (4) requires optimizing the following loss function:

$$\mathcal{L}_{D} = \big\|I' - I\big\|_2^2 + \lambda \sum_k \big\|E_k(I') - E_k(I)\big\|_2^2$$

wherein I is the original image, E_k(I) is the feature map output by the k-th layer of the original VGG encoder, Ẽ_k(I) is the feature map output by the k-th layer of the lightweight encoder, I' is the reconstructed picture obtained by decoding Ẽ(I) with the lightweight decoder D̃, and λ is a hyper-parameter; the distillation process aims to let D̃ preserve the information of the original E while reconstructing the image well from the output features of Ẽ.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110117964.4A CN112884636B (en) | 2021-01-28 | 2021-01-28 | Style migration method for automatically generating stylized video |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110117964.4A CN112884636B (en) | 2021-01-28 | 2021-01-28 | Style migration method for automatically generating stylized video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112884636A CN112884636A (en) | 2021-06-01 |
CN112884636B true CN112884636B (en) | 2023-09-26 |
Family
ID=76052976
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110117964.4A Active CN112884636B (en) | 2021-01-28 | 2021-01-28 | Style migration method for automatically generating stylized video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112884636B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113989102B (en) * | 2021-10-19 | 2023-01-06 | 复旦大学 | Rapid style migration method with high shape-preserving property |
CN114331827B (en) * | 2022-03-07 | 2022-06-07 | 深圳市其域创新科技有限公司 | Style migration method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110175951A (en) * | 2019-05-16 | 2019-08-27 | 西安电子科技大学 | Video Style Transfer method based on time domain consistency constraint |
CN110310221A (en) * | 2019-06-14 | 2019-10-08 | 大连理工大学 | A kind of multiple domain image Style Transfer method based on generation confrontation network |
CN110706151A (en) * | 2018-09-13 | 2020-01-17 | 南京大学 | Video-oriented non-uniform style migration method |
CN111325681A (en) * | 2020-01-20 | 2020-06-23 | 南京邮电大学 | Image style migration method combining meta-learning mechanism and feature fusion |
CN111932445A (en) * | 2020-07-27 | 2020-11-13 | 广州市百果园信息技术有限公司 | Compression method for style migration network and style migration method, device and system |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116823593A (en) * | 2016-10-21 | 2023-09-29 | 谷歌有限责任公司 | Stylized input image |
US10748324B2 (en) * | 2018-11-08 | 2020-08-18 | Adobe Inc. | Generating stylized-stroke images from source images utilizing style-transfer-neural networks with non-photorealistic-rendering |
US20200167658A1 (en) * | 2018-11-24 | 2020-05-28 | Jessica Du | System of Portable Real Time Neurofeedback Training |
-
2021
- 2021-01-28 CN CN202110117964.4A patent/CN112884636B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110706151A (en) * | 2018-09-13 | 2020-01-17 | 南京大学 | Video-oriented non-uniform style migration method |
CN110175951A (en) * | 2019-05-16 | 2019-08-27 | 西安电子科技大学 | Video Style Transfer method based on time domain consistency constraint |
CN110310221A (en) * | 2019-06-14 | 2019-10-08 | 大连理工大学 | A kind of multiple domain image Style Transfer method based on generation confrontation network |
CN111325681A (en) * | 2020-01-20 | 2020-06-23 | 南京邮电大学 | Image style migration method combining meta-learning mechanism and feature fusion |
CN111932445A (en) * | 2020-07-27 | 2020-11-13 | 广州市百果园信息技术有限公司 | Compression method for style migration network and style migration method, device and system |
Non-Patent Citations (1)
Title |
---|
A Survey of Deepfake Video Detection Techniques; Bao Yuxuan; Lu Tianliang; Du Yanhui; Computer Science (No. 9); 289-298 *
Also Published As
Publication number | Publication date |
---|---|
CN112884636A (en) | 2021-06-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112884636B (en) | Style migration method for automatically generating stylized video | |
CN110533044B (en) | Domain adaptive image semantic segmentation method based on GAN | |
CN111862294B (en) | Hand-painted 3D building automatic coloring network device and method based on ArcGAN network | |
CN110072119B (en) | Content-aware video self-adaptive transmission method based on deep learning network | |
CN107480206A (en) | A kind of picture material answering method based on multi-modal low-rank bilinearity pond | |
CN113034380A (en) | Video space-time super-resolution method and device based on improved deformable convolution correction | |
CN115829876A (en) | Real degraded image blind restoration method based on cross attention mechanism | |
CN108924528B (en) | Binocular stylized real-time rendering method based on deep learning | |
CN114841859A (en) | Single-image super-resolution reconstruction method based on lightweight neural network and Transformer | |
CN112381716A (en) | Image enhancement method based on generation type countermeasure network | |
CN112837212B (en) | Image arbitrary style migration method based on manifold alignment | |
CN113052764A (en) | Video sequence super-resolution reconstruction method based on residual connection | |
CN116962657B (en) | Color video generation method, device, electronic equipment and storage medium | |
Sun et al. | ESinGAN: Enhanced single-image GAN using pixel attention mechanism for image super-resolution | |
CN117237190A (en) | Lightweight image super-resolution reconstruction system and method for edge mobile equipment | |
CN116091319A (en) | Image super-resolution reconstruction method and system based on long-distance context dependence | |
CN113436094B (en) | Gray level image automatic coloring method based on multi-view attention mechanism | |
CN116109510A (en) | Face image restoration method based on structure and texture dual generation | |
CN113780209B (en) | Attention mechanism-based human face attribute editing method | |
CN113393377B (en) | Single-frame image super-resolution method based on video coding | |
Bai et al. | Itstyler: Image-optimized text-based style transfer | |
Wang et al. | Image quality enhancement using hybrid attention networks | |
Zhang et al. | Deep Learning Technology in Film and Television Post-Production | |
CN114513684B (en) | Method for constructing video image quality enhancement model, video image quality enhancement method and device | |
CN116823973B (en) | Black-white video coloring method, black-white video coloring device and computer readable medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||