CN109359687A - Video style conversion process method and device - Google Patents

Video style conversion process method and device

Info

Publication number
CN109359687A
CN109359687A
Authority
CN
China
Prior art keywords
video
style
target
samples pictures
conversion process
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811220100.XA
Other languages
Chinese (zh)
Other versions
CN109359687B (en)
Inventor
柏提
孙昊
刘霄
李鑫
赵翔
杨凡
李旭斌
文石磊
丁二锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201811220100.XA
Publication of CN109359687A
Application granted
Publication of CN109359687B
Active (current legal status)
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application proposes a video style conversion processing method and device. The method includes: setting, according to the style attribute information of a sample picture, a first target output vector of a network layer that reflects style attribute features; setting, according to the content information of the current input video frame, a second target output vector of a network layer that reflects content features; setting, according to the optical flow field information of the current input video frame, a third target output vector of a network layer that reflects optical flow field features; training the network parameters of each network layer in a target model according to the first, second, and third target output vectors; and generating a video style conversion model corresponding to the sample picture from the target model and the target network parameters obtained when a preset training condition is met, so that a target video can be converted according to the video style conversion model to generate a video style matching the sample picture. The efficiency of video style conversion processing is thereby improved while video fluency is ensured.

Description

Video style conversion process method and device
Technical field
This application relates to the technical field of video processing, and in particular to a video style conversion processing method and device.
Background art
With the continuous development of Internet technology, users' demands for rich media resources keep growing: content has evolved from classic text to images, and now to the extremely popular short videos. At the same time, users also want content to be artistically processed again to obtain more novel and ingenious artistic forms, such as artistic style conversion of video.
In the related art, artistic style conversion methods can only convert the style of a single picture. Performing style conversion on a video with such classic frame-by-frame, picture-based methods therefore inevitably takes a long time, since video content usually contains a large amount of data. Moreover, abrupt changes may occur between frames: the video content after style conversion does not share the same optical flow field as the content before conversion, which affects the fluency of the video.
Summary
The application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first objective of the application is to propose a video style conversion processing method that improves the efficiency of video style conversion processing while ensuring video fluency, by training and generating a video style conversion model.
A second objective of the application is to propose another video style conversion processing method.
A third objective of the application is to propose a video style conversion processing device.
A fourth objective of the application is to propose another video style conversion processing device.
A fifth objective of the application is to propose a computer device.
A sixth objective of the application is to propose a computer program product.
A seventh objective of the application is to propose a non-transitory computer-readable storage medium.
To achieve the above objectives, an embodiment of a first aspect of the application proposes a video style conversion processing method, comprising:
obtaining a sample picture and a corresponding sample video set for model training;
obtaining style attribute information of the sample picture, and then, during training of a target model, setting, according to the style attribute information, a first target output vector of a network layer in the target model that reflects style attribute features;
obtaining content information and optical flow field information of each video frame in the sample videos, and then, during training of the target model, setting, according to the content information of the current input video frame, a second target output vector of a network layer in the target model that reflects content features, and setting, according to the optical flow field information of the current input video frame, a third target output vector of a network layer in the target model that reflects optical flow field features; and
training the network parameters of each network layer in the target model according to the first target output vector, the second target output vector, and the third target output vector, and generating, from the target model and the target network parameters obtained when a preset training condition is met, a video style conversion model corresponding to the sample picture, so that a target video is converted according to the video style conversion model to generate a video style matching the sample picture.
To achieve the above objectives, an embodiment of a second aspect of the application proposes another video style conversion processing method, comprising:
obtaining a video style conversion request containing a target video and a target picture;
obtaining a pre-trained target video style conversion model corresponding to the target picture; and
converting the target video according to the target video style conversion model to generate a video style matching the target picture.
To achieve the above objectives, an embodiment of a third aspect of the application proposes a video style conversion processing device, comprising:
a first obtaining module, configured to obtain a sample picture and a corresponding sample video set for model training;
a first setting module, configured to obtain style attribute information of the sample picture, and then, during training of a target model, set, according to the style attribute information, a first target output vector of a network layer in the target model that reflects style attribute features;
a second setting module, configured to obtain content information and optical flow field information of each video frame in the sample videos, and then, during training of the target model, set, according to the content information of the current input video frame, a second target output vector of a network layer in the target model that reflects content features, and set, according to the optical flow field information of the current input video frame, a third target output vector of a network layer in the target model that reflects optical flow field features; and
a training generation module, configured to train the network parameters of each network layer in the target model according to the first target output vector, the second target output vector, and the third target output vector, and to generate, from the target model and the target network parameters obtained when a preset training condition is met, a video style conversion model corresponding to the sample picture, so that a target video is converted according to the video style conversion model to generate a video style matching the sample picture.
To achieve the above objectives, an embodiment of a fourth aspect of the application proposes another video style conversion processing device, comprising:
a second obtaining module, configured to obtain a video style conversion request containing a target video and a target picture;
a third obtaining module, configured to obtain a pre-trained target video style conversion model corresponding to the target picture; and
a conversion module, configured to convert the target video according to the target video style conversion model to generate a video style matching the target picture.
To achieve the above objectives, an embodiment of a fifth aspect of the application proposes a computer device, comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein when the processor executes the program, the video style conversion processing method described in the foregoing method embodiments is implemented.
To achieve the above objectives, an embodiment of a sixth aspect of the application proposes a computer program product, wherein when instructions in the computer program product are executed by a processor, the video style conversion processing method described in the foregoing method embodiments is implemented.
To achieve the above objectives, an embodiment of a seventh aspect of the application proposes a non-transitory computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by a processor, the video style conversion processing method described in the foregoing method embodiments is implemented.
The technical solutions provided by the embodiments of the application may have the following beneficial effects:
A sample picture and a corresponding sample video set for model training are obtained. The style attribute information of the sample picture is obtained, and then, during training of the target model, a first target output vector of a network layer in the target model that reflects style attribute features is set according to the style attribute information. The content information and optical flow field information of each video frame in the sample videos are obtained, and then, during training of the target model, a second target output vector of a network layer in the target model that reflects content features is set according to the content information of the current input video frame, and a third target output vector of a network layer in the target model that reflects optical flow field features is set according to the optical flow field information of the current input video frame. The network parameters of each network layer in the target model are trained according to the first, second, and third target output vectors, and a video style conversion model corresponding to the sample picture is generated from the target model and the target network parameters obtained when the preset training condition is met, so that the target video is converted according to the video style conversion model to generate a video style matching the sample picture. The efficiency of video style conversion processing is thereby improved while video fluency is ensured.
Additional aspects and advantages of the application will be set forth in part in the following description, and will in part become apparent from the description or be learned through practice of the application.
Brief description of the drawings
The above and/or additional aspects and advantages of the application will become apparent and readily understood from the following description of the embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a flowchart of a video style conversion processing method according to one embodiment of the application;
Fig. 2 is a flowchart of a video style conversion processing method according to another embodiment of the application;
Fig. 3 is a structural schematic diagram of a video style conversion processing device according to one embodiment of the application;
Fig. 4 is a structural schematic diagram of a video style conversion processing device according to another embodiment of the application;
Fig. 5 is a structural schematic diagram of a video style conversion processing device according to a further embodiment of the application.
Specific embodiments
The embodiments of the application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the application; they should not be construed as limiting the application.
This application aims to solve the problems in the related art that performing style conversion with classic frame-by-frame, picture-based methods takes a long time, and that the video content after style conversion does not have the same optical flow field as the content before conversion, which affects the fluency of the video.
In the application, a video style conversion model is generated to ensure that the video content remains unchanged before and after conversion processing and only the style of the video is converted, thereby ensuring the fluency of the video.
The video style conversion processing method and device of the embodiments of the application are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a video style conversion processing method according to one embodiment of the application. As shown in Fig. 1, the method includes:
Step 101: obtain a sample picture and a corresponding sample video set for model training.
It can be understood that one sample picture represents one artistic style; that is, one sample picture corresponds to one video style conversion model.
It can also be understood that, in order for the video style conversion model generated by training in this example to be applicable to style conversion of different target videos, videos of as many different scenes as possible need to be obtained as the sample video set for model training.
To improve the validity of the video style conversion model, natural-scene videos on the order of nearly a thousand are used as the sample video set for model training wherever possible.
It should be noted that, to further improve the richness and accuracy of model training, as one possible implementation, the size of the sample picture and/or of each video frame in the sample video set is adjusted according to the input picture size of the target model, so that the adjusted size of the sample picture and/or of each video frame in the sample video set matches the input picture size.
The input picture size can be set according to the actual application scenario, and there are many ways to adjust the size of the sample picture and/or of each video frame in the sample video set, as illustrated below:
First example: crop the sample picture and/or each video frame in the sample video set to the required size.
The crop may be taken at any position of the sample picture and/or of each video frame in the sample video set, which further improves the flexibility of the processing.
Second example: resize the sample picture and/or each video frame in the sample video set by interpolation.
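The two adjustment strategies above (cropping and interpolation) can be sketched as follows. This is a minimal illustration with numpy only; the function names, the center-crop choice, and the use of bilinear interpolation specifically are assumptions, since the patent does not fix these details.

```python
import numpy as np

def center_crop(frame: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Crop a (H, W, C) frame to (out_h, out_w, C) around its center."""
    h, w = frame.shape[:2]
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return frame[top:top + out_h, left:left + out_w]

def bilinear_resize(frame: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Resize a (H, W, C) frame to (out_h, out_w, C) by bilinear interpolation."""
    h, w = frame.shape[:2]
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]   # vertical blend weights
    wx = (xs - x0)[None, :, None]   # horizontal blend weights
    top = frame[y0][:, x0] * (1 - wx) + frame[y0][:, x1] * wx
    bot = frame[y1][:, x0] * (1 - wx) + frame[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

# Adjust a 720p frame to a hypothetical 256x256 model input size
frame = np.random.rand(720, 1280, 3)
assert center_crop(frame, 256, 256).shape == (256, 256, 3)
assert bilinear_resize(frame, 256, 256).shape == (256, 256, 3)
```

Cropping preserves local detail but discards the rest of the frame; interpolation keeps the whole frame at the cost of rescaling, which is why the patent offers both as alternatives.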
Step 102: obtain the style attribute information of the sample picture, and then, during training of the target model, set, according to the style attribute information, a first target output vector of a network layer in the target model that reflects style attribute features.
Specifically, different sample pictures have different style attribute information, which can be obtained by image processing algorithms in the related art. For example, with Van Gogh's The Starry Night as the sample picture, the obtained style attribute information may include a sky as blue as the sea and a soft, quiet tonal style.
It can be understood that the target model has multiple network layers, one or more of which can be used as needed as the network layer reflecting style attribute features, with its first target output vector set according to the style attribute information. The target model may be, for example, a VGG19 model trained on ImageNet.
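In perceptual style-transfer work, the style signal at a chosen layer is commonly summarized as the Gram matrix of that layer's feature maps. The patent does not specify this, so the sketch below is only one plausible way the "first target output vector" could be derived from a VGG19-style layer activation; the normalization and shapes are assumptions.

```python
import numpy as np

def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Channel-by-channel correlations of a (C, H, W) activation map.

    The flattened Gram matrix can serve as the style target for the
    network layer that reflects style attribute features."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)  # normalize by feature map size

# Style target computed once from the sample picture's layer activation
sample_activation = np.random.rand(64, 32, 32)
style_target = gram_matrix(sample_activation).reshape(-1)
assert style_target.shape == (64 * 64,)
```

Because the Gram matrix discards spatial layout and keeps only feature co-occurrence statistics, it captures "style" (texture, tone) independently of the picture's content.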
Step 103: obtain the content information and optical flow field information of each video frame in the sample videos, and then, during training of the target model, set, according to the content information of the current input video frame, a second target output vector of a network layer in the target model that reflects content features, and set, according to the optical flow field information of the current input video frame, a third target output vector of a network layer in the target model that reflects optical flow field features.
It can be understood that a sample video is composed of multiple video frames, each of which has corresponding content information and optical flow field information. The content information includes, for example, text and images; the optical flow field information includes, for example, the motion information of objects and rich information about the three-dimensional structure of the scene.
Similarly, one or more of the multiple network layers of the target model can be used as needed as the network layer reflecting content features, with its second target output vector set according to the content information of the current input video frame; and one or more layers can be used as the network layer reflecting optical flow field features, with its third target output vector set according to the optical flow field information of the current input video frame.
For example, if the target model has 90 network layers, the bottom 30 layers may serve as the network layers reflecting style attribute features, the middle 30 layers as the network layers reflecting content features, and the top 30 layers as the network layers reflecting optical flow field features.
It should be emphasized that a network layer in the target model cannot be assigned repeatedly. For example, a layer that has been set as a network layer reflecting style attribute features cannot be set again; likewise, a layer that has been set as a network layer reflecting optical flow field features cannot be set again.
That is, different network layers in the target model reflect different features, such as style attribute features, content features, and optical flow field features, which improves the validity of model training.
Step 104: train the network parameters of each network layer in the target model according to the first target output vector, the second target output vector, and the third target output vector, and generate, from the target model and the target network parameters obtained when a preset training condition is met, a video style conversion model corresponding to the sample picture, so that the target video can be converted according to the video style conversion model to generate a video style matching the sample picture.
Specifically, the network parameters of each network layer in the target model are trained with the first, second, and third target output vectors, so that the parameters of the different network layers in the target model approach those of the corresponding network layers reflecting style attribute features, content features, and optical flow field features, until the preset training condition is met. The corresponding network parameters and the target model can then be used to generate the video style conversion model corresponding to the sample picture.
In turn, the target video can be converted according to the video style conversion model to generate a video style matching the sample picture.
It can be understood that the pictures output by the video style conversion model and the sample picture are trained against the network layer of the target model that reflects style attribute features, which guarantees style similarity between the two. The output pictures and the video frames are trained against the network layer of the target model that reflects content features, which guarantees content similarity between the two, thereby ensuring similarity in both content and style. In addition, the optical flow field of the video before conversion and that of the video after conversion are trained to be similar, so that the video after style conversion processing is smooth between frames.
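The three training signals described above are typically combined into a single weighted loss: a style term against the sample picture's target, a content term against the input frame's features, and a temporal term that compares the current stylized frame with the previous stylized frame warped along the optical flow field. The squared-error form, the loss weights, and the integer backward-warp below are illustrative assumptions, not details fixed by the patent.

```python
import numpy as np

def warp_by_flow(frame: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Backward-warp a (H, W) frame by an integer-valued (H, W, 2) flow field."""
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(ys - flow[..., 1].astype(int), 0, h - 1)
    src_x = np.clip(xs - flow[..., 0].astype(int), 0, w - 1)
    return frame[src_y, src_x]

def total_loss(style_out, style_target, content_out, content_target,
               stylized_t, stylized_prev, flow,
               w_style=1.0, w_content=1.0, w_temporal=1.0):
    style_loss = np.mean((style_out - style_target) ** 2)
    content_loss = np.mean((content_out - content_target) ** 2)
    # Temporal coherence: the current stylized frame should match the
    # previous stylized frame moved along the optical flow field.
    temporal_loss = np.mean((stylized_t - warp_by_flow(stylized_prev, flow)) ** 2)
    return w_style * style_loss + w_content * content_loss + w_temporal * temporal_loss

# With zero flow and an unchanged stylized frame, the temporal term vanishes
prev = np.random.rand(6, 8)
zero_flow = np.zeros((6, 8, 2))
assert total_loss(np.zeros(4), np.zeros(4), np.zeros(4), np.zeros(4),
                  prev, prev, zero_flow) == 0.0
```

The temporal term is what distinguishes video style conversion from per-frame picture conversion: it penalizes exactly the between-frame mutations that the background section identifies as the cause of poor fluency.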
In practical applications, different users have different preferences for artistic styles. For the video style conversion models trained and generated in this example to satisfy the video style conversion requirements of different users, sample pictures of as many different artistic styles as possible need to be obtained to generate multiple different video style conversion models.
Specifically, sample pictures of different artistic styles can be selected at random or purposively. As one example, art pictures of various Western painting styles are obtained as sample pictures to generate multiple different video style conversion models. As another example, art pictures of various Western painting styles, various traditional Chinese painting styles, and various Japanese animation styles are obtained as sample pictures to generate multiple different video style conversion models.
In use, a target video style conversion model is selected according to the user's demand, and the target video is converted to generate a video style matching the sample picture, where the artistic style of the sample picture is the target artistic style the user wants to convert to.
In this example, to ensure that the video style conversion model can perform conversion in real time on a terminal device, after the video style conversion model corresponding to the sample picture is generated, the target network parameters also need to be computed according to a preset algorithm, and the network layers corresponding to candidate network parameters whose computed results meet a preset filter condition are deleted.
As one possible implementation, a filtering algorithm based on the L1 norm computes the absolute values of the target network parameters, and the network layers corresponding to candidate network parameters whose absolute values are less than a preset threshold are deleted, thereby compressing and accelerating the video style conversion model.
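A minimal sketch of that L1-norm filtering step, assuming the "network parameters" are convolution filters and that whole filters whose mean absolute weight falls below the threshold are dropped; the patent states the criterion but not the exact granularity, so the per-filter grouping here is an assumption.

```python
import numpy as np

def prune_by_l1_norm(filters: np.ndarray, threshold: float) -> np.ndarray:
    """Keep only filters whose mean |weight| reaches the threshold.

    filters: (num_filters, in_channels, k, k) convolution weights.
    Returns the retained filters; the dropped ones (and the outputs
    they produce) would be deleted from the deployed model."""
    l1 = np.abs(filters).mean(axis=(1, 2, 3))
    keep = l1 >= threshold
    return filters[keep]

weights = np.stack([np.full((3, 3, 3), v) for v in (0.01, 0.5, 0.002, 0.3)])
pruned = prune_by_l1_norm(weights, threshold=0.05)
assert pruned.shape[0] == 2  # the two small-norm filters are removed
```

Filters with tiny L1 norms contribute little to the output, so removing them shrinks and speeds up the model with minimal quality loss, which is the compression-and-acceleration effect the paragraph describes.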
That is, after the training of the video style conversion model is completed, explicit optical flow field computation is no longer needed to generate a smooth stylized video. This greatly improves the processing speed of the video style conversion model and further increases its practicality.
In this example, when the video style conversion model is deployed, a memory reuse technique can also be used for efficient utilization of memory.
As one possible implementation, memory reuse is configured for the network layers in the video style conversion model, so that the processing data of the network layers stored in memory is deleted during conversion of the target video according to the video style conversion model.
That is, after style conversion processing is performed on target video A, the processing data of the network layers configured for memory reuse in the video style conversion model can be deleted, so that style conversion processing can be performed on the next target video B. This improves the efficiency of video style conversion processing.
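Memory reuse of this kind can be sketched as a small buffer pool that hands the same preallocated arrays to successive layers (and successive videos) in ping-pong fashion, overwriting stale layer outputs instead of allocating fresh memory. The pool below is an illustrative assumption, not the patent's implementation.

```python
import numpy as np

class BufferPool:
    """Reuse two preallocated buffers in ping-pong fashion across layers."""
    def __init__(self, shape):
        self._buffers = [np.empty(shape), np.empty(shape)]
        self._idx = 0

    def next(self) -> np.ndarray:
        # A layer writes its output here; the data a previous layer left
        # in this buffer is overwritten (i.e., "deleted") rather than kept.
        buf = self._buffers[self._idx]
        self._idx ^= 1
        return buf

pool = BufferPool((256, 256, 3))
a = pool.next(); b = pool.next(); c = pool.next()
assert a is c and a is not b  # buffer a is reused for the third layer
```

Two buffers suffice for a sequential network because each layer only needs its input (the previous buffer) and its output (the other buffer), which keeps peak memory constant regardless of network depth.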
In summary, the video style conversion processing method of the embodiments of the application obtains a sample picture and a corresponding sample video set for model training; obtains the style attribute information of the sample picture and, during training of the target model, sets, according to the style attribute information, a first target output vector of a network layer in the target model that reflects style attribute features; obtains the content information and optical flow field information of each video frame in the sample videos and, during training of the target model, sets, according to the content information of the current input video frame, a second target output vector of a network layer in the target model that reflects content features, and sets, according to the optical flow field information of the current input video frame, a third target output vector of a network layer in the target model that reflects optical flow field features; trains the network parameters of each network layer in the target model according to the three target output vectors; and generates, from the target model and the target network parameters obtained when the preset training condition is met, a video style conversion model corresponding to the sample picture, so that the target video is converted according to the video style conversion model to generate a video style matching the sample picture. The efficiency of video style conversion processing is thereby improved while video fluency is ensured.
Fig. 2 is a flowchart of a video style conversion processing method according to another embodiment of the application. As shown in Fig. 2, the method includes:
Step 201: obtain a video style conversion request containing a target video and a target picture.
Step 202: obtain a pre-trained target video style conversion model corresponding to the target picture.
Step 203: convert the target video according to the target video style conversion model to generate a video style matching the target picture.
Specifically, different users have different preferences for artistic styles, and their usage scenarios and demands for artistic styles may also differ. Therefore, sample pictures of as many different artistic styles as possible need to be obtained to generate multiple different video style conversion models.
In use, a video style conversion request containing a target video and a target picture is obtained. That is, when a user needs to perform style conversion on a target video, the target video and the target picture are first determined: the target video is the video to be converted, and the target picture represents the target style to convert to. The pre-trained target video style conversion model corresponding to the target picture, i.e., to the target style, is then obtained, and the target video is converted according to the target video style conversion model to generate a video style matching the target picture. Video style conversion processing can thus be implemented quickly, improving the user experience.
To realize the above embodiments, an embodiment of the present application further proposes a video style conversion processing apparatus. Fig. 3 is a structural schematic diagram of the video style conversion processing apparatus according to one embodiment of the present application. As shown in Fig. 3, the apparatus comprises: a first acquisition module 310, a first setting module 320, a second setting module 330 and a training generation module 340.
The first acquisition module 310 is configured to obtain sample pictures for model training and a corresponding sample video set.
The first setting module 320 is configured to obtain style attribute information of the sample pictures and, during training of a target model, to set according to the style attribute information a first target output vector of a network layer reflecting style attribute features in the target model.
The second setting module 330 is configured to obtain content information and optical flow field information of each video frame in the sample videos and, during training of the target model, to set according to the content information of the currently input video frame a second target output vector of a network layer reflecting content features in the target model, and to set according to the optical flow field information of the currently input video frame a third target output vector of a network layer reflecting optical flow field features in the target model.
The training generation module 340 is configured to train the network parameters of each network layer in the target model according to the first target output vector, the second target output vector and the third target output vector, and to generate, from the target model and the target network parameters obtained when a preset training condition is met, a video style conversion model corresponding to the sample pictures, so that conversion processing can be performed on a target video according to the video style conversion model to generate a video style matching the sample pictures.
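The three target output vectors described above suggest a training objective with one loss term per supervised network layer (style attribute, content, optical flow field). A minimal sketch of such a combined loss, assuming squared-error terms and tunable weights (both the loss form and the weights are assumptions; the patent specifies neither):

```python
# Sketch of a combined training objective: each of the three target output
# vectors contributes one loss term. Squared error and the per-term
# weights are assumed choices for illustration.

def squared_error(output, target):
    return sum((o - t) ** 2 for o, t in zip(output, target))

def total_loss(style_out, style_target,
               content_out, content_target,
               flow_out, flow_target,
               w_style=1.0, w_content=1.0, w_flow=1.0):
    return (w_style * squared_error(style_out, style_target)
            + w_content * squared_error(content_out, content_target)
            + w_flow * squared_error(flow_out, flow_target))

# Example: the style layer already matches its target (term = 0), the
# content layer is off by 1 (term = 1), the flow layer is off by 2 and
# down-weighted by 0.5 (term = 0.5 * 4 = 2).
loss = total_loss([1.0, 2.0], [1.0, 2.0],
                  [0.0], [1.0],
                  [2.0], [0.0],
                  w_flow=0.5)
# loss = 0 + 1 + 2 = 3.0
```

Training then minimizes this total loss over the network parameters of every layer; once the preset training condition (for example a loss threshold or an iteration count) is met, the current parameters define the video style conversion model.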
In one embodiment of the present application, as shown in Fig. 4, on the basis of Fig. 3 the apparatus further comprises: an adjustment module 350, a calculation and deletion module 360, and a multiplexing deletion module 370.
The adjustment module 350 is configured to adjust, according to the input picture dimensions of the target model, the size of the sample pictures and/or of each video frame in the sample video set, so that the size of the adjusted sample pictures and/or of each video frame in the sample video set matches the input picture dimensions.
In one embodiment of the present application, cropping processing is performed on the size of the sample pictures and/or of each video frame in the sample video set; alternatively, interpolation processing is performed on the size of the sample pictures and/or of each video frame in the sample video set.
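The adjustment can be sketched as follows: frames at least as large as the model's input dimensions are cropped, and smaller frames are interpolated up. Representing a frame as a list of pixel rows, cropping from the top-left, and using nearest-neighbour interpolation are simplifying assumptions for illustration:

```python
# Sketch of the size-matching step: crop frames that are large enough,
# otherwise interpolate (nearest neighbour) up to the model's input size.

def fit_to_input(frame, target_h, target_w):
    h, w = len(frame), len(frame[0])
    if h >= target_h and w >= target_w:
        # cropping processing: keep the top-left target_h x target_w region
        return [row[:target_w] for row in frame[:target_h]]
    # interpolation processing: nearest-neighbour upsampling
    return [[frame[i * h // target_h][j * w // target_w]
             for j in range(target_w)]
            for i in range(target_h)]

big = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # 3x3 frame, model input 2x2
small = [[1, 2], [3, 4]]                   # 2x2 frame, model input 4x4
cropped = fit_to_input(big, 2, 2)          # [[1, 2], [4, 5]]
upsampled = fit_to_input(small, 4, 4)      # first row: [1, 1, 2, 2]
```

In practice a real implementation would operate on tensors and could use bilinear rather than nearest-neighbour interpolation; the branch structure (crop when large, interpolate when small) is the point being illustrated.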
The calculation and deletion module 360 is configured to calculate the target network parameters according to a preset algorithm, and to delete the network layers corresponding to candidate network parameters whose calculation results meet a preset filtering condition.
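This pruning step can be sketched as follows, taking the mean absolute parameter value as an assumed "preset algorithm" and a magnitude threshold as the assumed "preset filtering condition" (the patent names neither):

```python
# Sketch of pruning: compute a statistic over each layer's parameters and
# delete layers whose result meets the filtering condition (here: a small
# mean absolute value, i.e. the layer contributes little to the output).

def prune_layers(layers, threshold=0.1):
    """layers: dict mapping layer name -> list of parameter values."""
    kept = {}
    for name, params in layers.items():
        mean_abs = sum(abs(p) for p in params) / len(params)
        if mean_abs >= threshold:
            kept[name] = params   # layer survives pruning
    return kept

model_params = {
    "conv1": [0.5, -0.4],    # mean |p| = 0.45  -> kept
    "conv2": [0.01, -0.02],  # mean |p| = 0.015 -> deleted
    "conv3": [0.3],          # mean |p| = 0.30  -> kept
}
pruned = prune_layers(model_params)
```

Deleting such layers shrinks the trained model, which is consistent with the stated goal of faster conversion processing at request time.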
The multiplexing deletion module 370 is configured to perform memory multiplexing setting on the network layers in the video style conversion model, so that during conversion processing of a target video according to the video style conversion model, the processing data of the network layers stored in memory is deleted.
That is, after training of the video style conversion model is completed, explicit computation of the optical flow field is no longer needed to generate a smooth stylized video. This greatly improves the processing speed of the video style conversion model and further increases its practicality.
That is, after style conversion processing is performed on a target video A, the processing data of the network layers with memory multiplexing set in the video style conversion model can be deleted, so that style conversion processing can then be performed on the next target video B. This improves the efficiency of video style conversion processing.
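The memory multiplexing behaviour can be sketched as a per-layer cache that is explicitly released between videos, so the memory used while stylizing video A is reused for video B. The cache-and-release design below is an illustrative assumption, not the patent's implementation:

```python
# Sketch of memory multiplexing: a network layer keeps intermediate
# processing data in memory while one video is converted, then deletes
# that data so the same memory serves the next video.

class ReusableLayer:
    def __init__(self):
        self._cache = {}  # processing data stored in memory

    def forward(self, frame_id, frame):
        out = frame.upper()          # stand-in for the real layer computation
        self._cache[frame_id] = out  # kept while the current video runs
        return out

    def release(self):
        # delete the stored processing data so memory can be multiplexed
        self._cache.clear()

layer = ReusableLayer()
video_a = ["a0", "a1"]
stylized_a = [layer.forward(i, f) for i, f in enumerate(video_a)]
layer.release()                      # free memory before target video B
```

Releasing between videos keeps peak memory bounded by a single video's working set, rather than growing with the number of videos processed.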
It should be noted that the foregoing explanation of the video style conversion processing method embodiments also applies to the video style conversion processing apparatus of this embodiment; the realization principles are similar and are not repeated here.
In summary, the video style conversion processing apparatus of the embodiment of the present application obtains sample pictures for model training and a corresponding sample video set; obtains style attribute information of the sample pictures and, during training of the target model, sets according to the style attribute information a first target output vector of the network layer reflecting style attribute features in the target model; obtains content information and optical flow field information of each video frame in the sample videos and, during training of the target model, sets according to the content information of the currently input video frame a second target output vector of the network layer reflecting content features in the target model, and sets according to the optical flow field information of the currently input video frame a third target output vector of the network layer reflecting optical flow field features in the target model; trains the network parameters of each network layer in the target model according to the first, second and third target output vectors; and generates, from the target model and the target network parameters obtained when the preset training condition is met, a video style conversion model corresponding to the sample pictures, so that conversion processing can be performed on a target video according to the video style conversion model to generate a video style matching the sample pictures. Video fluency is thereby ensured while the efficiency of video style conversion processing is improved.
To realize the above embodiments, an embodiment of the present application further proposes a video style conversion processing apparatus. Fig. 5 is a structural schematic diagram of the video style conversion processing apparatus according to a further embodiment of the present application. As shown in Fig. 5, the apparatus comprises: a second acquisition module 510, a third acquisition module 520 and a conversion module 530.
The second acquisition module 510 is configured to obtain a video style conversion request comprising a target video and a target picture.
The third acquisition module 520 is configured to obtain a pre-trained target video style conversion model corresponding to the target picture.
The conversion module 530 is configured to perform conversion processing on the target video according to the target video style conversion model to generate a video style matching the target picture.
In use, a video style conversion request comprising a target video and a target picture is obtained. That is, when a user needs to perform style conversion on a target video, the target video and the target picture are first determined: the target video is the video to be converted, and the target picture represents the target style. A pre-trained target video style conversion model corresponding to the target picture, that is, corresponding to the target style, is then obtained, and conversion processing is performed on the target video according to this model to generate a video style matching the target picture. Video style conversion processing can thus be realized quickly, improving user experience.
To realize the above embodiments, an embodiment of the present application further proposes a computer device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the program, the video style conversion processing described in the foregoing method embodiments is realized.
To realize the above embodiments, the present application further proposes a computer program product; when the instructions in the computer program product are executed by a processor, the video style conversion processing method described in the foregoing method embodiments is realized.
To realize the above embodiments, the present application further proposes a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the video style conversion processing method described in the foregoing method embodiments is realized.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided there is no mutual contradiction, those skilled in the art may combine the features of the different embodiments or examples described in this specification.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, for example two, three, etc., unless otherwise specifically defined.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code comprising one or more executable instructions for realizing custom logic functions or steps of the process, and the scope of the preferred embodiments of the present application includes other realizations in which the functions may be executed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
The logic and/or steps represented in the flowcharts or otherwise described herein, which may for example be considered an ordered list of executable instructions for realizing logic functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection with one or more wirings (electronic device), a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or, if necessary, processing it in another suitable manner, and then stored in a computer memory.
It should be appreciated that each part of the present application may be realized in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be realized by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if realized in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: a discrete logic circuit having logic gate circuits for realizing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field-programmable gate array (FPGA), etc.
Those skilled in the art can understand that all or part of the steps carried by the above embodiment methods may be completed by instructing the relevant hardware through a program; the program may be stored in a computer-readable storage medium, and when executed, the program includes one of, or a combination of, the steps of the method embodiments.
In addition, the functional units in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The above integrated module may be realized in the form of hardware or in the form of a software functional module. If the integrated module is realized in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, etc. Although the embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and shall not be understood as limiting the present application; those skilled in the art may change, modify, replace and vary the above embodiments within the scope of the present application.

Claims (14)

1. A video style conversion processing method, characterized by comprising the following steps:
obtaining sample pictures for model training and a corresponding sample video set;
obtaining style attribute information of the sample pictures, and then, during training of a target model, setting according to the style attribute information a first target output vector of a network layer reflecting style attribute features in the target model;
obtaining content information and optical flow field information of each video frame in the sample videos, and then, during training of the target model, setting according to the content information of the currently input video frame a second target output vector of a network layer reflecting content features in the target model, and setting according to the optical flow field information of the currently input video frame a third target output vector of a network layer reflecting optical flow field features in the target model;
training the network parameters of each network layer in the target model according to the first target output vector, the second target output vector and the third target output vector, and generating a video style conversion model corresponding to the sample pictures according to the target model and the target network parameters corresponding to a preset training condition being met, so as to perform conversion processing on a target video according to the video style conversion model to generate a video style matching the sample pictures.
2. The method according to claim 1, characterized in that, after obtaining the sample pictures for model training and the corresponding sample video set, the method further comprises:
adjusting, according to the input picture dimensions of the target model, the size of the sample pictures and/or of each video frame in the sample video set, so that the size of the adjusted sample pictures and/or of each video frame in the sample video set matches the input picture dimensions.
3. The method according to claim 2, characterized in that adjusting the size of the sample pictures and/or of each video frame in the sample video set comprises:
performing cropping processing on the size of the sample pictures and/or of each video frame in the sample video set; or,
performing interpolation processing on the size of the sample pictures and/or of each video frame in the sample video set.
4. The method according to claim 1, characterized in that, after generating the video style conversion model corresponding to the sample pictures, the method further comprises:
calculating the target network parameters according to a preset algorithm, and deleting the network layers corresponding to candidate network parameters whose calculation results meet a preset filtering condition.
5. The method according to claim 1, characterized in that, before performing conversion processing on the target video according to the video style conversion model to generate a video style matching the sample pictures, the method further comprises:
performing memory multiplexing setting on the network layers in the video style conversion model, so that during conversion processing of the target video according to the video style conversion model, the processing data of the network layers stored in memory is deleted.
6. A video style conversion processing method, characterized by comprising the following steps:
obtaining a video style conversion request comprising a target video and a target picture;
obtaining a pre-trained target video style conversion model corresponding to the target picture;
performing conversion processing on the target video according to the target video style conversion model to generate a video style matching the target picture.
7. A video style conversion processing apparatus, characterized by comprising:
a first acquisition module, configured to obtain sample pictures for model training and a corresponding sample video set;
a first setting module, configured to obtain style attribute information of the sample pictures and, during training of a target model, to set according to the style attribute information a first target output vector of a network layer reflecting style attribute features in the target model;
a second setting module, configured to obtain content information and optical flow field information of each video frame in the sample videos and, during training of the target model, to set according to the content information of the currently input video frame a second target output vector of a network layer reflecting content features in the target model, and to set according to the optical flow field information of the currently input video frame a third target output vector of a network layer reflecting optical flow field features in the target model;
a training generation module, configured to train the network parameters of each network layer in the target model according to the first target output vector, the second target output vector and the third target output vector, and to generate a video style conversion model corresponding to the sample pictures according to the target model and the target network parameters corresponding to the preset training condition being met, so as to perform conversion processing on a target video according to the video style conversion model to generate a video style matching the sample pictures.
8. A video style conversion processing apparatus, characterized by comprising:
a second acquisition module, configured to obtain a video style conversion request comprising a target video and a target picture;
a third acquisition module, configured to obtain a pre-trained target video style conversion model corresponding to the target picture;
a conversion module, configured to perform conversion processing on the target video according to the target video style conversion model to generate a video style matching the target picture.
9. A computer device, characterized by comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein when the processor executes the program, the video style conversion processing according to any one of claims 1 to 5 is realized.
10. A computer program product, characterized in that, when the instructions in the computer program product are executed by a processor, the video style conversion processing method according to any one of claims 1 to 5 is realized.
11. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the video style conversion processing method according to any one of claims 1 to 5 is realized.
12. A computer device, characterized by comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein when the processor executes the program, the video style conversion processing according to claim 6 is realized.
13. A computer program product, characterized in that, when the instructions in the computer program product are executed by a processor, the video style conversion processing method according to claim 6 is realized.
14. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the video style conversion processing method according to claim 6 is realized.
CN201811220100.XA 2018-10-19 2018-10-19 Video style conversion processing method and device Active CN109359687B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811220100.XA CN109359687B (en) 2018-10-19 2018-10-19 Video style conversion processing method and device


Publications (2)

Publication Number Publication Date
CN109359687A true CN109359687A (en) 2019-02-19
CN109359687B CN109359687B (en) 2020-11-24

Family

ID=65345917

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811220100.XA Active CN109359687B (en) 2018-10-19 2018-10-19 Video style conversion processing method and device

Country Status (1)

Country Link
CN (1) CN109359687B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110599421A (en) * 2019-09-12 2019-12-20 腾讯科技(深圳)有限公司 Model training method, video fuzzy frame conversion method, device and storage medium
CN111556244A (en) * 2020-04-23 2020-08-18 北京百度网讯科技有限公司 Video style migration method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102355555A (en) * 2011-09-22 2012-02-15 中国科学院深圳先进技术研究院 Video processing method and system
CN105303598A (en) * 2015-10-23 2016-02-03 浙江工业大学 Multi-style video artistic processing method based on texture transfer
WO2018075927A1 (en) * 2016-10-21 2018-04-26 Google Llc Stylizing input images
WO2018111786A1 (en) * 2016-12-16 2018-06-21 Microsoft Technology Licensing, Llc Image stylization based on learning network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
操江峰 (Cao Jiangfeng): "Research and Implementation of Image and Video Stylization Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110599421A (en) * 2019-09-12 2019-12-20 腾讯科技(深圳)有限公司 Model training method, video fuzzy frame conversion method, device and storage medium
CN110599421B (en) * 2019-09-12 2023-06-09 腾讯科技(深圳)有限公司 Model training method, video fuzzy frame conversion method, device and storage medium
CN111556244A (en) * 2020-04-23 2020-08-18 北京百度网讯科技有限公司 Video style migration method and device
CN111556244B (en) * 2020-04-23 2022-03-11 北京百度网讯科技有限公司 Video style migration method and device

Also Published As

Publication number Publication date
CN109359687B (en) 2020-11-24

Similar Documents

Publication Publication Date Title
Srinivasan et al. Biases in generative art: A causal look from the lens of art history
Crook et al. Motion graphics: Principles and practices from the ground up
CN107180443B (en) A kind of Freehandhand-drawing animation producing method and its device
CN101754056B (en) Digital content inventory management system supporting automatic mass data processing and the method thereof
CN110188760A (en) A kind of image processing model training method, image processing method and electronic equipment
WO2021259322A1 (en) System and method for generating video
CN110012237A (en) Video generation method and system based on interaction guidance and cloud enhancing rendering
CN109300179A (en) Animation method, device, terminal and medium
CN107392974A (en) Picture generation method and device and terminal equipment
CN109359687A (en) Video style conversion process method and device
CN107820018A (en) User's photographic method, device and equipment
CN109993820A (en) A kind of animated video automatic generation method and its device
CN106780363A (en) Picture processing method and device and electronic equipment
CN109656554A (en) User interface creating method and device
KR20210041057A (en) Technology to capture and edit dynamic depth images
CN105956995A (en) Face appearance editing method based on real-time video proper decomposition
CN110443874A (en) Viewpoint data creation method and device based on convolutional neural networks
WO2023056835A1 (en) Video cover generation method and apparatus, and electronic device and readable medium
Zhao et al. Cartoon image processing: a survey
CN107122393A (en) Electron album generation method and device
Burgert et al. Diffusion illusions: Hiding images in plain sight
WO2024131565A1 (en) Garment image extraction method and apparatus, and device, medium and product
CN108062339B (en) Processing method and device of visual chart
KR102108422B1 (en) System and Method for Optimizing Facial Expression of Virtual Characters through AI-based Facial Expression Classification and Retargeting, and Computer Readable Storage Medium
CN109741442A (en) A method of threedimensional model is quickly generated according to plane picture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant