WO2022141533A1 - Video processing method, video processing apparatus, terminal device, and storage medium - Google Patents


Info

Publication number
WO2022141533A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
template
pit
matching
segment
Prior art date
Application number
PCT/CN2020/142432
Other languages
French (fr)
Chinese (zh)
Inventor
刘志鹏
李熠宸
朱高
朱梦龙
蒋金峰
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to CN202080075426.7A (CN114731458A)
Priority to PCT/CN2020/142432 (WO2022141533A1)
Publication of WO2022141533A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects

Definitions

  • the present application relates to the technical field of video processing, and in particular, to a video processing method, a video processing apparatus, a terminal device, and a storage medium.
  • video content has become the mainstream of self-media, and users share their daily life by shooting short videos.
  • users can freely combine various video materials shot by a shooting device (drone, handheld gimbal, camera or mobile phone), edit and merge multiple video materials into one video, and post it on social networking sites.
  • video editing solutions still require user participation, and there is no effective solution for automatic video editing.
  • the present application provides a video processing method, a video processing device, a terminal device, and a storage medium.
  • the video processing method processes the to-be-processed video material based on a template, aiming to reduce the user's workload during video editing and to provide diverse recommended videos.
  • the present application provides a video processing method, including:
  • the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
  • the present application also provides a video processing method, including:
  • the flow network graph includes a plurality of nodes, and each of the nodes corresponds to a matching relationship between one of the video clips and one of the video pits.
  • the application also provides a video processing method, including:
  • the template includes at least one video pit
  • matching video clips to the video pits of each of the templates to obtain a matching relationship corresponding to each of the templates, and determining a matching score of the matching relationship corresponding to each of the templates, wherein the video clips are fragments of the to-be-processed video material;
  • the video clip is filled in the corresponding video slot of the recommended template according to the matching relationship corresponding to the recommended template to obtain a recommended video.
  • the present application also provides a video processing method, including:
  • the to-be-processed video material is divided to generate a plurality of video segments
  • according to the pit information of the video pits of the template, determining the video clips to be filled in each video pit in the template, and obtaining the matching relationship corresponding to the template;
  • the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
  • the present application further provides a video processing device, the video processing device comprising a processor and a memory;
  • the memory is used to store computer programs
  • the processor is configured to execute the computer program and when executing the computer program, realize:
  • the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
  • the present application further provides a video processing apparatus, the video processing apparatus comprising a processor and a memory;
  • the memory is used to store computer programs
  • the processor is configured to execute the computer program and when executing the computer program, realize:
  • the flow network graph includes a plurality of nodes, and each of the nodes corresponds to a matching relationship between one of the video clips and one of the video pits.
  • the present application further provides a video processing device, the video processing device comprising a processor and a memory;
  • the memory is used to store computer programs
  • the processor is configured to execute the computer program and when executing the computer program, realize:
  • the template includes at least one video pit
  • matching video clips to the video pits of each of the templates to obtain a matching relationship corresponding to each of the templates, and determining a matching score of the matching relationship corresponding to each of the templates, wherein the video clips are fragments of the to-be-processed video material;
  • the video clip is filled in the corresponding video slot of the recommended template according to the matching relationship corresponding to the recommended template to obtain a recommended video.
  • the present application further provides a video processing device, the video processing device comprising a processor and a memory;
  • the memory is used to store computer programs
  • the processor is configured to execute the computer program and when executing the computer program, realize:
  • the to-be-processed video material is divided to generate a plurality of video segments
  • according to the pit information of the video pits of the template, determining the video clips to be filled in each video pit in the template, and obtaining the matching relationship corresponding to the template;
  • the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
  • the present application further provides a terminal device, the terminal device comprising a processor and a memory;
  • the memory is used to store computer programs
  • the processor is configured to execute the computer program and when executing the computer program, realize:
  • the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
  • the present application further provides a terminal device, the terminal device comprising a processor and a memory;
  • the memory is used to store computer programs
  • the processor is configured to execute the computer program and when executing the computer program, realize:
  • the flow network graph includes a plurality of nodes, and each of the nodes corresponds to a matching relationship between one of the video clips and one of the video pits.
  • the present application further provides a terminal device, where the terminal device includes a processor and a memory;
  • the memory is used to store computer programs
  • the processor is configured to execute the computer program and when executing the computer program, realize:
  • the template includes at least one video pit
  • matching video clips to the video pits of each of the templates to obtain a matching relationship corresponding to each of the templates, and determining a matching score of the matching relationship corresponding to each of the templates, wherein the video clips are fragments of the to-be-processed video material;
  • the video clip is filled in the corresponding video slot of the recommended template according to the matching relationship corresponding to the recommended template to obtain a recommended video.
  • the present application further provides a terminal device, the terminal device comprising a processor and a memory;
  • the memory is used to store computer programs
  • the processor is configured to execute the computer program and when executing the computer program, realize:
  • the to-be-processed video material is divided to generate a plurality of video segments
  • according to the pit information of the video pits of the template, determining the video clips to be filled in each video pit in the template, and obtaining the matching relationship corresponding to the template;
  • the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
  • the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the computer program enables the processor to implement the steps of the video processing method described above.
  • Embodiments of the present application provide a video processing method, a video processing device, a terminal device, and a storage medium, which can quickly compose the to-be-processed video material into a video, reduce the workload of video editing, increase the diversity of recommended videos, and improve user experience.
  • FIG. 1 is a flowchart of steps of a video processing method provided by an embodiment of the present application.
  • Fig. 2 is a flowchart of sub-steps of the video processing method provided in Fig. 1;
  • FIG. 3 is a flowchart of steps for clustering and segmenting a first video segment provided by an embodiment of the present application
  • Fig. 4 is the training flow chart of the image feature network model provided by the embodiment of the present application.
  • FIG. 5 is a schematic diagram of a similarity calculation result provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of steps of another video processing method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a flow network diagram constructed in an embodiment of the present application.
  • FIG. 8 is a flowchart of steps of another video processing method provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a source of video material to be processed provided by an embodiment of the present application.
  • Fig. 10 is a flowchart of sub-steps of the video processing method provided in Fig. 8;
  • FIG. 11 is a flowchart of steps for determining a plurality of matching templates according to video tags provided by an embodiment of the present application
  • FIG. 13 is a flowchart of steps of another video processing method provided by an embodiment of the present application.
  • 15 is a flowchart of steps for determining a recommended template according to a matching score provided by an embodiment of the present application
  • Figure 16 is a flowchart of the sub-steps provided in Figure 15 for determining a recommended template according to the matching score;
  • Fig. 17 is a flowchart of sub-steps of the video processing method provided in Fig. 13;
  • FIG. 19 is a schematic diagram of selecting a recommended template from a template group provided by an embodiment of the present application.
  • FIG. 20 is a schematic block diagram of a video processing apparatus provided by an embodiment of the present application.
  • FIG. 21 is a schematic block diagram of a terminal device provided by an embodiment of the present application.
  • FIG. 1 is a flowchart of steps of a video processing method provided by an embodiment of the present application.
  • the video processing method can be applied to a terminal device or a cloud device for synthesizing the video material to be processed and a preset template.
  • terminal devices include mobile phones, tablets, and notebook computers.
  • the video processing method includes steps S101 to S103.
  • the to-be-processed video material is used for synthesizing with a preset template to generate a recommended video for the user, and the preset template includes at least one video pit for filling the video clips.
  • the to-be-processed video material is segmented according to the video information of the to-be-processed video material to obtain a plurality of video clips, so that the video clips can be filled into the video pits of the template to generate a recommended video.
  • the video material to be processed may include video material from various sources, such as video material captured by a handheld terminal, video material captured by a movable platform, video material obtained from a cloud server, and video material obtained from a local server, etc.
  • the handheld terminal may be, for example, a mobile phone, a tablet, a motion camera, etc.
  • the movable platform may be, for example, a drone.
  • the UAV may be a rotary-wing UAV, such as a quad-rotor UAV, a hexa-rotor UAV, an octa-rotor UAV, or a fixed-wing UAV.
  • the drone has camera equipment on it.
  • step S101 includes step S1011 and step S1012.
  • the moving direction of the camera in each obtained first video segment is the same or similar, and there is no change in the moving direction of the camera.
  • the to-be-processed video material may be segmented according to changes in the camera movement direction in the to-be-processed video material to generate a plurality of first video segments.
  • For example, if the video material to be processed includes continuous forward and backward camera movements, it can be divided into two first video clips according to the change of the camera movement direction: the camera movement direction of one first video clip is forward,
  • the camera movement direction of the other first video clip is backward, and the camera movement direction within each first video clip is the same.
  • Algorithms for detecting the camera movement direction can be used to determine changes in the camera movement direction in the video material to be processed.
  • the obtained scenes in each first video segment are similar.
  • the to-be-processed video material includes a snowy mountain image
  • the to-be-processed video material may be divided into multiple first video segments according to whether the to-be-processed video material includes a snowy mountain image and an image similar to the snowy mountain image.
  • the to-be-processed video material may also be divided according to the same subject in the to-be-processed video material.
  • the subject may be a target person or a pet or the like.
  • For example, if the subject is a cat, the to-be-processed video material can be divided into multiple first video segments according to whether the cat appears in the footage.
  • the first video clip may be segmented a second time to obtain a plurality of second video clips, and the second video clips are used as the video clips to be filled in the video pits of the template.
  • cluster segmentation includes clustering using similar scenes.
  • the first video segment can be selectively clustered and segmented, that is, the first video segment is clustered and segmented only when it satisfies certain conditions; otherwise, cluster segmentation may not be performed on it.
  • Before clustering and segmenting the first video clips to obtain a plurality of second video clips, it is determined whether the plurality of first video clips includes a first video clip whose video duration is longer than a preset duration;
  • if there is a first video clip with a video duration greater than the preset duration, the step of clustering and segmenting that first video clip is performed.
  • the preset duration is an empirical value, which can be adjusted according to experience or the duration of the video pits of the template.
  • the second division is performed on these first video clips whose video duration is greater than the preset duration. However, for the first video segment whose video duration is less than or equal to the preset duration, the second division may not be performed.
  • the steps of clustering and segmenting the first video segment refer to FIG. 3 , which specifically includes steps S1012a to S1012c.
  • When performing cluster segmentation on the first video segment, first determine a sliding window and a cluster center, wherein the sliding window is used to determine the current video frame to be processed, and the cluster center is used to determine the video segmentation point of the first video segment.
  • the cluster center includes image features of the first video frame of the first video segment.
  • the image features of the first video frame of the first video clip may include image coding features of the first video frame of the first video clip. The image features are obtained from a pre-trained image feature network model.
  • the image feature network model can output image features of each video frame in the first video segment.
  • the size of the sliding window can be set based on this principle. In one embodiment, the size of the sliding window is equal to one.
  • the size of the sliding window is related to the duration of the first video segment.
  • a larger value may be set for the sliding window to improve the speed of cluster segmentation, for example, setting the size of the sliding window to 3.
  • a smaller value may be set for the sliding window, for example, the size of the sliding window is set to 1.
  • the size of the sliding window is related to the desired segmentation speed set by the user.
  • a larger value can be set for the sliding window to improve the segmentation speed.
  • a smaller value can be set for the sliding window to reduce the segmentation speed.
  • the size of the sliding window is related to the fineness of the segmentation. The smaller the sliding window is, the more video frames are processed and the finer the segmentation is when performing cluster segmentation. Therefore, the size of the sliding window can be set according to the requirement of the segmentation fineness.
  • the image feature network model may be pre-trained.
  • the training flow chart can be shown in Figure 4, and the training process can be as follows:
  • the three video materials are respectively input into three convolutional neural networks (CNNs).
  • Among the three video materials, two belong to the same category and one belongs to a different category.
  • the trained convolutional neural network is used as an image feature network model for outputting image features of each video frame in the first video segment.
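  • A minimal sketch of this kind of training is given below, assuming a weight-sharing ResNet-18 backbone applied to the three inputs and a standard triplet margin loss; the backbone choice, the margin value, and the data pipeline are illustrative assumptions, since the description only specifies three CNNs fed with two same-category materials and one different-category material.

```python
# Hedged sketch of training the image feature network model with triplets.
# Backbone, margin, and weight sharing across the three branches are assumptions.
import torch
import torch.nn as nn
from torchvision import models

class ImageFeatureNet(nn.Module):
    """Maps a video frame (3xHxW tensor) to a normalized embedding vector."""
    def __init__(self, dim=128):
        super().__init__()
        backbone = models.resnet18(weights=None)  # assumed backbone
        backbone.fc = nn.Linear(backbone.fc.in_features, dim)
        self.backbone = backbone

    def forward(self, x):
        return nn.functional.normalize(self.backbone(x), dim=-1)

def train_step(model, optimizer, anchor, positive, negative, margin=0.2):
    """One triplet step: anchor and positive share a category, negative does not."""
    loss_fn = nn.TripletMarginLoss(margin=margin)
    loss = loss_fn(model(anchor), model(positive), model(negative))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```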
  • the current video frame to be processed is determined according to the sliding window, and then the current video frame to be processed and the cluster center are subjected to cluster analysis to determine the video segmentation point.
  • the cluster center is set as the image feature of the first video frame, and the first sliding is performed.
  • when the sliding window is 1, the current video frame to be processed during the first sliding is the second video frame; when the sliding window is 2, the current video frame to be processed during the first sliding is the third video frame;
  • and when the sliding window is N, the current video frame to be processed during the first sliding is the (N+1)-th video frame.
  • step S1012b is specifically: determining the current video frame according to the sliding window, and determining the similarity between the image feature of the current video frame and the cluster center; if the similarity is less than a preset threshold, taking the current video frame as a video segmentation point and re-determining the cluster center; and, according to the re-determined cluster center, continuing to determine video segmentation points until the last video frame of the first video segment.
  • the cluster center is initialized, and the first sliding is performed.
  • the cluster center is set as the image feature of the first video frame of the first video segment, denoted as C_0.
  • the size of the sliding window is denoted as N
  • the current video frame to be processed determined during the first sliding is the (N+1)-th video frame of the first video clip,
  • and the current video frame to be processed determined during the m-th sliding is the (m*N+1)-th video frame of the first video clip.
  • m is the sliding number of the sliding window.
  • the current video frame to be processed (that is, the (N+1)-th frame of the first video clip) is input into the pre-trained image feature network model to obtain the image feature F_(N+1) of the current frame, and the similarity between the image feature C_0 of the cluster center and the image feature F_(N+1) of the current frame is calculated.
  • If the similarity is less than the preset threshold, it is considered that the video content has changed greatly between the first video frame and the (N+1)-th video frame of the first video clip; the (N+1)-th frame is used as a video division point, and the video frames before the (N+1)-th frame of the first video segment are cut into a second video segment. Then, the image feature F_(N+1) of the (N+1)-th frame of the first video segment is used as the re-determined cluster center, and the next video segmentation point continues to be determined until the last video frame of the first video segment is reached.
  • the similarity may be the cosine similarity between image features; the calculated cosine similarity between image features is taken as the similarity between the image feature C_0 of the cluster center and the image feature F_(N+1) of the current frame.
  • the preset threshold may be a preset empirical value.
  • the re-determining the cluster center includes: using the image feature of the current video frame as the re-determined cluster center.
  • the current video frame is used as the video dividing point, and the video frame before the current video frame is cut into a second video segment.
  • the image feature of the current video frame is used as the re-determined clustering center.
  • the cluster center is updated, and, according to the updated cluster center, the similarity between the image feature of the subsequent current video frame and the updated cluster center continues to be determined.
  • If the similarity between the image feature C_0 of the cluster center and the image feature F_(N+1) of the current frame is calculated to be greater than the preset threshold, it is considered that the video content does not change much from the first video frame to the (N+1)-th video frame of the first video clip; the cluster center is then updated, and the similarity between the image feature of the current video frame to be processed at that time and the updated cluster center is determined according to the updated cluster center.
  • For the update, the average of the image features from the first video frame to the current video frame can be calculated, and the calculated average is used as the image feature of the updated cluster center.
  • the updating of the cluster center includes: acquiring the image feature of the current video frame; and determining the updated cluster center according to the image feature of the current video frame and the cluster center.
  • the formula for determining the updated cluster center can be:
  • m is the number of clustering times
  • N is the size of the sliding window
  • C_m is the image feature of the updated cluster center C after the m-th clustering,
  • and F_(m*N+1) is the image feature of the current video frame during the m-th clustering.
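  • The formula itself is not reproduced in this text. A plausible reconstruction consistent with the surrounding description (the updated center as a running average of the frame features examined so far) is sketched below; this exact form is an assumption, not the patent's figure-based formula.

```python
import numpy as np

def update_cluster_center(prev_center, current_feature, m):
    """Assumed running-average update after the m-th clustering (m >= 1):
    C_m = (m * C_(m-1) + F_(m*N+1)) / (m + 1),
    i.e. the mean of the features of the frames examined so far
    (frame 1, frame N+1, ..., frame m*N+1)."""
    return (m * np.asarray(prev_center) + np.asarray(current_feature)) / (m + 1)
```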
  • For example, when the size of the sliding window is 1, the cluster center C is initialized to the first frame, the image feature of the cluster center is C_0, and the first sliding is performed.
  • The current video frame determined by the first sliding is the second frame, whose image feature is F_2, and F_2 is compared with the image feature C_0 of the cluster center.
  • Then the second clustering is performed and the second sliding is carried out.
  • The current video frame determined by the second sliding is the third frame,
  • and the image feature of the third frame is F_3.
  • S1012c Perform video segmentation on the first video segment according to the video segmentation point.
  • video division is performed on the first video segment according to the video division point to obtain a plurality of second video segments.
  • the current video frame is used as the video segmentation point, and the video frame before the current video frame is used as a second video segment to perform segmentation.
  • FIG. 5 is a schematic diagram of a similarity calculation result.
  • the first video clip has a total of 2300 frames, the 490th frame is a video split point, and the 1100th frame is a video split point, then when the video is split, the first video clip is split at the 490th frame and the 1100th frame respectively,
  • Three second video clips are obtained: the 1st to 489th frames of the first video clip form a second video clip, the 490th to 1099th frames form a second video clip, and the 1100th to 2300th frames form a second video clip.
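  • Putting the pieces together, the sliding-window cluster segmentation described above can be sketched as follows; the similarity threshold, the running-average center update, and the per-frame feature interface are assumptions, and the split points at frames 490 and 1100 simply mirror the FIG. 5 illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def cluster_segment(frame_features, window=1, threshold=0.85):
    """Split one first video clip into second video clips.

    frame_features: per-frame image features (e.g. from the image feature
    network model). `window` is the sliding window size N and `threshold`
    the preset similarity threshold; both values are assumptions.
    Returns a list of (start_frame, end_frame) pairs (0-based, inclusive).
    """
    center = np.asarray(frame_features[0], dtype=float)  # C_0: first frame's feature
    segments, seg_start, m = [], 0, 0
    idx = window  # first slide lands on frame index N, i.e. the (N+1)-th frame
    while idx < len(frame_features):
        feat = np.asarray(frame_features[idx], dtype=float)
        if cosine_similarity(center, feat) < threshold:
            # Content changed a lot: current frame becomes a video division point.
            segments.append((seg_start, idx - 1))
            seg_start, center, m = idx, feat, 0  # re-determine the cluster center
        else:
            # Content similar: update the center (running average, an assumption).
            m += 1
            center = (m * center + feat) / (m + 1)
        idx += window
    segments.append((seg_start, len(frame_features) - 1))
    return segments

# With split points at frames 490 and 1100 (1-based, as in FIG. 5), a
# 2300-frame clip would yield segments 1-489, 490-1099 and 1100-2300.
```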
  • S102 Determine, according to the pit information of the video pits of the template, video clips to be filled in each video pit in the template, and obtain a matching relationship corresponding to the template.
  • the matching relationship corresponding to the template is the corresponding relationship between the video pits in the template and the corresponding video clips.
  • the video clip corresponding to the video pit can be determined according to the pit information of the video pit in the template, so as to obtain the corresponding template matching relationship.
  • the pit information includes at least one of pit music and pit tags.
  • the pit tag may be preset, and each video pit may be preset with a pit tag.
  • the pit label includes at least one of: the camera movement direction, the scene, the theme of the pit, the video theme of the video clip to be filled in the video pit, the target size and position in a single video frame of the video clip to be filled in the video pit, the target size and position in consecutive video frames of the video clip to be filled in the video pit, and the similarity of adjacent video frames in the video clip to be filled in the video pit.
  • the pit music of the video pit can be a fragment of the template music of the entire template. Since the template includes a plurality of video pits that are sequentially combined into the template, the template music can be split according to the sequence and duration of the video pits to obtain the pit music corresponding to each video pit.
  • the template may be divided into multiple video pits, and each video pit needs to be filled with one video segment.
  • the template may be segmented according to the template music of the template, and the template may also be segmented according to a plurality of segmented video clips.
  • the template music of the template can be divided into multiple segments according to a certain time interval, each segment of music corresponds to a video pit, and the duration of the music is equal to the duration of the video pit.
  • each piece of music corresponds to a video pit, and the duration of the music is equal to the duration of the video pit.
  • the template music can be divided into multiple segments according to the rhythm of the template, each segment of music corresponds to a video pit, and the duration of the music is equal to the duration of the video pit.
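  • As a rough illustration of splitting the template music into pit music according to the sequence and duration of the video pits, the sketch below computes each pit's music interval from the pit durations; the data layout and the example durations are assumptions for illustration.

```python
def split_template_music(pit_durations):
    """Given the durations (in seconds) of the template's video pits, in order,
    return the (start, end) time interval of the pit music for each pit."""
    intervals, start = [], 0.0
    for duration in pit_durations:
        intervals.append((start, start + duration))
        start += duration
    return intervals

# Example (hypothetical durations): pits of 5 s, 15 s and 10 s take the
# template-music slices (0, 5), (5, 20) and (20, 30) as their pit music.
print(split_template_music([5, 15, 10]))
```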
  • the lengths of the obtained multiple video clips may be different. Therefore, the template can be divided according to the durations of the video clips, so that the durations of the video pits obtained from the division are exactly equal to the durations of the video clips.
  • the 30-second to-be-processed video material is divided into three video clips, and the durations of the three video clips are 5 seconds, 15 seconds, and 10 seconds respectively.
  • the template may be divided into corresponding video pits according to the duration of the three video clips, so as to fill the video clips into the corresponding video pits.
  • When the template is divided according to the duration of the video clips, since there are multiple video clips, the template can be divided according to the highlight level of the video clips.
  • For example, a template needs to be filled with at least one video segment with the highest highlight level. Therefore, at least one video pit with the same duration as the most highlighted video segment can be segmented in the template, for filling in that video segment.
  • the template may also be segmented according to camera movement, shot scale or scene, and the like.
  • the determining of the video segment corresponding to the video pit according to the pit information of the video pit in the template includes: determining the video segment corresponding to the video pit according to the pit music of the video pit in the template.
  • the pit music of the video pit itself has a sense of rhythm, for example, the pit music of the video pit is a soothing rhythm or an explosive rhythm. Therefore, suitable video clips can be matched for video pits according to the rhythm of the pit music and the video content of the video clips.
  • For example, if the music is explosive and there is a powerful burst such as a fountain erupting in the video clip, then the two are a good match.
  • If the music is soothing, it is suitable for slow motion of characters; if it is epic and cinematic, it is suitable for large-scene time-lapse photography.
  • the determining of the video clip corresponding to the video pit according to the pit music of the video pit in the template includes: determining the matching degree between the pit music of the video pit in the template and the video clip; and determining the video clip corresponding to the video pit in the template according to the matching degree.
  • For example, a matching degree threshold can be set, and when the matching degree exceeds the threshold, it is determined that the video pit matches the video clip; alternatively, the video clip with the highest matching degree with the video pit can be selected according to the matching degree as the video clip corresponding to the video pit.
  • the matching degree between the pit music of the video pit in the template and the video clip is obtained by utilizing a pre-trained music matching model, which can output a matching degree score between the pit music of the video pit in the template and the video clip.
  • After the video clips corresponding to the video pits are determined according to the pit information of the video pits in the template, the shooting quality of the plurality of video clips corresponding to a video pit is determined; the optimal video clip corresponding to the video pit in the template is determined according to the shooting quality of the plurality of video clips; and the matching relationship corresponding to the template is obtained according to the optimal video clip corresponding to the video pit in the template.
  • the shooting quality of the video clip is determined according to the image content of the video clip and the evaluation of the video clip.
  • the image content includes lens stability, color saturation, whether there is a main shooting object and the amount of information in the lens, etc.
  • the video clip evaluation includes the aesthetic score of the video clip.
  • the aesthetic scoring of the video clip may take into account factors such as color, composition, camera movement, and scene recognition.
  • the shooting quality of the video clip is scored based on the image content of the video clip and the evaluation of the video clip. The higher the score, the higher the shooting quality of the video clip.
  • the video clip with the highest shooting quality is selected from the video clips as the optimal video clip corresponding to the video pit in the template, so as to obtain the matching relationship corresponding to the template.
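  • A minimal sketch of choosing the optimal clip for a pit by shooting quality might look like the following; the individual factor names and the equal weighting are assumptions standing in for the image-content and aesthetic evaluations described above.

```python
def shooting_quality(clip):
    """Assumed composite score combining image-content factors (stability,
    saturation, presence of a main subject, information amount) with an
    aesthetic score; each factor is expected in [0, 1]."""
    factors = [
        clip["stability"],
        clip["saturation"],
        clip["has_main_subject"],
        clip["information"],
        clip["aesthetic_score"],
    ]
    return sum(factors) / len(factors)

def pick_optimal_clip(candidate_clips):
    """Among several clips matched to the same video pit, keep the one
    with the highest shooting quality."""
    return max(candidate_clips, key=shooting_quality)
```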
  • In some embodiments, after the video clip corresponding to the video pit is determined according to the pit information of the video pit in the template,
  • the matching degree between the video clips corresponding to two adjacent video pits in the template is determined;
  • the optimal video segment corresponding to the video pit is determined according to the matching degree;
  • and the matching relationship corresponding to the template is obtained according to the optimal video segment corresponding to the video pit.
  • the matching degree of the video clips corresponding to two adjacent video pits is determined according to the continuity of the camera movement direction of the video clips, the progression of the shot scale, and match editing.
  • the continuity of the moving direction includes ensuring that the moving directions of the video clips corresponding to two adjacent video pits are in the same direction, and preventing video clips in opposite moving directions from being connected together.
  • the progression of the shot scale includes, for example, going from a long shot to a medium shot to a close shot, from a close shot to a medium shot and then to a long shot, or directly from a long shot to a close shot, and so on.
  • Match editing involves bridging two shots by matching similar motion, graphics, color, etc. to achieve a coherent, fluid narrative for transitions between two video clips.
  • the determining of the video segment corresponding to the video pit according to the pit information of the video pit in the template includes: determining the video segment corresponding to the video pit according to the pit label of the video pit in the template.
  • Each video pit in the template is provided with a pit tag, and tag matching can be performed between the pit tag of the video pit in the template and the tag of the video clip, so that the successfully matched video clip is used as the video clip corresponding to the video pit.
  • the determining of the video clip corresponding to the video pit according to the pit tag of the video pit in the template includes: determining the video tag of the video clip, and using the video clip whose video tag matches the pit tag of the video pit as the video clip to be filled in the video pit.
  • the tag extraction is performed on the video segment to determine the video tag of the video segment, and then the video segment corresponding to the video pit is determined according to the pit tag of the video pit of the template.
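  • Tag matching between a pit and candidate clips can be sketched as below; representing both pit tags and video tags as sets and treating a non-empty intersection as a match is an assumption about what "matching" means here.

```python
def match_clips_to_pit(pit_tags, clips):
    """Return the clips whose extracted video tags match the pit tags.
    `pit_tags` is a set of strings; each clip carries a set under "tags".
    Treating 'match' as a non-empty tag intersection is an assumption."""
    return [clip for clip in clips if pit_tags & clip["tags"]]

# Example (hypothetical tags): a pit labelled {"forward", "long_shot"} would
# accept a clip tagged {"forward", "snow_mountain", "long_shot"}.
```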
  • the video clips are respectively filled into the corresponding video pits of the template, and video synthesis is performed to obtain a recommended video, which is then recommended to the user.
  • step S103 specifically includes: determining whether the video duration of the video clip is greater than the duration of the video pit; if the video duration of the video clip is greater than the duration of the video pit, performing fragment extraction on the video clip to obtain a selected fragment.
  • The duration of the video pit determines the maximum duration of the video clip that can be filled into it. Therefore, when the video duration of the video clip is greater than the duration of the video pit, the video clip cannot be directly filled into the corresponding video pit; it is necessary to extract a segment of the corresponding duration from the video clip and fill it into the corresponding video pit.
  • the video duration of the selected segment is less than or equal to the duration of the video pit.
  • In order to fill the selected segment into the corresponding video pit and ensure the integrity of the obtained recommended video, the video duration of the selected segment can be made equal to the duration of the video pit when the selected segment is determined.
  • the performing segment extraction on the video segment to obtain the selected segment includes: performing segment extraction on the video segment according to the video element of the video segment to obtain the selected segment.
  • segment extraction can be performed on the video segment according to the video element of the video segment to obtain the selected segment.
  • the video elements include at least one of smiley face pictures, laughter audio, character movements, clear human voices, picture composition and aesthetic scores.
  • more exciting clips can be extracted from video clips as selected clips according to video elements, for example, clips including smiling faces or clips with high aesthetic scores are selected as selected clips.
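  • One way to realize this segment extraction is to slide a window of the pit's duration over the clip and keep the window with the best video-element score, as sketched below; the per-frame scoring and the frame-based duration handling are assumptions.

```python
def extract_selected_segment(frame_scores, pit_len_frames):
    """frame_scores: per-frame video-element scores (smiles, laughter, motion,
    voice clarity, composition/aesthetics combined into one number, an
    assumption). pit_len_frames: pit duration expressed in frames.
    Returns (start, end) of the best window whose length equals the pit."""
    if len(frame_scores) <= pit_len_frames:
        return 0, len(frame_scores) - 1
    best_start = 0
    best_sum = window_sum = sum(frame_scores[:pit_len_frames])
    for start in range(1, len(frame_scores) - pit_len_frames + 1):
        # Slide the window by one frame: add the new frame, drop the old one.
        window_sum += frame_scores[start + pit_len_frames - 1] - frame_scores[start - 1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_start + pit_len_frames - 1
```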
  • step S103 is specifically: filling the video clips into the corresponding video pits of the template according to the matching relationship corresponding to the template to obtain an initial video; and performing image optimization on the initial video based on the template requirements of the template to obtain the recommended video.
  • After the video clips are filled into the corresponding video pits according to the matching relationship corresponding to the template, the initial video is obtained; the initial video can then be image-optimized according to the template requirements, and the image-optimized video is recommended to the user as the recommended video.
  • the template requirements include at least one of transition settings, acceleration and deceleration settings, and texture special effects settings.
  • When the source of the video material to be processed is an aerial video shot by a drone, the distance between the camera and the photographed object is relatively long during aerial photography, so the picture content changes slowly. Therefore, when filling the aerial video into the corresponding video pit, the speed of the picture change can be automatically identified, the playback speed of the aerial video can be automatically adjusted according to the speed of the picture change, and the speed-adjusted aerial video can then be filled into the corresponding video pit.
  • the speed of the picture change can be obtained by analyzing a plurality of consecutive frames within a preset time.
  • In addition, when determining the matching relationship between the video pits of the template and the video clips, the identified aerial video can be placed in the first several video pits and/or the last several video pits of the template, thereby improving the quality of the recommended video.
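  • The picture-change speed can be estimated, for example, from mean absolute differences between consecutive frames within a preset time, as in the OpenCV-based sketch below; the thresholds and the mapping from change speed to playback-speed factor are assumptions.

```python
import cv2
import numpy as np

def picture_change_speed(video_path, max_frames=120):
    """Average per-frame change over up to `max_frames` consecutive frames."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    diffs = []
    while ok and len(diffs) < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        diffs.append(np.mean(cv2.absdiff(frame, prev)))
        prev = frame
    cap.release()
    return float(np.mean(diffs)) if diffs else 0.0

def speed_factor(change_speed, slow_threshold=3.0, fast_threshold=12.0):
    """Assumed mapping: slowly changing aerial footage is sped up,
    fast-changing footage is left at normal speed."""
    if change_speed < slow_threshold:
        return 2.0   # speed up 2x
    if change_speed < fast_threshold:
        return 1.5
    return 1.0
```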
  • In the video processing method provided by the above-mentioned embodiment, the video material to be processed is divided to obtain a plurality of video clips; then, according to the pit information of the video pits of the template, the video clips to be filled in each video pit in the template are determined, and the matching relationship corresponding to the template is obtained; finally, the video clips are filled into the corresponding video pits according to the matching relationship to obtain a recommended video.
  • the to-be-processed video material is divided, and the long-duration to-be-processed video material is divided into multiple short-duration video clips, so that the video clips can be smoothly filled into the video pits to synthesize recommended videos.
  • FIG. 6 is a flowchart of steps of another video processing method provided by an embodiment of the present application.
  • the video processing method includes steps S201 to S203.
  • the video material to be processed includes multiple video clips, the template includes at least one video pit, and each video pit needs to be filled with one video clip.
  • the above video segmentation method may also be used to segment the video material to be processed to obtain multiple video segments.
  • the flow network graph includes a plurality of nodes, and each of the nodes corresponds to a matching relationship between one of the video clips and one of the video pits.
  • a flow network graph is constructed according to the video clips and template pits, as shown in FIG. 7 , which is a schematic diagram of the constructed flow network graph.
  • the left vertical axis C_m represents the video segments,
  • and the upper horizontal axis S_n represents the video pits of the template.
  • the coordinates (x, y) represent the node located at horizontal-axis position x and vertical-axis position y.
  • N(1,1) represents the node at the upper left (S_1, C_1),
  • and N(n,m) represents the node at the lower right (S_n, C_m).
  • the number of templates may be one or multiple.
  • a flow network graph is constructed, and when the number of templates is multiple, a flow network graph is constructed for each template.
  • the matching relationship between the video clips in the template and the video pits can be determined based on each node in the streaming network graph.
  • the source node of the flow network graph is used as the starting point of a path, the path passes through one node corresponding to each video pit, and the end point of the flow network graph is used as the end point of the path; such a path corresponds to a matching relationship between the video clips and the video pits of the template.
  • the determining of the matching relationship between the video clips and the video pits based on the flow network graph includes: matching suitable video clips to the video pits of the template based on a maximum flow algorithm to obtain an optimal path; wherein the correspondence between the video segments and the video pits in the optimal path is used as the matching relationship between the video pits of the template and the video segments.
  • Please refer to Fig. 7.
  • the arrows in Fig. 7 represent paths between two adjacent nodes in the flow network graph.
  • a path from the source node S to the end point T is a matching relationship corresponding to the video segment and the video pit.
  • Here, the maximum flow algorithm means that, for a template, the n video pits from S_1 to S_n are filled with appropriate video segments in turn, so that the total energy of the entire path from the source node S to the end point T reaches its maximum value.
  • By using the maximum flow algorithm to match appropriate video clips to the video pits of the template, the problem of video clip selection is transformed into the problem of finding the maximum-energy path from the source node S to the end point T.
  • the matching of suitable video clips to the video pits of the template based on the maximum flow algorithm to obtain the optimal path includes: determining the optimal path corresponding to the template according to the energy values between adjacent nodes in the flow network graph.
  • V can be used to represent the energy value of any path in the flow network graph, and any path includes the path between two nodes.
  • V((x,y),(x+1,y+k)) represents the energy value from node (x,y) to node (x+1,y+k).
  • the energy value between two adjacent nodes may be determined according to the energy value influence factor of each of the nodes.
  • the energy value influence factor includes the shooting quality of the video clip corresponding to each of the video pits, the degree of matching between each of the video pits and the corresponding video clip, and the video corresponding to two adjacent video pits At least one of the matching degrees of the fragments.
  • the shooting quality of the video clip corresponding to each of the video pits is determined according to the image content of the video clip and the evaluation of the video clip.
  • the image content includes lens stability, color saturation, whether there is a main shooting object and the amount of information in the lens, etc.
  • the video clip evaluation includes the aesthetic score of the video clip. Aesthetic scoring of video clips can take into account factors such as color, composition, lens movement, and scene recognition to score video clips aesthetically.
  • the shooting quality of the video clip is scored based on the image content of the video clip and the evaluation of the video clip. The higher the score, the higher the shooting quality of the video clip.
  • the degree of matching between each of the video pits and the corresponding video segment is determined according to the degree of matching between the pit music of the video pit and the video segment.
  • template music is preset in the template, and the pit music of the video pits may be fragments of the template music of the entire template. Since the template includes multiple video pits, the plurality of video pits are sequentially combined into a template, so , for the template music of the template, the template music can be split according to the sequence and duration of the video pits, so as to obtain the pit music corresponding to each video pit.
  • the music itself has a sense of rhythm
  • the segment of the pit music corresponding to each video pit in the template also has a corresponding rhythm
  • the pit music is a soothing rhythm or an explosive rhythm. Therefore, template pits and video clips can be matched according to the rhythm of the pit music and the video content of the video clips.
  • For example, if the music is explosive and there is a powerful burst such as a fountain erupting in the video clip, then the two are a good match.
  • If the music is soothing, it is suitable for slow motion of characters; if it is epic and cinematic, it is suitable for large-scene time-lapse photography.
  • the degree of matching between each of the video pits and the corresponding video clip is obtained by utilizing a pre-trained music matching model, which can output a matching degree score between the pit music of the video pit and the video clip.
  • the matching degree between the pit music and the video clip can be learned by training a neural network, and the trained neural network can then be used as the music matching model to output the matching degree score between the pit music of the video pit and the video clip.
  • the matching degree of the video clips corresponding to two adjacent video pits is determined according to the continuity of the camera movement direction of the video clips, the progression of the shot scale, and match editing.
  • the continuity of the moving direction includes ensuring that the moving directions of the video clips corresponding to two adjacent video pits are in the same direction, and preventing video clips in opposite moving directions from being connected together.
  • the progression of the shot scale includes, for example, going from a long shot to a medium shot to a close shot, from a close shot to a medium shot and then to a long shot, or directly from a long shot to a close shot, and so on.
  • Match editing involves bridging two shots by matching similar motion, graphics, color, etc. to achieve a coherent, fluid narrative for transitions between two video clips.
  • the matching degree of the video clips corresponding to two adjacent video pits is obtained by using a pre-trained clip matching model, which can output the matching degree of the video clips filled into two adjacent video pits.
  • the matching degree between the video clips corresponding to two adjacent video pits can be learned by training a neural network, and the trained neural network can then be used as the clip matching model to output the matching degree score between the video clips corresponding to two adjacent video pits.
  • the determining of the energy value between two adjacent nodes according to the energy value influencing factor of each of the nodes includes: acquiring an evaluation score and a preset weight of the energy value influencing factor; and determining the energy value between two adjacent nodes according to the evaluation score and the preset weight of the energy value influencing factor.
  • the evaluation score of the energy value influencing factor of each node and the corresponding preset weight are obtained, so as to determine the energy value between two adjacent nodes.
  • the preset weights include weights corresponding to different energy value influencing factors, which may be preset according to empirical values.
  • V = a*E_clip + b*E_template + c*E_match, where
  • E_clip represents the score of the shooting quality of the video clip corresponding to the video pit,
  • a represents the preset weight corresponding to the score of the shooting quality of the video clip corresponding to the video pit,
  • E_template represents the matching degree between the video pit and the corresponding video clip,
  • b represents the preset weight corresponding to the matching degree between the video pit and the corresponding video clip,
  • E_match represents the sum of the matching degrees of the video clips corresponding to the two adjacent video pits,
  • and c represents the preset weight corresponding to the matching degree of the video clips corresponding to two adjacent video pits.
  • the energy value of the node is set to 0.
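  • Because every path runs from S through exactly one node per pit column to T, the maximum-energy path can be found with a simple dynamic program over the pit columns, as sketched below; the energy function just combines the three influence factors with the weights a, b, c from the formula above, and the score layout, the adjacent-clip matching placeholder, and the handling of zero-energy nodes are assumptions.

```python
def clip_match(c1, c2):
    """Placeholder for the adjacent-clip matching degree (continuity of camera
    movement, shot-scale progression, match editing); an assumption."""
    return 1.0 if c1["direction"] == c2["direction"] else 0.0

def edge_energy(clip_prev, clip_next, pit_next, a=1.0, b=1.0, c=1.0):
    """Assumed energy for entering node (pit_next, clip_next) from clip_prev:
    V = a*E_clip + b*E_template + c*E_match, using caller-supplied scores."""
    e_clip = clip_next["quality"]                          # shooting quality
    e_template = pit_next["music_match"][clip_next["id"]]  # pit/clip matching degree
    e_match = 0.0 if clip_prev is None else clip_match(clip_prev, clip_next)
    return a * e_clip + b * e_template + c * e_match

def best_assignment(pits, clips, a=1.0, b=1.0, c=1.0):
    """best[i][j] is the maximum total energy of a path that fills pits 0..i
    and puts clip j in pit i (clips may repeat here; enforcing no reuse would
    need a richer state, an assumption left out of this sketch)."""
    n, m = len(pits), len(clips)
    best = [[float("-inf")] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    for j in range(m):
        best[0][j] = edge_energy(None, clips[j], pits[0], a, b, c)
    for i in range(1, n):
        for j in range(m):
            for k in range(m):
                cand = best[i - 1][k] + edge_energy(clips[k], clips[j], pits[i], a, b, c)
                if cand > best[i][j]:
                    best[i][j], back[i][j] = cand, k
    # Trace the optimal path back from the best end node.
    j = max(range(m), key=lambda x: best[n - 1][x])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path  # path[i] is the index of the clip filling pit i
```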
  • the video clips are filled in the corresponding video pits, thereby synthesizing a recommended video, and recommending the recommended video to the user.
  • step S203 specifically includes: determining whether the video duration of the video clip is greater than the duration of the video pit; if the video duration of the video clip is greater than the duration of the video pit, performing fragment extraction on the video clip to obtain a selected fragment.
  • The duration of the video pit determines the maximum duration of the video clip that can be filled into it. Therefore, when the video duration of the video clip is greater than the duration of the video pit, the video clip cannot be directly filled into the corresponding video pit; it is necessary to extract a segment of the corresponding duration from the video clip and fill it into the corresponding video pit.
  • the video duration of the selected segment is less than or equal to the duration of the video pit.
  • In order to fill the selected segment into the corresponding video pit and ensure the integrity of the obtained recommended video, the video duration of the selected segment can be made equal to the duration of the video pit when the selected segment is determined.
  • the performing segment extraction on the video segment to obtain the selected segment includes: performing segment extraction on the video segment according to the video element of the video segment to obtain the selected segment.
  • segment extraction can be performed on the video segment according to the video element of the video segment to obtain the selected segment.
  • the video elements include at least one of smiley face pictures, laughter audio, character movements, clear human voices, picture composition and aesthetic scores.
  • more exciting clips can be extracted from video clips as selected clips according to video elements, for example, clips including smiling faces or clips with high aesthetic scores are selected as selected clips.
  • step S203 is specifically: filling the video clips into the corresponding video pits of the template according to the matching relationship to obtain an initial video; and performing image optimization on the initial video based on the template requirements of the template to obtain the recommended video.
  • After the video clips are filled into the corresponding video pits according to the matching relationship, the initial video is obtained; the initial video can then be image-optimized according to the template requirements, and the image-optimized video is recommended to the user as the recommended video.
  • the template requirements include at least one of transition settings, acceleration and deceleration settings, and texture special effects settings.
• The source of the video material to be processed may be an aerial video shot by a drone. Since the distance between the camera and the photographed object is relatively long during aerial photography, the picture content changes slowly. Therefore, when filling the aerial video into the corresponding video pit, the speed of the picture change can be automatically identified, the playback speed of the aerial video can be automatically adjusted according to the speed of the picture change, and the speed-adjusted aerial video can then be filled into the corresponding video pit.
  • the speed of the picture change can be obtained by analyzing a plurality of consecutive frames within a preset time.
• When determining the matching relationship between the video pits of the template and the video clips, the identified aerial video can be placed in the first several video pits and/or the last several video pits of the template, thereby improving the quality of the recommended video.
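• As a rough sketch of the speed-adjustment idea above, the picture-change speed can be estimated from consecutive frames and a speed-up factor chosen for slow-changing aerial footage; the frame format, the threshold and the factor below are assumptions:

```python
# Rough sketch: estimate the picture-change speed from consecutive grayscale
# frames, then choose a speed-up factor for slow-changing aerial footage.
# Frame decoding, the threshold and the factor are assumptions.

from typing import List
import numpy as np

def picture_change_speed(frames: List[np.ndarray]) -> float:
    """Mean absolute difference between consecutive grayscale frames (uint8 arrays)."""
    diffs = [np.mean(np.abs(frames[i + 1].astype(np.int16) - frames[i].astype(np.int16)))
             for i in range(len(frames) - 1)]
    return float(np.mean(diffs)) if diffs else 0.0

def speed_factor(change_speed: float, slow_threshold: float = 2.0) -> float:
    """Speed up slow-changing aerial footage; keep fast-changing footage as is."""
    return 2.0 if change_speed < slow_threshold else 1.0
```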
• The above-described embodiment provides a video processing method, which uses the video clips and the video pits of the template to construct a flow network graph, then determines the matching relationship between the video clips and the video pits based on the flow network graph, and finally fills the video clips into the corresponding video pits of the template according to the matching relationship to obtain the recommended video.
• Because the matching relationship between video clips and video pits is determined by constructing a flow network graph, the selection of video clips is modeled as a maximum flow problem, which improves the accuracy and convenience of the determined matching relationship between video clips and video pits.
  • FIG. 8 is a flowchart of steps of another video processing method provided by an embodiment of the present application.
  • the video processing method includes steps S301 to S303.
  • the template includes at least one video pit, the video information of the video material to be processed includes the video content of the video material to be processed, and the corresponding template is determined according to the video content of the video material to be processed.
  • the number of templates may be one or multiple.
  • the video material to be processed may include video material from various sources, such as video material captured by a handheld terminal, video material captured by a movable platform, video material obtained from a cloud server, and Video material obtained from the local server, etc.
  • the handheld terminal may be, for example, a mobile phone, a tablet, a motion camera, etc.
  • the movable platform may be, for example, a drone.
  • the UAV can be a rotary-wing UAV, such as a quad-rotor UAV, a hexa-rotor UAV, an octa-rotor UAV, or a fixed-wing UAV.
  • the drone has camera equipment on it.
  • the source of the video material to be processed may be an aerial video shot by a drone
  • the video material to be processed includes an aerial video
• When determining the template according to the video information of the video material to be processed, it is first identified whether the video material to be processed is an aerial photography material; when it is identified that the video material to be processed is an aerial photography material, a template of the aerial photography theme is matched for the to-be-processed video material, thereby completing the determination of the template.
• The template feature of the template and the video feature of the video material to be processed can be extracted, and the template feature and the video feature are matched to determine the template for the video material to be processed.
• The template features may include the template theme, the camera movement direction, the scene type, and the target size and position of a single video frame of the to-be-processed video material that needs to be filled into the template, among others.
• The video features of the video material to be processed may include the subject of the video, the direction of the camera movement, the scene, the target size and position of a single video frame in the video material to be processed, the target size and position of consecutive video frames in the video material to be processed, the similarity of adjacent video frames in the video material to be processed, and so on.
  • a feature matching model may be pre-trained to match template features with video features, and a template may be determined for the video material to be processed according to a matching result output by the feature matching model.
  • step S301 includes step S3011 and step S3012.
  • the video tag extraction is performed on the video material to be processed, and the video tag of the video material to be processed is determined according to the video information of the video material to be processed.
• The video tag includes at least one of the camera movement direction, the scene, the target size and position of a single video frame in the to-be-processed video material, the target size and position of consecutive video frames in the to-be-processed video material, and the similarity of adjacent video frames in the to-be-processed video material.
  • some algorithms for detecting the moving direction of the mirror can be used to determine the change of the moving direction of the mirror in the video material to be processed.
• Scenes differ according to the video content in the video material to be processed: for characters, scenes include the panoramic shot, medium shot, close shot and close-up; for objects, scenes include the long shot and the close shot.
  • the target size and position of a single video frame in the video material to be processed are determined by using an object detection algorithm or a saliency detection algorithm.
  • the target size and position of the continuous video frames in the video material to be processed are determined based on a pre-trained neural network model.
  • S3012. Determine, according to the video tag of the video material to be processed, multiple templates matching the video tag.
  • the template includes a template tag, and the template tag can be preset when setting the template. After the video tag of the video material to be processed is obtained, matching can be performed according to the video tag of the video material to be processed and the template tag, so as to match the template for the video material to be processed.
  • the step of determining a plurality of matching templates according to the video tag includes steps S3012a and S3012b.
  • S3012a Determine a video theme corresponding to the video material to be processed according to the video tag;
  • S3012b Determine a plurality of templates matching the video theme according to the video theme of the video material to be processed.
  • the video theme can be determined by the video tags of a single video frame and/or continuous video frames in the video material to be processed. For example, if the target in the continuous video frame is a tower, the video theme corresponding to the video material to be processed is determined to be travel.
  • the video topic can be a topic category, and can also include subcategories under the topic category.
  • the subject of the video can be travel, food, parent-child, etc.
  • the subject of the video can also be travel-natural scenery, travel-city, travel-humanistic monuments, and so on.
  • Template tags can be template themes. After the subject of the video material to be processed is determined, the subject of the video material to be processed is matched with the template subject, so as to determine a plurality of templates matching the video subject.
• If the video theme corresponding to the video material to be processed cannot be determined, a preset template is selected as the template corresponding to the video material to be processed.
• The preset template may be a universal template, that is, a template that can be applied in various scenarios. The failure to determine the video theme corresponding to the video material to be processed may be because the video theme is not recognized, or because no template corresponds to the video theme of the video material to be processed.
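• A minimal sketch of theme-based template matching with the universal-template fallback described above; the data structures are assumed for illustration:

```python
# Minimal sketch of theme-based template matching with a universal fallback.
# Template and theme representations are assumed for illustration.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Template:
    name: str
    theme: Optional[str]        # e.g. "travel", "food"; None for the universal template

def match_templates(video_theme: Optional[str], templates: List[Template],
                    universal: Template) -> List[Template]:
    if video_theme is None:                          # theme could not be recognised
        return [universal]
    matched = [t for t in templates if t.theme == video_theme]
    return matched if matched else [universal]       # no template corresponds to the theme
```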
• The determining a plurality of templates matching the video theme according to the video theme of the video material to be processed includes: determining, according to the template influence factor of the template, the template corresponding to the video material to be processed from the multiple templates matching the video theme.
  • the templates are screened a second time according to the template influence factor of the template, so as to determine the template corresponding to the video material to be processed.
  • the template corresponding to the to-be-processed video material determined by the secondary screening may be one or multiple.
  • the template influence factor includes at least one of music matching degree, template popularity and user preference.
  • each template is preset with template music, and the template music and the video material to be processed are matched to determine the matching degree score between the template music and the video material to be processed.
• The music matching degree is obtained according to a pre-trained music recommendation network model, which can output matching degree scores between the template music of the multiple templates matching the video theme and the video material to be processed.
  • the template popularity is determined according to the frequency of use and/or the number of likes of multiple templates matching the video theme.
  • the usage frequency and/or the number of likes for each template by all users who use the template can be obtained, and the popularity of the template can be determined according to the selection of the template by all the users who use the template.
  • the user preference is determined according to the user's selection frequency and/or satisfaction score of multiple templates matching the video theme. Wherein, after the user uses multiple times, the user's usage frequency and/or satisfaction score of each template is obtained according to the user's usage habits, thereby determining the user's preference.
• The determining the template corresponding to the to-be-processed video material from the multiple templates matching the video theme according to the template influence factor of the template includes: acquiring the evaluation score and the preset weight of the template influence factor; determining the template scores of the multiple templates matching the video theme according to the evaluation score and the preset weight of the template influence factor; and determining the template corresponding to the to-be-processed video material according to the template scores.
  • the preset weights include weights corresponding to different template influence factors, which may be preset according to empirical values.
  • the template score for each template is calculated as:
• M = A*E_music + B*E_template + C*E_user
• E_music represents the score of the music matching degree
• A represents the preset weight corresponding to the score of the music matching degree
• E_template represents the template popularity
• B represents the preset weight corresponding to the template popularity
• E_user represents the user preference
• C represents the preset weight corresponding to the user preference
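• For illustration, a sketch of scoring candidate templates with M = A*E_music + B*E_template + C*E_user and keeping the highest-scoring one; the weights and scores are hypothetical example values:

```python
# Sketch: M = A*E_music + B*E_template + C*E_user for each candidate template,
# keeping the best-scoring one. Weights and scores are hypothetical values.

def template_score(e_music: float, e_popularity: float, e_user: float,
                   A: float = 0.5, B: float = 0.3, C: float = 0.2) -> float:
    # e_popularity stands for E_template (the template popularity) above.
    return A * e_music + B * e_popularity + C * e_user

candidates = {
    "template_1": template_score(0.9, 0.6, 0.4),
    "template_2": template_score(0.7, 0.9, 0.8),
}
best = max(candidates, key=candidates.get)
print(best, candidates[best])
```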
  • material selection is performed on the video material to be processed.
  • the material to be edited is selected from the to-be-processed video material, so that the template is determined according to the selected material to be edited, and the recommended video is generated.
  • the material selection for the video material to be processed includes: material selection according to material parameters of the video material to be processed; wherein the material parameters include shooting time, shooting location, and shooting targets. at least one.
  • the material parameters may be set by the user.
• For example, the user wants to select, from the to-be-processed video material, the video material shot within the three days from May 1st to May 3rd, or the video material shot between 6:00 pm and 10:00 pm, or the video material whose shooting location is Xishuangbanna, or the video material in which cats are shot.
• When material selection is performed according to the shooting time, the selection can be made, for example, by reading the shooting time recorded by the handheld terminal or the movable platform when the video was shot.
  • the content of the shot video material such as ambient lighting conditions, whether there are lights or billboards, etc., can be used to determine the approximate time period for shooting, and select materials.
• GPS information is recorded when the video is shot, and the video material to be processed can be filtered according to the GPS information recorded at the time of shooting.
  • the user can input the time or location, and select the material according to the time or location input by the user, and the user can also select one or more video materials to be processed.
• The correlation between the to-be-processed video materials is analyzed, and the material is selected according to the analyzed correlation.
• The performing material selection on the video material to be processed includes: clustering the video material to be processed according to the material parameters of the video material to be processed to realize material selection; wherein the clustering includes at least one of time clustering, location clustering, and target object clustering.
• When material selection is performed according to the material parameters of the video material to be processed, the selection may be performed by means of clustering.
  • cluster selection can be performed by at least one clustering method among time clustering, location clustering, and target object clustering, which facilitates the user to quickly select a large number of materials and saves the user's time for selecting materials.
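• As an illustrative sketch of time clustering for material selection, materials whose shooting times fall within a gap threshold can be grouped so that a whole batch is selected at once; the data layout and gap threshold are assumptions:

```python
# Illustrative time clustering: group materials whose shooting times fall within
# a gap threshold so that a whole batch can be selected at once.
# The data layout and the two-hour gap are assumptions.

from datetime import datetime, timedelta
from typing import Dict, List

def cluster_by_time(materials: Dict[str, datetime],
                    gap: timedelta = timedelta(hours=2)) -> List[List[str]]:
    ordered = sorted(materials, key=materials.get)        # material names by shooting time
    clusters: List[List[str]] = []
    for name in ordered:
        if clusters and materials[name] - materials[clusters[-1][-1]] <= gap:
            clusters[-1].append(name)                      # close enough to the last clip
        else:
            clusters.append([name])                        # start a new time cluster
    return clusters
```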
  • the performing material selection on the to-be-processed video material includes: performing material selection on the to-be-processed video material according to a user's selection operation.
  • the material selection can also be performed according to the user-defined selection of the video material to be edited.
  • all the materials to be processed may be presented to the user, and the user can customize the materials to be edited from all the materials to be processed.
• The results of the selection or clustering can be presented to the user, and the user can then further select, from the selection or clustering results, the material that needs to be edited.
  • the video materials may also be selected or clustered according to the user's historical preference for video materials, so as to provide the user with a personalized use experience.
  • the to-be-processed video material may also be segmented according to the video information of the to-be-processed video material to generate multiple video segments.
  • the video segmentation method provided by the above embodiments may be adopted.
  • the image quality of the to-be-processed video material may also be acquired; the to-be-processed video material is discarded according to the image quality of the to-be-processed video material.
  • the image quality includes at least one of image shake, image blur, image overexposure, image underexposure, no clear scene in the image, or no clear subject in the image.
• After the video material to be processed is obtained, quality detection is performed on the video images of the video material to be processed to determine whether at least one of picture shaking, picture blurring, over-exposure, under-exposure, no clear scene in the picture, or no clear subject in the picture occurs; if so, that part is regarded as discarded footage, and these to-be-processed video materials are removed.
  • the video clips in the video material to be processed may also be discarded according to the image quality of the video clips.
  • de-duplication processing may also be performed on the to-be-processed video material.
  • the deduplication process includes clustering of similar materials.
• Clustering similar materials classifies similar materials into one category; from each category, the material with the longest video duration can be selected, or the material with the best image quality, such as a clear picture and high color saturation, can be selected.
  • deduplication processing may also be performed on the video clips.
  • the video clip is a clip in the video material to be processed. After the template corresponding to the video material to be processed is determined, since the template includes at least one video pit, the video clip corresponding to the video pit can be determined according to the pit information of the video pit in the template, so as to obtain the corresponding template matching relationship.
  • the pit information includes at least one of pit music and pit tags.
  • the pit tag may be preset, and each video pit may be preset with a pit tag.
• The pit music of a video pit can be a fragment of the template music of the entire template. Since the template includes a plurality of video pits that are sequentially combined into the template, the template music of the template can be split according to the sequence and duration of the video pits to obtain the pit music corresponding to each video pit.
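• A small sketch of splitting the template music into per-pit music segments according to the order and duration of the video pits (durations in seconds are assumed):

```python
# Sketch: split the template music into per-pit segments according to the order
# and duration of the video pits (durations in seconds are assumed).

from typing import List, Tuple

def split_pit_music(pit_durations: List[float]) -> List[Tuple[float, float]]:
    """Return (start, end) offsets into the template music for each video pit."""
    segments, cursor = [], 0.0
    for duration in pit_durations:
        segments.append((cursor, cursor + duration))
        cursor += duration
    return segments

# A template with pits of 3 s, 5 s and 2 s.
print(split_pit_music([3.0, 5.0, 2.0]))   # [(0.0, 3.0), (3.0, 8.0), (8.0, 10.0)]
```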
• The determining the video segment corresponding to the video pit according to the pit information of the video pit in the template includes: determining the video segment corresponding to the video pit according to the pit music of the video pit in the template.
• The pit music of a video pit itself has a sense of rhythm; for example, the pit music may have a soothing rhythm or an explosive rhythm, so an appropriate video clip can be matched to the video pit according to the rhythm of the pit music and the video content of the video clip.
• For example, if the pit music is explosive and there is a powerful burst such as a fountain in the video clip, the two are a good match.
• If the pit music has a soothing rhythm, it is suitable for slow motion of characters; if it has an epic, blockbuster feel, it is suitable for large-scene time-lapse photography.
• The determining the video segment corresponding to the video pit according to the pit music of the video pit in the template includes: determining the matching degree between the pit music of the video pit in the template and the video segment; and determining the video segment corresponding to the video pit in the template according to the matching degree.
• A matching degree threshold can be set: when the matching degree exceeds the threshold, it is determined that the video pit matches the video segment; alternatively, the video segment with the highest matching degree with the video pit can be selected, according to the matching degree, as the video segment corresponding to the video pit.
• The matching degree between the pit music of the video pit in the template and the video segment is obtained by using a pre-trained music matching model, and the music matching model can output a matching score between the pit music of the video pit in the template and the video segment.
• After the video segments corresponding to the video pits are determined according to the pit information of the video pits in the template, the shooting quality of the multiple video segments corresponding to a video pit in the template is determined; the optimal video segment corresponding to the video pit in the template is determined according to the shooting quality of the multiple video segments; and the matching relationship corresponding to the template is obtained according to the optimal video segment corresponding to the video pit in the template.
  • the shooting quality of the video clip is determined according to the image content of the video clip and the evaluation of the video clip.
  • the image content includes lens stability, color saturation, whether there is a main subject and the amount of information in the lens, etc.
  • the video clip evaluation includes the aesthetic score of the video clip.
  • the aesthetic scoring of the video clip may take into account factors such as color, composition, mirror movement, and scene recognition to perform the aesthetic score for the video clip.
  • the shooting quality of the video clip is scored based on the image content of the video clip and the evaluation of the video clip. The higher the score, the higher the shooting quality of the video clip.
  • the video clip with the highest shooting quality is selected from the video clips as the optimal video clip corresponding to the video pit in the template, so as to obtain the matching relationship corresponding to the template.
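• For illustration, a non-authoritative sketch of keeping, for each pit, the candidate clip with the highest shooting-quality score; the quality formula and its weights are stand-ins:

```python
# Non-authoritative sketch: when several clips match the same pit, keep the one
# with the highest shooting-quality score. The quality formula and weights are
# stand-ins for the image-content and aesthetic evaluation described above.

from typing import Dict, List

def shooting_quality(stability: float, saturation: float, aesthetic: float) -> float:
    return 0.4 * stability + 0.2 * saturation + 0.4 * aesthetic   # assumed weights

def optimal_clip_per_pit(candidates: Dict[str, List[str]],
                         quality: Dict[str, float]) -> Dict[str, str]:
    """candidates maps each pit to its matching clips; quality maps clip -> score."""
    return {pit: max(clips, key=lambda c: quality[c])
            for pit, clips in candidates.items() if clips}

quality = {"clip_1": shooting_quality(0.9, 0.7, 0.8), "clip_2": shooting_quality(0.6, 0.9, 0.5)}
print(optimal_clip_per_pit({"pit_1": ["clip_1", "clip_2"]}, quality))   # {'pit_1': 'clip_1'}
```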
• After the video segment corresponding to the video pit is determined according to the pit information of the video pit in the template, the matching degree between the video segments corresponding to two adjacent video pits in the template is determined; the optimal video segment corresponding to the video pit is determined according to the matching degree; and the matching relationship corresponding to the template is obtained according to the optimal video segment corresponding to the video pit.
• The matching degree of the video segments corresponding to two adjacent video pits is determined according to the continuity of the camera movement direction of the video segments, the increasing or decreasing relationship of the scene, and match editing.
  • the continuity of the moving direction includes ensuring that the moving directions of the video clips corresponding to two adjacent video pits are in the same direction, and preventing video clips in opposite moving directions from being connected together.
  • the increasing and decreasing relationship of the scene includes, for example, from a long shot to a medium shot to a close shot, or from a close shot to a medium shot and then to a long shot, or directly from a long shot to a close shot, and so on.
  • Match editing involves bridging two shots by matching similar motion, graphics, color, etc. to achieve a coherent, fluid narrative for transitions between two video clips.
• The determining the video segment corresponding to the video pit according to the pit information of the video pit in the template includes: determining the video segment corresponding to the video pit according to the pit label of the video pit in the template.
• Each video pit in the template is provided with a pit tag, and tag matching can be performed between the pit tag of the video pit in the template and the tag of the video segment, so that the successfully matched video segment is used as the video segment corresponding to the video pit.
• The determining the video segment corresponding to the video pit according to the pit tag of the video pit in the template includes: determining the video tag of the video segment, matching the pit tag of the video pit with the video tag, and using the video segment whose video tag matches the pit tag as the video segment to be filled into the video pit.
  • the tag extraction is performed on the video segment to determine the video tag of the video segment, and then the video segment corresponding to the video pit is determined according to the pit tag of the video pit of the template.
• The video segments are respectively filled into the corresponding video pits of the template, and video synthesis is performed to obtain a recommended video, which is then recommended to the user.
  • the step of filling the video segment into the corresponding video pit specifically includes steps S3031 to S3032.
• The duration of the video pit determines the maximum duration of the video clip that can be filled into the video pit. Therefore, when the video duration of the video clip is greater than the duration of the video pit, the video clip cannot be directly filled into the corresponding video pit; a segment of the corresponding duration needs to be extracted from the video clip and filled into the corresponding video pit.
• The video duration of the selected segment is less than or equal to the duration of the video pit.
• In order to fill the selected segment into the corresponding video pit and ensure the integrity of the obtained recommended video, the video duration of the selected segment, when the selected segment is determined, can be made equal to the duration of the video pit.
  • the performing segment extraction on the video segment to obtain the selected segment includes: performing segment extraction on the video segment according to the video element of the video segment to obtain the selected segment.
  • segment extraction can be performed on the video segment according to the video element of the video segment to obtain the selected segment.
  • the video elements include at least one of smiley face pictures, laughter audio, character movements, clear human voices, picture composition and aesthetic scores.
  • more exciting clips can be extracted from video clips as selected clips according to video elements, for example, clips including smiling faces or clips with high aesthetic scores are selected as selected clips.
• Step S303 includes: filling the video segments into the corresponding video pits of the template according to the matching relationship corresponding to the template to obtain an initial video; and performing image optimization on the initial video based on the template requirements of the template to obtain the recommended video.
• After the video segments are filled into the corresponding video pits according to the matching relationship corresponding to the template, the initial video is obtained; the initial video can then be image-optimized according to the template requirements, and the image-optimized video is recommended to the user as the recommended video.
  • the template requirements include at least one of transition settings, acceleration and deceleration settings, and texture special effects settings.
• The source of the video material to be processed may be an aerial video captured by a drone. Since the distance between the camera and the captured object is relatively long during aerial photography, the picture content changes slowly. Therefore, when filling the aerial video into the corresponding video pit, the speed of the picture change can be automatically identified, the playback speed of the aerial video can be automatically adjusted according to the speed of the picture change, and the speed-adjusted aerial video can then be filled into the corresponding video pit.
  • the speed of the picture change can be obtained by analyzing a plurality of consecutive frames within a preset time.
• When determining the matching relationship between the video pits of the template and the video clips, the identified aerial video can be placed in the first several video pits and/or the last several video pits of the template, thereby improving the quality of the recommended video.
• In the above-described embodiment, a template is determined according to the video information of the video material to be processed, the video segment corresponding to the video pit is then determined according to the pit information of the video pit so as to obtain the matching relationship corresponding to the template, and finally the video segments are filled into the corresponding video pits of the template according to the matching relationship corresponding to the template to obtain the recommended video.
  • FIG. 13 is a flowchart of steps of another video processing method provided by an embodiment of the present application.
  • the video processing method includes steps S401 to S404.
  • a plurality of templates are acquired, the templates are used to synthesize the video material to be processed to obtain a recommended video, and each template includes at least one video pit.
• The multiple templates can be obtained arbitrarily from a preset template library, can be obtained according to the user's selection of templates in the template library, or can be obtained according to the templates frequently used by the user when synthesizing videos in the past.
  • the material to be processed is segmented to generate multiple video segments.
  • the material to be processed is divided to generate multiple video clips, and the generated video clips are used to fill in the video pits in the template to synthesize recommended videos.
  • the dividing the material to be processed to generate multiple video segments includes: dividing the video material to be processed to generate multiple video segments according to the video information of the video material to be processed.
  • the video segmentation method provided by the above embodiments may be adopted.
  • the video segment is a segment of the video material to be processed.
  • the video segment may be a video segment obtained by dividing the video material to be processed.
• Video clips are matched for the video pits in each template, so as to obtain the matching relationship between the video pits and the video clips of that template.
  • the matching relationship is used as the matching relationship corresponding to the template, and the matching score of the matching relationship is calculated.
  • the step of matching video clips for video pits to obtain a matching relationship specifically includes steps S4021 and S4022.
  • S4021 Construct multiple stream network graphs according to the video clips and the video pits of each template.
  • the flow network graph includes a plurality of nodes, and each of the nodes corresponds to a matching relationship between one of the video clips and one of the video pits.
  • a stream network diagram is constructed according to the video pits and video clips of each template, wherein the left vertical axis C m in the stream network diagram represents the video clips, and the upper horizontal axis Sn represents the template video pits.
  • Each template determines the matching relationship between the respective video pits and video segments based on the respective stream network graph.
• The determining the matching relationship between the video pits of each of the templates and the video clips based on the plurality of flow network graphs includes: matching suitable video clips for the video pits of each of the templates based on a maximum flow algorithm to obtain an optimal path; wherein the correspondence between the video clips and the video pits in the optimal path is taken as the matching relationship between the video pits of each of the templates and the video clips.
• The determining the matching score of the matching relationship between the video pits of each template and the video clips includes: determining the matching score of the matching relationship between the video pits of each template and the video clips according to the energy value between every two adjacent nodes in the optimal path.
• The energy values between adjacent nodes in the optimal path are added, and the sum of the energy values along the optimal path is used as the matching score of the matching relationship.
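• A hedged sketch of the matching idea above: nodes correspond to (pit, clip) pairs, adjacent pits are connected by edges carrying the energy value V, and the best path through the pit sequence yields both the matching relationship and its matching score (the sum of edge energies). For brevity this sketch uses a simple dynamic program over the pit sequence instead of a general maximum-flow solver, and it does not enforce that each clip is used only once:

```python
# Hedged sketch: nodes are (pit, clip) pairs, adjacent pits are linked by edges
# carrying the energy value V, and the best path through the pit sequence gives
# the matching relationship and its matching score (sum of edge energies).
# A simple dynamic program stands in for a general maximum-flow solver, and it
# does not enforce that each clip is used only once.

from typing import Callable, Dict, List, Tuple

def best_matching(pits: List[str], clips: List[str],
                  energy: Callable[[str, str, str, str], float]
                  ) -> Tuple[Dict[str, str], float]:
    """energy(pit_a, clip_a, pit_b, clip_b) -> V for two adjacent nodes."""
    # best[c] = (score of the best path ending with clip c in the current pit, assignment)
    best = {c: (0.0, {pits[0]: c}) for c in clips}
    for prev_pit, pit in zip(pits, pits[1:]):
        new_best = {}
        for c in clips:
            score, assign = max(
                ((s + energy(prev_pit, pc, pit, c), {**a, pit: c})
                 for pc, (s, a) in best.items()),
                key=lambda t: t[0])
            new_best[c] = (score, assign)
        best = new_best
    score, assign = max(best.values(), key=lambda t: t[0])
    return assign, score       # matching relationship and its matching score

pits, clips = ["pit_1", "pit_2", "pit_3"], ["c1", "c2"]
toy_energy = lambda pa, ca, pb, cb: 1.0 if ca != cb else 0.2   # toy edge energies
print(best_matching(pits, clips, toy_energy))                  # alternating clips score 2.0
```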
• The matching of video clips for the video pits of each of the templates to obtain the matching relationship corresponding to each of the templates includes: classifying the video clips according to the pit labels of the video pits of the template or the template tag of the template to obtain classified video clips; and determining the matching relationship corresponding to the template according to the classified video clips.
• Each video clip has a corresponding video tag, and the pit tag of each video pit of the template is matched with the video tags corresponding to the video clips, so that the video clips are classified into multiple categories according to the degree of matching between the video tags and the pit tags.
  • the template tag of the template can also be matched with the video tag corresponding to the video clip, so as to classify the video clip according to the matching degree between the video tag and the template tag, and divide the video clip into multiple categories.
  • the video segment corresponding to the video pit is determined according to the category of the video segment, and the matching relationship corresponding to the template is obtained.
• The classifying the video clips according to the pit tag of the video pit of the template or the template tag of the template includes: classifying a plurality of the video clips according to the pit tag of the video pit or the template tag of the template, to obtain video clips of multiple grade categories.
• The video clips of the multiple grade categories at least include video clips of a first category, video clips of a second category and video clips of a third category, wherein the highlight rating of the video clips of the first category is greater than the highlight rating of the video clips of the second category, and the highlight rating of the video clips of the second category is greater than the highlight rating of the video clips of the third category.
• When classifying the video clips, the video clips may be classified by highlight level. Therefore, when matching video clips for video pits, the most exciting video clips can be selected as the video clips corresponding to the video pits to obtain the matching relationship of the template.
  • the highlight level is determined according to the picture content and audio content of the video clip.
  • the content of the picture includes the composition of the picture, whether there is a clear subject, the direction of the mirror movement, and the scene.
  • Audio content includes clear vocals, laughter, cheers, and more.
• The determining the matching relationship between the video pits of each template and the video clips according to the classified video clips includes: sorting the classified video clips according to the pit tags of the video pits or the template tag of the template; and determining the matching relationship between the video pits of each template and the video clips according to the sorting result.
  • the classified video clips are sorted according to the pit labels of the video pits and the classification level of the video clips.
• For each pit tag, the video clips can be sorted from high to low according to their classification level.
  • the video segment with the highest ranking is selected as the video segment to be filled in the video pit, so as to obtain the matching relationship between the video pit of the template and the video segment.
  • the video tags of video clips and their classification levels are A1, A2, A3, B1, B3, C1, and C2.
  • the classified video clips are A-label categories: A1, A2, A3, B-label categories: B1, B3, C-label categories: C1, C2.
• For the video segment to be filled into a video pit: if the pit label of the video pit is A, video segment A1 is selected; if the pit label is B, video segment B1 is selected; and if the pit label is C, video segment C1 is selected.
  • the classified video clips are sorted according to the template label of the template and the classification level of the video clips.
  • it can be sorted from high to low according to the classification level of the video clips.
  • the video segment with the highest ranking is selected as the video segment to be filled in the video pit, so as to obtain the matching relationship between the video pit of the template and the video segment.
  • the determining the matching relationship between the video pits of each template and the video fragments according to the classified video fragments includes: according to the sorting result, the video pits of the template are Allocating video clips, and determining the matching relationship between the video pits of each template and the video clips.
  • the video segment with the highest ranking is selected as the video segment to be filled in the video pit, so as to obtain the matching relationship between the video pit of the template and the video segment.
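• A sketch of the tag-based classification and sorting just described: clips are grouped by tag letter, sorted by highlight level (level 1 taken as most exciting, an assumed convention), and each pit takes the top clip of its own tag:

```python
# Sketch: group clips by tag letter, sort by highlight level (1 = most exciting,
# an assumed convention), and give each pit the top clip of its own tag.

from collections import defaultdict
from typing import Dict, List

def assign_clips_to_pits(clip_labels: List[str], pit_tags: List[str]) -> Dict[str, str]:
    grouped: Dict[str, List[str]] = defaultdict(list)
    for label in clip_labels:                             # e.g. "A1" -> tag "A", level 1
        grouped[label[0]].append(label)
    for tag in grouped:
        grouped[tag].sort(key=lambda l: int(l[1:]))       # most exciting first
    return {pit: grouped[pit][0] for pit in pit_tags if grouped.get(pit)}

print(assign_clips_to_pits(["A1", "A2", "A3", "B1", "B3", "C1", "C2"], ["A", "B", "C"]))
# {'A': 'A1', 'B': 'B1', 'C': 'C1'}
```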
  • a recommended template may be determined from the obtained templates, where the recommended template is a template used for synthesizing a recommended video.
  • the step of determining the recommended template according to the matching score specifically includes steps S4031 to S4033.
  • a preset number of templates is determined from a plurality of templates according to the matching score, wherein the number of the preset number of templates may be set according to an empirical value.
  • a preset number of templates may be selected from a plurality of templates according to the matching score from high to low.
  • the target number is the number of recommended templates recommended to the user.
  • a target number is selected from the preset number of templates to be combined to obtain a plurality of template groups, and the number of templates included in each template group is the target number.
• A target number of templates can be selected by way of combinations to obtain the template groups.
• When the preset number is n and the target number is k, the number of template groups that can be selected is C(n, k) = n!/(k!(n-k)!).
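• The count follows from the standard binomial coefficient, for example:

```python
# Number of distinct template groups when choosing k templates out of n.
import math
print(math.comb(5, 3))   # 10 template groups when the preset number is 5 and the target number is 3
```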
  • S4033 Determine a recommended template group from a plurality of template groups according to the template types of the target number of templates in the template group, and use the target number of templates in the recommended template group as a recommended template.
  • the respective template types of the target number of templates included in each template group are respectively determined, and a recommended template group is determined from the plurality of template groups according to the template types.
  • the templates recommended to users include a variety of styles for users to choose from.
  • step S4033 includes step S4033a and step S4033b.
  • S4033a Obtain template types of a target number of templates in multiple template groups, and determine combined scores corresponding to multiple template groups according to the template types and the matching scores.
• For each template group, the template types of the target number of templates in the template group are obtained, the template richness score of the template group is calculated according to the template types, and the respective matching scores of the target number of templates in the template group are obtained.
  • the combined score corresponding to each template group is determined according to the matching score and richness score.
• The determining, according to the template type and the matching score, the combined scores corresponding to the multiple template groups includes: determining, according to the template types, the template richness among the target number of templates in each of the multiple template groups; and determining the combined scores of the multiple template groups according to the template richness among the target number of templates in each template group and the sum of the matching scores of the target number of templates in the template group.
• E1 represents the template richness in the template group
• a_i represents the template type of the i-th template
• a_j represents the template type of the j-th template
• f(a_i, a_j) indicates whether the template type of the i-th template is the same as the template type of the j-th template
• When the template type of the i-th template and the template type of the j-th template are the same, the value of f(a_i, a_j) is 0; when they are different, the value of f(a_i, a_j) is 1.
• The combined score of the template group can be determined according to the template richness of the template group and the matching score of each template in the template group, for example as E = a*E1 + E2
• E is the combined score of the template group
• E1 is the template richness of the template group
• a is the preset weight of the template richness
• E2 is the sum of the matching scores of the templates in the template group.
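• A hedged sketch of the combined score of a template group: the pairwise richness f is summed over template pairs to give E1, and the (weighted) richness is added to the sum of matching scores E2; the exact form of the sum and the weight value are assumptions consistent with the definitions above:

```python
# Hedged sketch of the combined score E of a template group: pairwise richness
# f(a_i, a_j) is 0 for identical template types and 1 for different types, E1
# sums these pairwise values, and E = a*E1 + E2 with E2 the sum of the matching
# scores. The pairwise sum and the weight value are assumptions consistent with
# the definitions above.

from itertools import combinations
from typing import List, Tuple

def combined_score(templates: List[Tuple[str, float]], a: float = 1.0) -> float:
    """templates: list of (template_type, matching_score) for one template group."""
    e1 = sum(0 if t1 == t2 else 1
             for (t1, _), (t2, _) in combinations(templates, 2))   # template richness E1
    e2 = sum(score for _, score in templates)                      # sum of matching scores E2
    return a * e1 + e2

print(combined_score([("A", 99), ("B", 97), ("C", 99)]))   # richness 3 + scores 295 = 298
```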
  • S4033b Determine a recommended template group from a plurality of the template groups according to the combination score.
  • a recommended template group can be selected from multiple template groups according to the combined score of each template group, and the template in the recommended template group is the recommended template recommended to the user.
  • a template group with the highest combined score may be selected from a plurality of template groups as the recommended template group according to the combination score.
• The recommended videos include a plurality of recommended videos, and the plurality of recommended videos are obtained according to the target number of templates in the recommended template group.
  • a plurality of the recommended videos are recommended to the user for selection by the user.
• The video clips are filled into the video pits of the templates, thereby generating multiple recommended videos, and the multiple recommended videos are recommended to the user so that the user can select the final video to use.
  • step S403 includes step S4031' and step S4032'.
  • S4031' Determine a preset number of templates from the multiple templates according to the matching score to form a template group.
  • a preset number of templates are determined from the multiple templates, and the preset number of templates are formed into a template group.
  • a preset number of templates may be selected from high to low according to their respective matching scores to form a template group. For example, according to the matching scores of the templates, five templates may be selected as a group from high to low to form a template group.
  • S4032' Determine the template type of the template in the template group, and determine a recommended template from the template group according to the template type.
  • the step of determining the recommended template according to the template type specifically includes steps S4032'a to S4032'c.
  • S4032'a Determine whether the number of template types is greater than a preset type threshold.
  • the preset type threshold may be preset.
• The recommended template is determined according to the template type and the matching scores of templates of the same template type, that is, the template with the highest matching score is selected from the templates of each template type as a recommended template.
• The determining the recommended template according to the template type and the matching scores of templates of the same template type includes: classifying the templates in the template group according to the template type to obtain templates of multiple types; determining a plurality of type-optimal templates according to the matching scores, where a type-optimal template is the template with the highest matching score in each template type; and selecting the template with the highest matching score from the multiple type-optimal templates as a recommended template.
  • a template with the highest matching score is selected from the templates corresponding to the template type as the optimal template under the template type, that is, the optimal template of the type.
  • the template with the highest matching score is selected from the multiple type-optimal templates as the recommended template.
  • FIG. 19 is a schematic diagram of selecting a recommended template from a template group.
  • A, B, C, and D are four template types in the template group, and the preset type threshold is 3, that is, three types of templates need to be selected from the template group and recommended to the user.
• There are n templates from A1 to An under the A template type, and the matching scores of these n templates are respectively A1 - 99 points, A2 - 98 points, A3 - 96 points, and so on.
• There are m templates from B1 to Bm under the B template type, and the matching scores of these m templates are respectively B1 - 97 points, B2 - 94 points, B3 - 89 points, and so on.
• There are y templates from D1 to Dy under the D template type, and the matching scores of these y templates are respectively D1 - 95 points, D2 - 94 points, D3 - 93 points, and so on.
• For each template type, the type-optimal template is selected according to the matching scores of the templates: the type-optimal template of the A template type is A1, the type-optimal template of the B template type is B1, the type-optimal template of the C template type is C1, and the type-optimal template of the D template type is D1.
• The matching scores of template A1, template B1, template C1 and template D1 are then compared: A1 - 99 points, B1 - 97 points, C1 - 99 points and D1 - 95 points. According to the matching scores, three templates are chosen from template A1, template B1, template C1 and template D1, namely template A1, template B1 and template C1, and template A1, template B1 and template C1 are recommended to the user as the recommended templates.
• If the number of template types is less than the preset type threshold, the template types in the template group cannot meet the requirement of the preset type threshold. In this case, the template with the highest matching score can be directly selected from the templates of each template type in the template group as a recommended template.
• For example, if the template group only includes template types A and B, template A1 and template B1 can be used as the recommended templates at this time.
• If the number of template types is equal to the preset type threshold, the template types in the template group exactly meet the requirement of the preset type threshold. In this case, the template with the highest matching score can likewise be directly selected from the templates of each template type in the template group as a recommended template.
  • template A 1 , template B 1 and template C 1 may be used as recommended templates.
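• For illustration, a sketch of the per-type selection walked through above; data shapes are assumed:

```python
# Sketch of the per-type selection illustrated above: keep the best template of
# each type, then keep the highest-scoring type-optimal templates when there are
# more types than the preset type threshold. Data shapes are assumed.

from typing import Dict, List, Tuple

def recommend_by_type(group: List[Tuple[str, str, float]], type_threshold: int) -> List[str]:
    """group: list of (template_name, template_type, matching_score)."""
    best_per_type: Dict[str, Tuple[str, float]] = {}
    for name, ttype, score in group:
        if ttype not in best_per_type or score > best_per_type[ttype][1]:
            best_per_type[ttype] = (name, score)
    type_optimal = sorted(best_per_type.values(), key=lambda t: t[1], reverse=True)
    if len(type_optimal) > type_threshold:
        type_optimal = type_optimal[:type_threshold]
    return [name for name, _ in type_optimal]

group = [("A1", "A", 99), ("B1", "B", 97), ("C1", "C", 99), ("D1", "D", 95)]
print(recommend_by_type(group, 3))   # ['A1', 'C1', 'B1'] -- the same set as in the example above
```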
  • step S403 specifically includes acquiring template types of the multiple templates; and determining a recommended template according to the template types and matching scores of the multiple templates.
  • the template type of each template is obtained, so as to determine the recommended template according to the template type and matching score of the multiple templates, so that the recommended templates of multiple template types can be recommended to the user.
• The determining a recommended template according to the template types and matching scores of the multiple templates includes: dividing the multiple templates into multiple type template groups according to the template types, each type template group including at least one of the templates; determining, according to the matching scores of the templates, templates that meet the required number of types from the multiple type template groups; and selecting templates from the remaining templates of the multiple templates according to the matching scores of the templates, so that the number of selected templates meets the required number of templates.
• The first threshold number of templates of different types obtained in this way are used as first recommended templates, where a first recommended template is the template with the highest matching score within its template type, and the first threshold is the required number of types. Second recommended templates are then obtained according to the matching scores, where the sum of the number of first recommended templates and the number of second recommended templates is a second threshold, and the second threshold is the required number of templates.
  • the multiple templates are classified according to the template type of each template in the multiple templates to obtain multiple type template groups, each type template group corresponds to a template type, and each type template group includes at least one template.
  • the template that meets the required quantity of the type is selected from the type template group, and then the template is selected from the remaining templates according to the matching score of the remaining template until the number of selected templates meets the required quantity of the template.
• Each type template group corresponds to a template type.
• The highest matching score of the templates under the A template type is A1 - 99 points
• The highest matching score of the templates under the B template type is B1 - 97 points
• The highest matching score of the templates under the C template type is C1 - 99 points
• The highest matching score of the templates under the D template type is D1 - 95 points.
• Three templates with the highest matching scores are selected from the four template types, namely template A1, template B1 and template C1, so that the three selected templates meet the type requirement.
• At this time the number of selected templates is 3 and the required number of templates is 5; therefore, according to the matching scores of the remaining 21 templates, 2 more templates can be selected so that the number of selected templates meets the required number of templates.
  • the remaining 21 templates include templates A 2 to A 6 , B 2 to B 6 , C 2 to C 6 and D 1 to D 6 .
• If template A2 and template A3 have the highest matching scores among the remaining templates, template A2 and template A3 are selected.
• At this time the number of selected templates meets the required number of templates, and the selected templates that meet the required number of templates are used as the recommended templates, that is, the five templates of template A1, template A2, template A3, template B1 and template C1 are used as the recommended templates.
  • the diversity of templates may be considered, and a template with a different template type from the selected template may be selected.
• The determining the recommended template according to the template types and matching scores of the multiple templates includes: sequentially selecting templates from the multiple templates according to the matching scores of the templates until the types of the selected templates meet the required number of types; and selecting templates from the remaining templates of the multiple templates according to the matching scores of the templates until the number of selected templates meets the required number of templates.
  • Templates are selected from multiple templates according to their matching scores, and the selected template type is determined, so that each time the selected template type is different, until the selected template type meets the required number of types. Then, from the remaining templates, templates are selected according to the matching scores of the remaining templates, until the number of selected templates meets the required number of templates.
• Since template A1 and template C1 both have the highest matching score, one template can be selected from template A1 and template C1 as the template selected for the first time.
  • the template selected for the first time is template A 1
  • the corresponding template type is A.
  • the template with the highest matching score is selected from the remaining 23 templates except template A 1 , and the template selected for the second time is template C 1 , and the corresponding template type is C.
  • the template with the highest matching score is continued to be selected from the remaining 22 templates except template A 1 and template C 1.
• The template with the highest matching score among them is template A2, and its corresponding template type is A, which is the same as the type of the already selected template A1.
• Therefore, among the remaining 22 templates other than template A1 and template C1, the template whose matching score differs least from that of template A2 is selected, which is template B1, and the corresponding template type is B. Since the type of template B1 is different from those of the selected template A1 and template C1, template B1 is used as the template selected for the third time. The templates selected at this time are template A1, template C1 and template B1, covering three template types, which meets the required number of types.
  • the template with the highest matching score is selected from the remaining 21 templates except template A 1 , template C 1 and template B 1 , and the selected template is template A 2 .
  • the number of templates is four, which does not meet the required number of templates.
  • the template with the highest matching score is selected from the remaining 20 templates, and the selected template is template A 3 .
  • the number of selected templates is five, which meets the required number of templates.
• The selected templates that meet the required number of templates are used as the recommended templates, that is, the five templates of template A1, template A2, template A3, template B1 and template C1 are used as the recommended templates.
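• A sketch of the two-pass selection just walked through: first pick templates of distinct types in descending score order until the type requirement is met, then fill up purely by matching score until the template requirement is met (the closest-score tie-breaking in the example is not modeled); inputs are assumed:

```python
# Sketch of the two-pass selection: distinct types first (by score), then top up
# purely by matching score until the required number of templates is reached.
# The closest-score tie-breaking used in the example above is not modeled here.

from typing import List, Tuple

def select_recommended(templates: List[Tuple[str, str, float]],
                       type_needed: int, total_needed: int) -> List[str]:
    """templates: (name, template_type, matching_score), e.g. ('A1', 'A', 99)."""
    by_score = sorted(templates, key=lambda t: t[2], reverse=True)
    chosen, chosen_types = [], set()
    for name, ttype, _ in by_score:                 # pass 1: one template per new type
        if len(chosen_types) >= type_needed:
            break
        if ttype not in chosen_types:
            chosen.append(name)
            chosen_types.add(ttype)
    for name, _, _ in by_score:                     # pass 2: fill up by matching score
        if len(chosen) >= total_needed:
            break
        if name not in chosen:
            chosen.append(name)
    return chosen
```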
  • the video clips are filled in the corresponding video pits to obtain the recommended video.
• Step S404 specifically includes: determining whether the video duration of the video clip is greater than the duration of the video pit; and if the video duration of the video clip is greater than the duration of the video pit, performing segment extraction on the video clip to obtain a selected segment.
• The duration of the video pit determines the maximum duration of the video clip that can be filled into the video pit. Therefore, when the video duration of the video clip is greater than the duration of the video pit, the video clip cannot be directly filled into the corresponding video pit; a segment of the corresponding duration needs to be extracted from the video clip and filled into the corresponding video pit.
• The video duration of the selected segment is less than or equal to the duration of the video pit.
• In order to fill the selected segment into the corresponding video pit and ensure the integrity of the obtained recommended video, the video duration of the selected segment, when the selected segment is determined, can be made equal to the duration of the video pit.
  • the performing segment extraction on the video segment to obtain the selected segment includes: performing segment extraction on the video segment according to the video element of the video segment to obtain the selected segment.
  • segment extraction can be performed on the video segment according to the video element of the video segment to obtain the selected segment.
  • the video elements include at least one of smiley face pictures, laughter audio, character movements, clear human voices, picture composition and aesthetic scores.
  • more exciting clips can be extracted from video clips as selected clips according to video elements, for example, clips including smiling faces or clips with high aesthetic scores are selected as selected clips.
  • step S404 specifically includes: filling the video clip into the corresponding video pit of the recommended template according to the matching relationship corresponding to the recommended template to obtain an initial video; and performing image optimization on the initial video based on the template requirements of the recommended template to obtain the recommended video.
  • after the video clips are filled into the corresponding video pits according to the matching relationship corresponding to the recommended template, the initial video is obtained; the initial video can then be image-optimized according to the template requirements, and the image-optimized video is recommended to the user as the recommended video. A control-flow sketch of this step is given below.
  • the template requirements include at least one of transition settings, acceleration and deceleration settings, and texture special effects settings.
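The filling and image-optimization step can be organised as the control-flow sketch below; the data layout and the helpers apply_transition, apply_speed and apply_effect are hypothetical placeholders for actual rendering operations, not part of the original disclosure.

```python
# Control-flow sketch only: fill the pits according to the matching relationship, then
# apply the template requirements (transitions, acceleration/deceleration, texture
# special effects). The three apply_* helpers are hypothetical placeholders.

def synthesize_recommended_video(template, matching, clips,
                                 apply_transition, apply_speed, apply_effect):
    # 1. Fill each video pit with the clip assigned to it by the matching relationship.
    initial_video = [clips[matching[pit["id"]]] for pit in template["pits"]]

    # 2. Image-optimize the initial video based on the template requirements.
    req = template.get("requirements", {})
    video = initial_video
    if "transitions" in req:
        video = apply_transition(video, req["transitions"])
    if "speed" in req:
        video = apply_speed(video, req["speed"])
    if "effects" in req:
        video = apply_effect(video, req["effects"])
    return video
```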
  • the video material to be processed may come from an aerial video shot by a drone; because the distance between the camera and the photographed object is relatively long during aerial photography, the picture content changes slowly. Therefore, when filling the aerial video into the corresponding video pit, the speed of the picture change can be automatically identified, the playback speed of the aerial video can be automatically adjusted according to the speed of the picture change, and the speed-adjusted aerial video can then be filled into the corresponding video pit, as sketched below.
  • the speed of the picture change can be obtained by analyzing a plurality of consecutive frames within a preset time.
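A simple way to realise this, offered as an assumption rather than the patent's prescribed measure, is to take the mean absolute difference between consecutive greyscale frames within the preset time as the picture-change speed and map it to a speed-up factor; the thresholds and factors below are illustrative only.

```python
import numpy as np

# Hedged sketch: estimate the picture-change speed as the mean absolute difference
# between consecutive greyscale frames, then map it to a speed-up factor for aerial
# footage. The thresholds and the 2.0 / 1.5 factors are illustrative assumptions.

def picture_change_speed(frames):
    """frames: list of HxW uint8 greyscale frames sampled within the preset time."""
    diffs = [np.abs(a.astype(np.int16) - b.astype(np.int16)).mean()
             for a, b in zip(frames[:-1], frames[1:])]
    return float(np.mean(diffs))

def speed_factor_for_aerial(frames, slow_thresh=2.0, medium_thresh=5.0):
    change = picture_change_speed(frames)
    if change < slow_thresh:      # picture barely changes: play it back faster
        return 2.0
    if change < medium_thresh:
        return 1.5
    return 1.0                    # already dynamic enough: keep the original speed

# Example with synthetic frames: a nearly static scene gets a 2x speed-up.
static = [np.full((4, 4), 100, dtype=np.uint8) for _ in range(5)]
print(speed_factor_for_aerial(static))   # -> 2.0
```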
  • when determining the matching relationship between the video pits of the template and the video clips, the identified aerial video can be placed in the first several video pits and/or the last several video pits of the template, thereby improving the quality of the recommended video.
  • the video processing method provided by the above-mentioned embodiment obtains a plurality of templates, matches video clips to the video pits of each template to obtain the matching relationship corresponding to each template, determines the matching score of the matching relationship of each template, determines a recommended template from the multiple templates according to the matching scores, and finally synthesizes a recommended video according to the matching relationship corresponding to the recommended template.
  • the recommended template is determined according to the matching score, and the recommended video is synthesized based on the recommended template, which can automatically determine the appropriate recommended template for the video clip, reduce the workload of the user when editing the video, and improve the diversity of the synthesized recommended video.
  • FIG. 20 is a schematic block diagram of a video processing apparatus provided by an embodiment of the present application.
  • the video processing apparatus 500 includes one or more processors 501 and a memory 502 .
  • the processor 501 may be, for example, a micro-controller unit (Micro-controller Unit, MCU), a central processing unit (Central Processing Unit, CPU), or a digital signal processor (Digital Signal Processor, DSP) or the like.
  • the memory 502 may be a Flash chip, a read-only memory (ROM, Read-Only Memory), a magnetic disk, an optical disk, a USB flash drive, a removable hard disk, or the like.
  • the memory 502 is used for storing a computer program; the processor 501 is used for executing the computer program and, when executing the computer program, performs any one of the video processing methods provided in the embodiments of the present application, so as to reduce the user's workload in video editing and provide a variety of recommended videos.
  • FIG. 21 is a schematic block diagram of a terminal device provided by an embodiment of the present application. As shown in FIG. 21 , the terminal device 600 includes one or more processors 601 and a memory 602 .
  • the terminal device includes devices such as a mobile phone, a remote control, a PC, and a tablet computer.
  • the processor 601 may be, for example, a Micro-controller Unit (MCU), a Central Processing Unit (CPU), a Digital Signal Processor (DSP), or the like.
  • the memory 602 may be a Flash chip, a read-only memory (ROM, Read-Only Memory), a magnetic disk, an optical disk, a USB flash drive, a removable hard disk, or the like.
  • the memory 602 is used to store a computer program; the processor 601 is used to execute the computer program and, when executing the computer program, performs any one of the video processing methods provided in the embodiments of this application, so as to reduce the user's workload in video editing and provide a variety of recommended videos.
  • Embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, the computer program includes program instructions, and a processor executes the program instructions to implement the steps of any one of the video processing methods provided in the above embodiments.
  • the computer-readable storage medium may be an internal storage unit of the terminal device described in any of the foregoing embodiments, such as a memory of the terminal device.
  • the computer-readable storage medium may also be an external storage device of the terminal device, such as a plug-in hard disk equipped on the terminal device, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

A video processing method, a video processing apparatus, a terminal device, and a storage medium. The method comprises: determining a template according to video information of a video material to be processed, the template comprising at least one video gap (S301); determining, according to gap information of the video gap in the template, a video clip corresponding to the video gap to obtain a corresponding matching relationship of the template, the video clip being a clip in said video material (S302); and inserting, according to the corresponding matching relationship of the template, the video clip into the corresponding video gap of the template to obtain a recommendation video (S303).

Description

视频处理方法、视频处理装置、终端设备以及存储介质Video processing method, video processing device, terminal device and storage medium 技术领域technical field
本申请涉及视频处理技术领域,尤其涉及一种视频处理方法、视频处理装置、终端设备以及存储介质。The present application relates to the technical field of video processing, and in particular, to a video processing method, a video processing apparatus, a terminal device, and a storage medium.
背景技术Background technique
目前,视频内容已经成为自媒体的主流,用户通过拍摄短视频来进行日常生活的分享。为了得到内容丰富的短视频,用户可以将通过拍摄装置(无人机、手持云台、相机或手机)拍摄的各种视频素材进行自由组合,并将多个视频素材进行剪辑合并成一个视频,发布于社交网站。目前,大部分视频剪辑方案还是需要用户参与,缺少对视频进行自动剪辑的有效解决方案。At present, video content has become the mainstream of self-media, and users share their daily life by shooting short videos. In order to get a short video with rich content, users can freely combine various video materials shot by a shooting device (drone, handheld gimbal, camera or mobile phone), and edit and merge multiple video materials into one video. Post on social networking sites. At present, most video editing solutions still require user participation, and there is no effective solution for automatic video editing.
发明内容SUMMARY OF THE INVENTION
基于此,本申请提供了一种视频处理方法、视频处理装置、终端设备以及存储介质,该视频处理方法基于模板对待处理视频素材进行处理,旨在降低用户在进行视频剪辑时的工作量,提供多样化的推荐视频。Based on this, the present application provides a video processing method, a video processing device, a terminal device, and a storage medium. The video processing method processes the video material to be processed based on a template, aiming at reducing the workload of the user when performing video editing, providing Diverse recommended videos.
第一方面,本申请提供了一种视频处理方法,包括:In a first aspect, the present application provides a video processing method, including:
根据待处理视频素材的视频信息确定模板,所述模板至少包括一个视频坑位;Determine a template according to the video information of the video material to be processed, and the template includes at least one video pit;
根据所述模板中视频坑位的坑位信息确定与所述视频坑位对应的视频片段,得到所述模板对应的匹配关系,其中,所述视频片段为所述待处理视频素材中的片段;Determine the video clip corresponding to the video pit according to the pit information of the video pit in the template, and obtain the matching relationship corresponding to the template, wherein the video fragment is a fragment in the video material to be processed;
根据所述模板对应的匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频。According to the matching relationship corresponding to the template, the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
第二方面,本申请还提供了一种视频处理方法,包括:In a second aspect, the present application also provides a video processing method, including:
根据待处理视频素材的视频片段和模板的视频坑位构建流网络图;Build a stream network diagram according to the video clips of the video material to be processed and the video pits of the template;
基于所述流网络图确定所述视频片段与所述视频坑位对应的匹配关系;Determine the matching relationship corresponding to the video segment and the video pit based on the stream network graph;
根据所述匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频;Filling the video clip into the corresponding video pit of the template according to the matching relationship to obtain a recommended video;
其中,所述流网络图包括多个节点,每个所述节点对应一个所述视频片段和一个所述视频坑位的匹配关系。The flow network graph includes a plurality of nodes, and each of the nodes corresponds to a matching relationship between one of the video clips and one of the video pits.
第三方面,本申请还提供了一种视频处理方法,包括:In a third aspect, the application also provides a video processing method, including:
获取多个模板,所述模板至少包括一个视频坑位;Obtain a plurality of templates, and the template includes at least one video pit;
为每个所述模板的视频坑位匹配视频片段,得到每个所述模板对应的匹配关系,并确定每个所述模板对应的匹配关系的匹配得分,其中所述视频片段为待处理视频素材的片段;Matching video clips for the video pits of each of the templates, obtaining a matching relationship corresponding to each of the templates, and determining a matching score of the matching relationship corresponding to each of the templates, wherein the video clips are the video material to be processed fragment;
根据所述匹配得分从所述多个模板中确定推荐模板;determining a recommended template from the plurality of templates according to the matching score;
根据所述推荐模板对应的匹配关系将所述视频片段填入所述推荐模板的对应视频坑位,得到推荐视频。The video clip is filled in the corresponding video slot of the recommended template according to the matching relationship corresponding to the recommended template to obtain a recommended video.
第四方面,本申请还提供了一种视频处理方法,包括:In a fourth aspect, the present application also provides a video processing method, including:
根据待处理视频素材的视频信息,对所述待处理视频素材进行分割生成多个视频片段;According to the video information of the video material to be processed, the to-be-processed video material is divided to generate a plurality of video segments;
根据所述模板的视频坑位的坑位信息,确定待填入所述模板中各个视频坑位的视频片段,得到所述模板对应的匹配关系;According to the pit information of the video pits of the template, determine the video clips to be filled in each video pit in the template, and obtain the matching relationship corresponding to the template;
根据所述模板对应的匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频。According to the matching relationship corresponding to the template, the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
第五方面,本申请还提供了一种视频处理装置,所述视频处理装置包括处理器和存储器;In a fifth aspect, the present application further provides a video processing device, the video processing device comprising a processor and a memory;
所述存储器用于存储计算机程序;the memory is used to store computer programs;
所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:The processor is configured to execute the computer program and when executing the computer program, realize:
根据待处理视频素材的视频信息确定模板,所述模板至少包括一个视频坑位;Determine a template according to the video information of the video material to be processed, and the template includes at least one video pit;
根据所述模板中视频坑位的坑位信息确定与所述视频坑位对应的视频片段,得到所述模板对应的匹配关系,其中,所述视频片段为所述待处理视频素 材中的片段;Determine the video clip corresponding to the video pit according to the pit information of the video pit in the template, and obtain the matching relationship corresponding to the template, wherein the video fragment is the fragment in the video material to be processed;
根据所述模板对应的匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频。According to the matching relationship corresponding to the template, the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
第六方面,本申请还提供了一种视频处理装置,所述视频处理装置包括处理器和存储器;In a sixth aspect, the present application further provides a video processing apparatus, the video processing apparatus comprising a processor and a memory;
所述存储器用于存储计算机程序;the memory is used to store computer programs;
所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:The processor is configured to execute the computer program and when executing the computer program, realize:
根据待处理视频素材的视频片段和模板的视频坑位构建流网络图;Build a stream network diagram according to the video clips of the video material to be processed and the video pits of the template;
基于所述流网络图确定所述视频片段与所述视频坑位对应的匹配关系;Determine the matching relationship corresponding to the video segment and the video pit based on the stream network graph;
根据所述匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频;Filling the video clip into the corresponding video pit of the template according to the matching relationship to obtain a recommended video;
其中,所述流网络图包括多个节点,每个所述节点对应一个所述视频片段和一个所述视频坑位的匹配关系。The flow network graph includes a plurality of nodes, and each of the nodes corresponds to a matching relationship between one of the video clips and one of the video pits.
第七方面,本申请还提供了一种视频处理装置,所述视频处理装置包括处理器和存储器;In a seventh aspect, the present application further provides a video processing device, the video processing device comprising a processor and a memory;
所述存储器用于存储计算机程序;the memory is used to store computer programs;
所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:The processor is configured to execute the computer program and when executing the computer program, realize:
获取多个模板,所述模板至少包括一个视频坑位;Obtain a plurality of templates, and the template includes at least one video pit;
为每个所述模板的视频坑位匹配视频片段,得到每个所述模板对应的匹配关系,并确定每个所述模板对应的匹配关系的匹配得分,其中所述视频片段为待处理视频素材的片段;Matching video clips for the video pits of each of the templates, obtaining a matching relationship corresponding to each of the templates, and determining a matching score of the matching relationship corresponding to each of the templates, wherein the video clips are the video material to be processed fragment;
根据所述匹配得分从所述多个模板中确定推荐模板;determining a recommended template from the plurality of templates according to the matching score;
根据所述推荐模板对应的匹配关系将所述视频片段填入所述推荐模板的对应视频坑位,得到推荐视频。The video clip is filled in the corresponding video slot of the recommended template according to the matching relationship corresponding to the recommended template to obtain a recommended video.
第八方面,本申请还提供了一种视频处理装置,所述视频处理装置包括处理器和存储器;In an eighth aspect, the present application further provides a video processing device, the video processing device comprising a processor and a memory;
所述存储器用于存储计算机程序;the memory is used to store computer programs;
所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:The processor is configured to execute the computer program and when executing the computer program, realize:
根据待处理视频素材的视频信息,对所述待处理视频素材进行分割生成多 个视频片段;According to the video information of the video material to be processed, the to-be-processed video material is divided to generate a plurality of video segments;
根据所述模板的视频坑位的坑位信息,确定待填入所述模板中各个视频坑位的视频片段,得到所述模板对应的匹配关系;According to the pit information of the video pits of the template, determine the video clips to be filled in each video pit in the template, and obtain the matching relationship corresponding to the template;
根据所述模板对应的匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频。According to the matching relationship corresponding to the template, the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
第九方面,本申请还提供了一种终端设备,所述终端设备包括处理器和存储器;In a ninth aspect, the present application further provides a terminal device, the terminal device comprising a processor and a memory;
所述存储器用于存储计算机程序;the memory is used to store computer programs;
所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:The processor is configured to execute the computer program and when executing the computer program, realize:
根据待处理视频素材的视频信息确定模板,所述模板至少包括一个视频坑位;Determine a template according to the video information of the video material to be processed, and the template includes at least one video pit;
根据所述模板中视频坑位的坑位信息确定与所述视频坑位对应的视频片段,得到所述模板对应的匹配关系,其中,所述视频片段为所述待处理视频素材中的片段;Determine the video clip corresponding to the video pit according to the pit information of the video pit in the template, and obtain the matching relationship corresponding to the template, wherein the video fragment is a fragment in the video material to be processed;
根据所述模板对应的匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频。According to the matching relationship corresponding to the template, the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
第十方面,本申请还提供了一种终端设备,所述终端设备包括处理器和存储器;In a tenth aspect, the present application further provides a terminal device, the terminal device comprising a processor and a memory;
所述存储器用于存储计算机程序;the memory is used to store computer programs;
所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:The processor is configured to execute the computer program and when executing the computer program, realize:
根据待处理视频素材的视频片段和模板的视频坑位构建流网络图;Build a stream network diagram according to the video clips of the video material to be processed and the video pits of the template;
基于所述流网络图确定所述视频片段与所述视频坑位对应的匹配关系;Determine the matching relationship corresponding to the video segment and the video pit based on the stream network graph;
根据所述匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频;Filling the video clip into the corresponding video pit of the template according to the matching relationship to obtain a recommended video;
其中,所述流网络图包括多个节点,每个所述节点对应一个所述视频片段和一个所述视频坑位的匹配关系。The flow network graph includes a plurality of nodes, and each of the nodes corresponds to a matching relationship between one of the video clips and one of the video pits.
第十一方面,本申请还提供了一种终端设备,所述终端设备包括处理器和存储器;In an eleventh aspect, the present application further provides a terminal device, where the terminal device includes a processor and a memory;
所述存储器用于存储计算机程序;the memory is used to store computer programs;
所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:The processor is configured to execute the computer program and when executing the computer program, realize:
获取多个模板,所述模板至少包括一个视频坑位;Obtain a plurality of templates, and the template includes at least one video pit;
为每个所述模板的视频坑位匹配视频片段,得到每个所述模板对应的匹配关系,并确定每个所述模板对应的匹配关系的匹配得分,其中所述视频片段为待处理视频素材的片段;Matching video clips for the video pits of each of the templates, obtaining a matching relationship corresponding to each of the templates, and determining a matching score of the matching relationship corresponding to each of the templates, wherein the video clips are the video material to be processed fragment;
根据所述匹配得分从所述多个模板中确定推荐模板;determining a recommended template from the plurality of templates according to the matching score;
根据所述推荐模板对应的匹配关系将所述视频片段填入所述推荐模板的对应视频坑位,得到推荐视频。The video clip is filled in the corresponding video slot of the recommended template according to the matching relationship corresponding to the recommended template to obtain a recommended video.
第十二方面,本申请还提供了一种终端设备,所述终端设备包括处理器和存储器;In a twelfth aspect, the present application further provides a terminal device, the terminal device comprising a processor and a memory;
所述存储器用于存储计算机程序;the memory is used to store computer programs;
所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:The processor is configured to execute the computer program and when executing the computer program, realize:
根据待处理视频素材的视频信息,对所述待处理视频素材进行分割生成多个视频片段;According to the video information of the video material to be processed, the to-be-processed video material is divided to generate a plurality of video segments;
根据所述模板的视频坑位的坑位信息,确定待填入所述模板中各个视频坑位的视频片段,得到所述模板对应的匹配关系;According to the pit information of the video pits of the template, determine the video clips to be filled in each video pit in the template, and obtain the matching relationship corresponding to the template;
根据所述模板对应的匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频。According to the matching relationship corresponding to the template, the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
第十三方面,本申请还提供了一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时使所述处理器实现如上所述的视频处理方法的步骤。In a thirteenth aspect, the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the processor enables the processor to realize the video as described above The steps of the processing method.
本申请实施例提供了一种视频处理方法、视频处理装置、终端设备以及存储介质,对待处理视频素材进行快速成片,降低视频剪辑的工作量,增加推荐视频的多样性,提高用户体验。Embodiments of the present application provide a video processing method, a video processing device, a terminal device, and a storage medium, which can quickly form a video material to be processed, reduce the workload of video editing, increase the diversity of recommended videos, and improve user experience.
应当理解的是,以上的一般描述和后文的细节描述仅是示例性和解释性的,并不能限制本申请。It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not limiting of the present application.
附图说明Description of drawings
为了更清楚地说明本申请实施例技术方案,下面将对实施例描述中所需要 使用的附图作简单地介绍,显而易见地,下面描述中的附图是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to explain the technical solutions of the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings used in the description of the embodiments. For those of ordinary skill, other drawings can also be obtained from these drawings without any creative effort.
图1是本申请实施例提供的一种视频处理方法的步骤流程图;1 is a flowchart of steps of a video processing method provided by an embodiment of the present application;
图2是图1中提供的一种视频处理方法的子步骤流程图;Fig. 2 is the sub-step flow chart of a kind of video processing method provided in Fig. 1;
图3是本申请实施例提供的对第一视频片段进行聚类分割的步骤流程图;3 is a flowchart of steps for clustering and segmenting a first video segment provided by an embodiment of the present application;
图4是本申请实施例提供的图像特征网络模型的训练流程图;Fig. 4 is the training flow chart of the image feature network model provided by the embodiment of the present application;
图5是本申请实施例提供的相似度计算结果的示意图;5 is a schematic diagram of a similarity calculation result provided by an embodiment of the present application;
图6是本申请实施例提供的另一种视频处理方法的步骤流程图;6 is a flowchart of steps of another video processing method provided by an embodiment of the present application;
图7是本申请实施例构建的流网络图的示意图;7 is a schematic diagram of a flow network diagram constructed in an embodiment of the present application;
图8是本申请实施例提供的另一种视频处理方法的步骤流程图;8 is a flowchart of steps of another video processing method provided by an embodiment of the present application;
图9是本申请实施例提供的待处理视频素材来源的示意图;9 is a schematic diagram of a source of video material to be processed provided by an embodiment of the present application;
图10是图8中提供的一种视频处理方法的子步骤流程图;Fig. 10 is the sub-step flow chart of a kind of video processing method provided in Fig. 8;
图11是本申请实施例提供的根据视频标签确定匹配的多个模板的步骤流程图;FIG. 11 is a flowchart of steps for determining a plurality of matching templates according to video tags provided by an embodiment of the present application;
图12是本申请实施例提供的将视频片段填入对应的视频坑位的步骤流程图;12 is a flowchart of steps for filling video clips into corresponding video pits provided by an embodiment of the present application;
图13是本申请实施例提供的另一种视频处理方法的步骤流程图;13 is a flowchart of steps of another video processing method provided by an embodiment of the present application;
图14是本申请实施例提供的为视频坑位匹配视频片段得到匹配关系的步骤流程图;14 is a flowchart of steps for obtaining a matching relationship for video pits matching video clips provided by an embodiment of the present application;
图15是本申请实施例提供的根据匹配得分确定推荐模板的步骤流程图;15 is a flowchart of steps for determining a recommended template according to a matching score provided by an embodiment of the present application;
图16是图15中提供的根据匹配得分确定推荐模板的子步骤流程图;Figure 16 is a flowchart of the sub-steps provided in Figure 15 for determining a recommended template according to the matching score;
图17是图13中提供的一种视频处理方法的子步骤流程图;Fig. 17 is the sub-step flow chart of a kind of video processing method provided in Fig. 13;
图18是本申请实施例提供的根据模板类型确定推荐模板的步骤流程图;18 is a flowchart of steps for determining a recommended template according to a template type provided by an embodiment of the present application;
图19是本申请实施例提供的从模板组中选择推荐模板的示意图;19 is a schematic diagram of selecting a recommended template from a template group provided by an embodiment of the present application;
图20是本申请实施例提供的一种视频处理装置的示意性框图;FIG. 20 is a schematic block diagram of a video processing apparatus provided by an embodiment of the present application;
图21是本申请实施例提供的一种终端设备的示意性框图。FIG. 21 is a schematic block diagram of a terminal device provided by an embodiment of the present application.
具体实施方式Detailed ways
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.
附图中所示的流程图仅是示例说明,不是必须包括所有的内容和操作/步骤,也不是必须按所描述的顺序执行。例如,有的操作/步骤还可以分解、组合或部分合并,因此实际执行的顺序有可能根据实际情况改变。The flowcharts shown in the figures are for illustration only, and do not necessarily include all contents and operations/steps, nor do they have to be performed in the order described. For example, some operations/steps can also be decomposed, combined or partially combined, so the actual execution order may be changed according to the actual situation.
下面结合附图,对本申请的一些实施方式作详细说明。在不冲突的情况下,下述的实施例及实施例中的特征可以相互组合。Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and features in the embodiments may be combined with each other without conflict.
请参阅图1,图1是本申请实施例提供的一种视频处理方法的步骤流程图。该视频处理方法可以应用在终端设备或云端设备中,用于将待处理视频素材和预设的模板进行合成。其中,终端设备包括手机、平板和笔记本电脑等。Please refer to FIG. 1. FIG. 1 is a flowchart of steps of a video processing method provided by an embodiment of the present application. The video processing method can be applied to a terminal device or a cloud device for synthesizing the video material to be processed and a preset template. Among them, terminal devices include mobile phones, tablets, and notebook computers.
具体地,如图1所示,该视频处理方法包括步骤S101至步骤S103。Specifically, as shown in FIG. 1 , the video processing method includes steps S101 to S103.
S101、根据待处理视频素材的视频信息,对所述待处理视频素材进行分割生成多个视频片段。S101. According to the video information of the video material to be processed, divide the video material to be processed to generate a plurality of video segments.
其中,待处理视频素材用于与预设的模板进行合成,从而为用户生成推荐视频,预设的模板中包括至少一个视频坑位,用于填充视频片段。根据待处理视频素材的视频信息来对待处理视频素材进行分割,得到多个视频片段,以便于将视频片段填入模板的视频坑位中,生成推荐视频。The to-be-processed video material is used for synthesizing with a preset template to generate a recommended video for the user, and the preset template includes at least one video pit for filling the video clips. The to-be-processed video material is segmented according to the video information of the to-be-processed video material to obtain a plurality of video clips, so that the video clips can be filled into the video pits of the template to generate a recommended video.
在一实施例中,待处理视频素材可以包括多种来源的视频素材,例如可以是通过手持端拍摄的视频素材、通过可移动平台拍摄的视频素材、从云端服务器获取的视频素材以及从本地服务器获取的视频素材等。In one embodiment, the video material to be processed may include video material from various sources, such as video material captured by a handheld terminal, video material captured by a mobile platform, video material obtained from a cloud server, and video material obtained from a local server. Obtained video material, etc.
其中,手持端例如可以是手机、平板和运动相机等,可移动平台例如可以是无人机等。其中,无人机可以为旋翼型无人机,例如四旋翼无人机、六旋翼无人机、八旋翼无人机,也可以是固定翼无人机。该无人机上带有摄像设备。Wherein, the handheld terminal may be, for example, a mobile phone, a tablet, a motion camera, etc., and the movable platform may be, for example, a drone. The UAV may be a rotary-wing UAV, such as a quad-rotor UAV, a hexa-rotor UAV, an octa-rotor UAV, or a fixed-wing UAV. The drone has camera equipment on it.
通过对不同来源的视频素材进行汇总,并通过混剪将不同来源的视频素材剪辑在一起,增加得到的推荐视频的视频多样性。By summarizing the video materials from different sources, and editing the video materials from different sources together by mixing and cutting, the video diversity of the recommended videos is increased.
在一实施例中,视频信息包括运镜方向和场景信息中的至少一项。请参阅图2,步骤S101包括步骤S1011和步骤S1012。In one embodiment, the video information includes at least one of a moving direction and scene information. Referring to FIG. 2, step S101 includes step S1011 and step S1012.
S1011、根据所述待处理视频素材的视频信息对所述待处理视频素材进行分割,得到多个第一视频片段。S1011. Divide the to-be-processed video material according to the video information of the to-be-processed video material to obtain a plurality of first video segments.
其中,在根据待处理视频素材的运镜方向对待处理视频素材进行初次分割时,得到的每个第一视频片段内的运镜方向相同或相似,不存在运镜方向的变化。Wherein, when the video material to be processed is segmented for the first time according to the moving direction of the video material to be processed, the moving direction of the camera in each obtained first video segment is the same or similar, and there is no change in the moving direction of the camera.
具体地,可以根据一待处理视频素材中运镜方向的变化,来对该待处理视频素材进行分割,生成多个第一视频片段。例如,待处理视频素材中包括连续的前后运镜,可以根据前后运镜方向的变化,将待处理视频素材分割为两个第一视频片段,一个第一视频片段的运镜方向为向前,另一个第一视频片段的运镜方向为向后,每一个第一视频片段内的运镜方向都是相同的。在具体实施过程中,可以通过一些检测运镜方向的算法来判断待处理视频素材中运镜方向的变化。Specifically, the to-be-processed video material may be segmented according to the change of the mirror movement direction in the to-be-processed video material to generate a plurality of first video segments. For example, the video material to be processed includes continuous front and rear mirror movements, and the video material to be processed can be divided into two first video clips according to the change of the front and rear mirror movement directions, and the mirror movement direction of one first video clip is forward. The camera movement direction of the other first video clip is backward, and the camera movement direction in each first video clip is the same. In the specific implementation process, some algorithms for detecting the moving direction of the mirror can be used to determine the change of the moving direction of the mirror in the video material to be processed.
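As one illustration of such an algorithm (an assumption for this sketch, not the patent's prescribed method), dense optical flow can approximate the global camera motion of each frame pair, and a sharp change in the mean flow direction can be treated as a candidate split point; the 90-degree threshold below is illustrative only.

```python
import cv2
import numpy as np

# Hedged sketch: estimate the global camera motion of each frame pair with dense optical
# flow and mark a split point wherever the dominant motion direction turns sharply.

def motion_direction(prev_gray, cur_gray):
    flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mean_flow = flow.reshape(-1, 2).mean(axis=0)           # average (dx, dy)
    return np.arctan2(mean_flow[1], mean_flow[0])          # dominant direction (radians)

def split_points_by_direction(gray_frames, angle_thresh=np.pi / 2):
    points, prev_angle = [], None
    for i in range(1, len(gray_frames)):
        angle = motion_direction(gray_frames[i - 1], gray_frames[i])
        if prev_angle is not None:
            # wrap the angle difference into [0, pi] before comparing
            delta = np.abs(np.arctan2(np.sin(angle - prev_angle),
                                      np.cos(angle - prev_angle)))
            if delta > angle_thresh:        # direction changed sharply: split here
                points.append(i)
        prev_angle = angle
    return points                            # frame indices where the material is cut
```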
其中,在根据待处理视频素材的场景信息对待处理视频素材进行初次分割时,得到的每个第一视频片段内的场景类似。例如,待处理视频素材中包括雪山图像,可以根据待处理视频素材中是否包括有雪山图像及雪山图像的类似图像,将待处理视频素材分割为多个第一视频片段。Wherein, when the to-be-processed video material is segmented for the first time according to the scene information of the to-be-processed video material, the obtained scenes in each first video segment are similar. For example, the to-be-processed video material includes a snowy mountain image, and the to-be-processed video material may be divided into multiple first video segments according to whether the to-be-processed video material includes a snowy mountain image and an image similar to the snowy mountain image.
另外,在对待处理视频素材进行分割时,还可以根据待处理视频素材中同一主体来对待处理视频素材进行分割。其中,该主体可以为目标人物或者宠物等等。例如,待处理视频素材中持续出现了一只猫咪,此时可以根据待处理视频素材的镜头中是否有该猫咪来对待处理视频素材进行分割,将待处理视频素材分割为多个第一视频片段。In addition, when the to-be-processed video material is divided, the to-be-processed video material may also be divided according to the same subject in the to-be-processed video material. Wherein, the subject may be a target person or a pet or the like. For example, if a cat continues to appear in the to-be-processed video material, at this time, the to-be-processed video material can be divided according to whether the cat is in the footage of the to-be-processed video material, and the to-be-processed video material is divided into multiple first video segments .
S1012、对所述第一视频片段进行聚类分割,得到多个第二视频片段。S1012. Perform cluster segmentation on the first video segment to obtain a plurality of second video segments.
在得到第一视频片段后,可以对第一视频片段进行第二次分割,得到多个第二视频片段,将第二视频片段作为待填入模板的视频坑位的视频片段。其中,聚类分割包括利用相似场景进行聚类。After the first video clip is obtained, the first video clip may be segmented a second time to obtain a plurality of second video clips, and the second video clips are used as the video clips to be filled in the video pits of the template. Among them, cluster segmentation includes clustering using similar scenes.
需要说明的是,可以选择性的对第一视频片段进行聚类分割,也即当第一视频片段满足一定条件时,才对第一视频片段进行聚类分割,否则,可以不对第一视频片段进行聚类分割。It should be noted that the first video segment can be selectively clustered and segmented, that is, the first video segment can be clustered and segmented only when the first video segment satisfies certain conditions; otherwise, the first video segment may not be clustered and segmented. Perform cluster segmentation.
在一实施例中,在对所述第一视频片段进行聚类分割,得到多个第二视频 片段之前,确定多个所述第一视频片段中是否存在有视频时长大于预设时长的第一视频片段;若存在有视频时长大于所述预设时长的第一视频片段,执行所述对所述第一视频片段进行聚类分割的步骤。In one embodiment, before clustering and segmenting the first video clips to obtain a plurality of second video clips, it is determined whether there is a first video clip whose video duration is longer than a preset duration in the plurality of first video clips. A video clip; if there is a first video clip with a video duration greater than the preset duration, perform the step of clustering and segmenting the first video clip.
在对第一视频片段进行聚类分割之前,判断得到的多个第一视频片段中是否存在有视频时长大于预设时长的第一视频片段。其中,预设时长是一个经验值,可根据经验或模板的视频坑位的时长进行调整。Before clustering and segmenting the first video clips, it is determined whether there is a first video clip with a video duration greater than a preset duration among the obtained first video clips. The preset duration is an experience value, which can be adjusted according to experience or the duration of video pits of the template.
若多个第一视频片段中有至少一个视频时长大于预设时长的第一视频片段,则对视频时长大于预设时长的这些第一视频片段进行第二次分割。而对于视频时长小于或等于预设时长的第一视频片段,可以不对其进行第二次分割。If there is at least one first video clip whose video duration is greater than the preset duration among the plurality of first video clips, the second division is performed on these first video clips whose video duration is greater than the preset duration. However, for the first video segment whose video duration is less than or equal to the preset duration, the second division may not be performed.
在一实施例中,对第一视频片段进行聚类分割的步骤请参阅图3,具体包括步骤S1012a至步骤S1012c。In an embodiment, the steps of clustering and segmenting the first video segment refer to FIG. 3 , which specifically includes steps S1012a to S1012c.
S1012a、确定滑动窗口和聚类中心。S1012a, determine the sliding window and the cluster center.
在对第一视频片段进行聚类分割时,首先确定滑动窗口和聚类中心,其中,所述滑动窗口用于确定待处理的当前视频帧,所述聚类中心用于确定所述第一视频片段的视频分割点。在对第一视频片段进行聚类分割的过程中,所述聚类中心包括所述第一视频片段的第一帧视频帧的图像特征。When performing cluster segmentation on the first video segment, first determine a sliding window and a cluster center, wherein the sliding window is used to determine the current video frame to be processed, and the cluster center is used to determine the first video The video split point of the clip. In the process of clustering and segmenting the first video segment, the cluster center includes image features of the first video frame of the first video segment.
其中,第一视频片段的第一帧视频帧的图像特征可以包括第一视频片段的第一帧视频帧的图像编码特征。其中,图像特征为根据预先训练好的图像特征网络模型得到的。图像特征网络模型能够输出第一视频片段中各个视频帧的图像特征。The image features of the first video frame of the first video clip may include image coding features of the first video frame of the first video clip. Among them, the image features are obtained according to the pre-trained image feature network model. The image feature network model can output image features of each video frame in the first video segment.
具体地,对于一第一视频片段而言,滑动窗口越大,在进行聚类分割时进行相似聚类处理的视频帧越少,分割速度也就越快;滑动窗口越小,在进行聚类分割时进行相似聚类处理的视频帧越多,分割速度也就越慢。因此,可以基于此原理设置滑动窗口的大小。在一实施方式中,滑动窗口的大小等于1。Specifically, for a first video segment, the larger the sliding window is, the fewer video frames are subjected to similar clustering processing during cluster segmentation, and the faster the segmentation speed is; the smaller the sliding window, the faster the segmentation speed is. The more video frames that are processed for similarity clustering during segmentation, the slower the segmentation speed will be. Therefore, the size of the sliding window can be set based on this principle. In one embodiment, the size of the sliding window is equal to one.
在一实施例中,滑动窗口的大小与第一视频片段的时长相关。在第一视频片段的时长较长时,为了快速对第一视频片段进行聚类分割,可以为滑动窗口设置一较大值,以提高聚类分割的速度,比如设置滑动窗口的大小为3。而在第一视频片段的时长较短时,可以为滑动窗口设置一较小值,比如设置滑动窗口的大小为1。In one embodiment, the size of the sliding window is related to the duration of the first video segment. When the duration of the first video segment is long, in order to quickly perform cluster segmentation on the first video segment, a larger value may be set for the sliding window to improve the speed of cluster segmentation, for example, setting the size of the sliding window to 3. When the duration of the first video clip is short, a smaller value may be set for the sliding window, for example, the size of the sliding window is set to 1.
在一实施例中,滑动窗口的大小与用户设置的期望分割速度相关。当用户期望对第一视频片段进行快速分割时,可以为滑动窗口设置一较大值,提高分割速度。当用户期望对第一视频片段进行缓慢分割时,可以为滑动窗口设置一较小值,降低分割速度。In one embodiment, the size of the sliding window is related to the desired segmentation speed set by the user. When the user desires to quickly segment the first video segment, a larger value can be set for the sliding window to improve the segmentation speed. When the user desires to segment the first video segment slowly, a smaller value can be set for the sliding window to reduce the segmentation speed.
在一实施例中,滑动窗口的大小与分割的精细程度相关。滑动窗口越小,在进行聚类分割时,所处理的视频帧越多,分割越精细,因此,可以根据对分割精细程度的需求,来设置滑动窗口的大小。In one embodiment, the size of the sliding window is related to the fineness of the segmentation. The smaller the sliding window is, the more video frames are processed and the finer the segmentation is when performing cluster segmentation. Therefore, the size of the sliding window can be set according to the requirement of the segmentation fineness.
在一实施例中,可以预先训练图像特征网络模型。训练流程图可以为图4所示,训练过程可以为:In one embodiment, the image feature network model may be pre-trained. The training flow chart can be shown in Figure 4, and the training process can be as follows:
准备训练集,对训练集中的视频素材进行标注,将训练集中同一场景的视频素材标注为一类。在训练时,将三个视频素材分别输入三个卷积神经网络,卷积神经网络例如可以是CNN网络,其中,三个视频素材中有两个视频素材为同一类,一个视频素材为其他类。Prepare the training set, label the video materials in the training set, and label the video materials of the same scene in the training set as one category. During training, the three video materials are respectively input into three convolutional neural networks. For example, the convolutional neural network can be a CNN network. Among the three video materials, two video materials belong to the same category, and one video material belongs to other categories. .
通过卷积神经网络生成三个视频素材的图像特征,并利用triplet loss来度量同一类的视频素材之间的图像距离是否小于不同类的视频素材之间的图像距离,由此生成对应的损失值。基于该损失值对卷积神经网络进行迭代训练,并不断调整卷积神经网络的权重,使得卷积神经网络能够学到有判别力的图像特征。Generate image features of three video materials through a convolutional neural network, and use triplet loss to measure whether the image distance between video materials of the same type is smaller than the image distance between different types of video materials, thereby generating the corresponding loss value . Based on the loss value, the convolutional neural network is iteratively trained, and the weights of the convolutional neural network are continuously adjusted, so that the convolutional neural network can learn discriminative image features.
在卷积神经网络训练完成后,将训练完成的卷积神经网络作为图像特征网络模型,用于输出第一视频片段中各个视频帧的图像特征。After the training of the convolutional neural network is completed, the trained convolutional neural network is used as an image feature network model for outputting image features of each video frame in the first video segment.
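A reduced PyTorch sketch of the triplet training just described is given below: one shared CNN encodes the anchor, positive (same scene class) and negative (different class) samples, and a triplet margin loss pushes same-class features closer than different-class features. The tiny backbone, the margin value and the random stand-in batches are illustrative assumptions, since the patent does not prescribe a specific architecture.

```python
import torch
import torch.nn as nn

# Hedged sketch of the triplet training. A single shared CNN produces image features;
# the triplet margin loss enforces that same-scene samples are closer than
# different-scene samples. Architecture and margin are illustrative only.

class ImageFeatureNet(nn.Module):
    def __init__(self, feature_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feature_dim),
        )

    def forward(self, x):
        return self.backbone(x)

model = ImageFeatureNet()
criterion = nn.TripletMarginLoss(margin=1.0, p=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step with random stand-in batches
# (anchor and positive share a scene class, negative does not).
anchor, positive, negative = (torch.randn(8, 3, 64, 64) for _ in range(3))
loss = criterion(model(anchor), model(positive), model(negative))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```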
S1012b、基于所述聚类中心,根据所述滑动窗口对所述第一视频片段的视频帧进行聚类分析,确定视频分割点。S1012b. Based on the cluster center, perform cluster analysis on the video frames of the first video segment according to the sliding window, and determine a video segmentation point.
在确定聚类中心后,根据滑动窗口确定待处理的当前视频帧,然后将待处理的当前视频帧和聚类中心进行聚类分析,从而确定视频分割点。After the cluster center is determined, the current video frame to be processed is determined according to the sliding window, and then the current video frame to be processed and the cluster center are subjected to cluster analysis to determine the video segmentation point.
例如,在第一次聚类时,设置聚类中心为第一帧视频帧的图像特征,并进行第一次滑动。当滑动窗口为1时,第一次滑动时待处理的当前视频帧为第二帧视频帧,当滑动窗口为2时,第一次滑动时待处理的当前视频帧为第三帧视频帧,当滑动窗口为N时,第一次滑动时待处理的当前视频帧为第N+1帧视频帧。For example, in the first clustering, the cluster center is set as the image feature of the first video frame, and the first sliding is performed. When the sliding window is 1, the current video frame to be processed during the first sliding is the second video frame; when the sliding window is 2, the current video frame to be processed during the first sliding is the third video frame, When the sliding window is N, the current video frame to be processed during the first sliding is the N+1 th video frame.
在一实施例中,步骤S1012b具体为:根据所述滑动窗口确定当前视频帧,并确定所述当前视频帧的图像特征与所述聚类中心的相似度;若所述相似度小于预设阈值,则将所述当前视频帧作为视频分割点,并重新确定聚类中心;根据重新确定的聚类中心,继续确定视频分割点,直至所述第一视频片段的最后一个视频帧。In one embodiment, step S1012b is specifically: determining the current video frame according to the sliding window, and determining the similarity between the image feature of the current video frame and the cluster center; if the similarity is less than a preset threshold , the current video frame is taken as the video segmentation point, and the cluster center is re-determined; according to the re-determined cluster center, the video segmentation point is continued to be determined until the last video frame of the first video segment.
具体地,在对第一视频片段进行第一次聚类时,初始化聚类中心,并进行第一次滑动。将聚类中心设置为第一视频片段的第一帧视频帧的图像特征,记为C 0。滑动窗口大小记为N,此时第一次滑动时确定的待处理的当前视频帧为第一视频片段的第N+1帧视频帧,第m次滑动时确定的待处理的当前视频帧为第一视频片段的第m*N+1帧视频帧。其中,m是滑动窗口的滑动次数。 Specifically, when the first video segment is clustered for the first time, the cluster center is initialized, and the first sliding is performed. The cluster center is set as the image feature of the first video frame of the first video segment, denoted as C 0 . The size of the sliding window is denoted as N, the current video frame to be processed determined during the first sliding is the N+1 th video frame of the first video clip, and the current video frame to be processed determined during the mth sliding is The m*N+1 th video frame of the first video clip. Among them, m is the sliding number of the sliding window.
在第一次滑动后,将待处理的当前视频帧,也即第一视频片段的第N+1帧输入预先训练好的图像特征网络模型中,得到当前帧的图像特征F N+1,计算聚类中心的图像特征C 0与当前帧的图像特征F N+1之间的相似度。 After the first sliding, the current video frame to be processed, that is, the N+1th frame of the first video clip, is input into the pre-trained image feature network model to obtain the image feature F N+1 of the current frame, and calculate The similarity between the image feature C 0 of the cluster center and the image feature F N+1 of the current frame.
若相似度小于预设阈值,则认为第一视频片段的第一帧视频帧与第一视频片段的第N+1帧视频帧之间的视频内容变化较大,则可以将第一视频片段的第N+1帧作为视频分割点,将第一视频片段的第N+1帧之前的视频帧切割为一个第二视频片段。然后将第一视频片段的第N+1帧的图像特征F N+1作为重新确定的聚类中心,继续确定下一个视频分割点,直至到达第一视频片段的最后一个视频帧。 If the similarity is less than the preset threshold, it is considered that the video content between the first video frame of the first video clip and the N+1 th video frame of the first video clip has changed greatly, and the The N+1th frame is used as a video division point, and the video frame before the N+1th frame of the first video segment is cut into a second video segment. Then, the image feature F N+1 of the N+1th frame of the first video segment is used as the re-determined clustering center, and the next video segmentation point is continued to be determined until the last video frame of the first video segment is reached.
其中,相似度可以是图像特征之间的余弦相似度,将计算得到的图像特征之间的余弦相似度作为聚类中心的图像特征C 0与当前帧的图像特征F N+1之间的相似度。预设阈值可以是预先设置好的一个经验值。 The similarity may be the cosine similarity between image features, and the calculated cosine similarity between image features is taken as the similarity between the image feature C 0 of the cluster center and the image feature F N+1 of the current frame Spend. The preset threshold may be a preset empirical value.
在一实施例中,所述重新确定聚类中心,包括:将所述当前视频帧的图像特征作为重新确定的聚类中心。In an embodiment, the re-determining the cluster center includes: using the image feature of the current video frame as the re-determined cluster center.
当相似度小于预设阈值时,将当前视频帧作为视频分割点,把当前视频帧之前的视频帧切割为一个第二视频片段。此时,在对余下的第一视频片段进行聚类分析时,由于当前视频帧是余下的第一视频片段的第一帧,因此,将当前视频帧的图像特征作为重新确定的聚类中心。When the similarity is less than the preset threshold, the current video frame is used as the video dividing point, and the video frame before the current video frame is cut into a second video segment. At this time, when performing cluster analysis on the remaining first video clips, since the current video frame is the first frame of the remaining first video clips, the image feature of the current video frame is used as the re-determined clustering center.
在一实施例中,所述确定所述当前视频帧的图像特征与所述聚类中心的相 似度之后,若所述相似度大于或等于预设阈值,则更新所述聚类中心;根据更新后的所述聚类中心继续确定当前视频帧的图像特征与更新后的所述聚类中心的相似度。In one embodiment, after the similarity between the image feature of the current video frame and the cluster center is determined, if the similarity is greater than or equal to a preset threshold, the cluster center is updated; according to the update The subsequent cluster centers continue to determine the similarity between the image features of the current video frame and the updated cluster centers.
若在第一次滑动后,计算出聚类中心的图像特征C 0与当前帧的图像特征F N+1之间的相似度大于预设阈值,则认为第一视频片段的第一帧视频帧与第一视频片段的第N+1帧视频帧之间的视频内容变化不大,那么更新聚类中心,并根据更新后的聚类中心确定此时待处理的当前视频帧的图像特征与更新后的聚类中心的相似度。 If after the first sliding, the similarity between the image feature C 0 of the cluster center and the image feature F N+1 of the current frame is calculated to be greater than the preset threshold, it is considered that the first video frame of the first video segment is the first video frame. The video content does not change much from the N+1th video frame of the first video clip, then update the cluster center, and determine the image characteristics and update of the current video frame to be processed at this time according to the updated cluster center. the similarity of the cluster centers.
例如,在第一次滑动后,可以根据第一视频片段的第一帧视频帧的图像特征与第一视频片段的第N+1帧视频帧图像特征,计算从第一帧视频帧到第N+1帧视频帧之间的图像特征平均值,并将计算得到的平均值作为更新后的聚类中心的图像特征。For example, after the first sliding, according to the image features of the first video frame of the first video clip and the image features of the N+1 th video frame of the first video clip, the calculation method from the first video frame to the Nth video frame can be calculated. +1 average of image features between video frames, and the calculated average is used as the image feature of the updated cluster center.
在一实施例中,所述更新所述聚类中心,包括:获取所述当前视频帧的图像特征;根据所述当前视频帧的图像特征和所述聚类中心,确定更新后的聚类中心。In one embodiment, the updating of the cluster centers includes: acquiring the image features of the current video frame; determining the updated cluster centers according to the image features of the current video frame and the cluster centers. .
获取当前视频帧的图像特征,然后根据聚类中心的图像特征和当前视频帧的图像特征计算从第一帧视频帧到当前视频帧之间的图像特征平均值,将计算得到的平均值作为更新后的聚类中心的图像特征。Obtain the image features of the current video frame, and then calculate the average value of the image features from the first video frame to the current video frame according to the image features of the cluster center and the current video frame, and use the calculated average as the update. The image features of the cluster centers after.
其中,确定更新后的聚类中心的公式可以为:Among them, the formula for determining the updated cluster center can be:
C m = ( C m-1 × m × N + F m*N+1 ) / ( m × N + 1 )
其中,m为聚类次数,N为滑动窗口的大小,C m为第m次聚类后更新后的聚类中心C的图像特征,F m*N+1为第m次聚类时当前视频帧的图像特征。 Among them, m is the number of clustering times, N is the size of the sliding window, C m is the image feature of the updated cluster center C after the m-th clustering, and F m*N+1 is the current video during the m-th clustering Image features of the frame.
例如,第一视频片段中共有10个视频帧,当滑动窗口N的值取1时,在对第一视频片段进行第一次聚类时,初始化聚类中心C为第一帧,且聚类中心的图像特征为C 0,并进行第一次滑动,此时第一次滑动确定的当前视频帧为第二帧,第二帧的图像特征为F 2,比较聚类中心的图像特征C 0与第二帧的图像特征F 2之间的相似度,若相似度大于预设阈值,则重新确定聚类中心,此时第一次聚类后的聚类中心C的图像特征C 1=(C 0*1*1+F 2)/2。 For example, there are 10 video frames in the first video clip. When the value of the sliding window N is 1, when the first video clip is clustered for the first time, the cluster center C is initialized as the first frame, and the clustering The image feature of the center is C 0 , and the first sliding is performed. At this time, the current video frame determined by the first sliding is the second frame, and the image feature of the second frame is F 2 , and the image feature C 0 of the cluster center is compared. The similarity with the image feature F2 of the second frame, if the similarity is greater than the preset threshold, the cluster center is re-determined, and the image feature C1 of the cluster center C after the first clustering at this time = ( C 0 *1*1+F 2 )/2.
然后开始第二次聚类,并进行第二次滑动,此时第二次滑动确定的当前视频帧为第三帧,第三帧的图像特征为F 3,比较第一次聚类后的聚类中心C的图像特征C 1与第三帧的图像特征F 3之间的相似度,若相似度大于预设阈值,则重新确定聚类中心,此时第二次聚类后的聚类中心C的图像特征C 2=(C 1*2*1+F 3)/3。以此类推,直至第一视频片段的第10个视频帧。 Then start the second clustering and perform the second sliding. At this time, the current video frame determined by the second sliding is the third frame, and the image feature of the third frame is F 3 . Compare the clustering after the first clustering. The similarity between the image feature C 1 of the class center C and the image feature F 3 of the third frame, if the similarity is greater than the preset threshold, the cluster center is re-determined, at this time the cluster center after the second clustering The image feature of C is C 2 =(C 1 *2*1+F 3 )/3. And so on, until the 10th video frame of the first video clip.
S1012c、根据所述视频分割点对所述第一视频片段进行视频分割。S1012c: Perform video segmentation on the first video segment according to the video segmentation point.
在确定视频分割点后,根据所述视频分割点对第一视频片段进行视频分割,得到多个第二视频片段。在进行视频分割时,以当前视频帧为视频分割点,将当前视频帧之前的视频帧作为一个第二视频片段,进行分割。After the video division point is determined, video division is performed on the first video segment according to the video division point to obtain a plurality of second video segments. When performing video segmentation, the current video frame is used as the video segmentation point, and the video frame before the current video frame is used as a second video segment to perform segmentation.
举例来说,如图5所示,为相似度计算结果的示意图。第一视频片段共有2300帧,第490帧为一个视频分割点,第1100帧为一个视频分割点,那么在进行视频分割时,分别在第490帧和第1100帧对第一视频片段进行分割,得到三个第二视频片段,分别是第一视频片段的第1帧至第489帧作为一个第二视频片段,第490帧至第1099帧作为一个第二视频片段,第1100帧至第2300帧作为一个第二视频片段。For example, as shown in FIG. 5 , it is a schematic diagram of a similarity calculation result. The first video clip has a total of 2300 frames, the 490th frame is a video split point, and the 1100th frame is a video split point, then when the video is split, the first video clip is split at the 490th frame and the 1100th frame respectively, Three second video clips are obtained, which are the 1st to 489th frames of the first video clip as a second video clip, the 490th to 1099th frames as a second video clip, and the 1100th to 2300th frames. as a second video clip.
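Putting steps S1012a to S1012c together, the clustering segmentation described above can be sketched as the loop below, given as a non-authoritative illustration: the sliding window picks the next frame to examine, cosine similarity against the cluster centre decides whether that frame becomes a split point, and otherwise the centre is updated with the running-mean formula above. The feature extractor is assumed to be the pre-trained image feature network model and is passed in as a function.

```python
import numpy as np

# Hedged sketch of the cluster-based segmentation loop (steps S1012a-S1012c).
# `extract_feature` stands in for the pre-trained image feature network model.

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def cluster_split_points(frames, extract_feature, window=1, threshold=0.8):
    center = extract_feature(frames[0])      # cluster centre starts at the first frame
    split_points, m = [], 0                  # m counts centre updates since the last split
    idx = window                             # the first slide lands on frame N (0-based)

    while idx < len(frames):
        feature = extract_feature(frames[idx])
        if cosine_similarity(feature, center) < threshold:
            # Content changed too much: this frame is a split point and the new centre.
            split_points.append(idx)
            center, m = feature, 0
        else:
            # Content is still similar: fold the frame into the running-mean centre,
            # following C_m = (C_{m-1} * m * N + F_{m*N+1}) / (m * N + 1).
            m += 1
            center = (center * m * window + feature) / (m * window + 1)
        idx += window
    return split_points                       # frame indices at which the clip is cut
```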
S102、根据所述模板的视频坑位的坑位信息,确定待填入所述模板中各个视频坑位的视频片段,得到所述模板对应的匹配关系。S102. Determine, according to the pit information of the video pits of the template, video clips to be filled in each video pit in the template, and obtain a matching relationship corresponding to the template.
其中,模板对应的匹配关系为模板中的视频坑位与对应的视频片段之间的对应关系。The matching relationship corresponding to the template is the corresponding relationship between the video pits in the template and the corresponding video clips.
在确定出待处理视频素材对应的模板后,由于模板中至少包括一个视频坑位,因此,可以根据模板中视频坑位的坑位信息来确定与视频坑位对应的视频片段,从而得到模板对应的匹配关系。After the template corresponding to the video material to be processed is determined, since the template includes at least one video pit, the video clip corresponding to the video pit can be determined according to the pit information of the video pit in the template, so as to obtain the corresponding template matching relationship.
具体地,坑位信息包括坑位音乐和坑位标签中的至少一种。在一些实施方式中,坑位标签可以是预先设置好的,每个视频坑位都可以预先设置好一个坑位标签。Specifically, the pit information includes at least one of pit music and pit tags. In some embodiments, the pit tag may be preset, and each video pit may be preset with a pit tag.
其中,坑位标签包括运镜方向、景别、坑位主题、待填入视频坑位的视频片段的视频主题、待填入视频坑位的视频片段中单一视频帧的目标物大小和位置、待填入视频坑位的视频片段中连续视频帧的目标物大小和位置、待填入视频坑位的视频片段中相邻视频帧的相似度中的至少一种。Wherein, the pit label includes the direction of the mirror movement, the scene, the theme of the pit, the video theme of the video clip to be filled in the video pit, the target size and position of a single video frame in the video clip to be filled in the video pit, At least one of the target size and position of consecutive video frames in the video segment to be filled in the video pit, and the similarity of adjacent video frames in the video segment to be filled in the video pit.
视频坑位的坑位音乐可以是整个模板的模板音乐中的片段,由于模板中包括多个视频坑位,多个视频坑位依次组合为一个模板,因此,对于模板的模板音乐,可以根据视频坑位的顺序和时长,对模板音乐进行拆分,从而得到每个视频坑位对应的坑位音乐。The pit music of the video pit can be a fragment in the template music of the entire template. Since the template includes a plurality of video pits, the plurality of video pits are sequentially combined into a template. Therefore, for the template music of the template, according to the video According to the sequence and duration of the pits, the template music is split to obtain the pit music corresponding to each video pit.
在一实施例中,可以对模板进行分割,将模板分割出多个视频坑位,每个视频坑位中需要填入一个视频片段。其中,可以根据模板的模板音乐对模板进行分割,还可以根据分割出的多个视频片段对模板进行分割。In one embodiment, the template may be divided into multiple video pits, and each video pit needs to be filled with one video segment. The template may be segmented according to the template music of the template, and the template may also be segmented according to a plurality of segmented video clips.
例如,可以将模板的模板音乐按照一定的时间间隔分为多段,每段音乐对应一个视频坑位,该段音乐的时长与视频坑位的时长相等。For example, the template music of the template can be divided into multiple segments according to a certain time interval, each segment of music corresponds to a video pit, and the duration of the music is equal to the duration of the video pit.
再例如,可以根据模板的模板音乐的音乐结构,按照主歌、副歌、过渡句、流行句、间奏等分割为多段,每段音乐对应一个视频坑位,该段音乐的时长与视频坑位的时长相等。Another example, according to the music structure of the template music of the template, according to the main song, the chorus, the transition sentence, the popular sentence, the interlude, etc., it is divided into multiple sections, each piece of music corresponds to a video pit, and the duration of the music is the same as the video pit. The duration of the bits is equal.
再例如,可以根据模板的模板音乐的节奏分为多段,每段音乐对应一个视频坑位,该段音乐的时长与视频坑位的时长相等。For another example, the template music can be divided into multiple segments according to the rhythm of the template, each segment of music corresponds to a video pit, and the duration of the music is equal to the duration of the video pit.
Specifically, since segmenting the video material to be processed may produce video segments of different durations, the template can be divided according to the durations of the video segments, so that each resulting video pit has exactly the duration of a video segment.
For example, if a 30-second piece of video material to be processed is divided into three video segments of 5 seconds, 15 seconds and 10 seconds, the template can be divided into corresponding video pits according to these three durations, so that the video segments can be filled into the corresponding pits.
When dividing the template according to segment durations, since there are multiple video segments, the division may also take the highlight level of the video segments into account.
For example, a template should contain at least one video segment of the highest highlight level, so at least one video pit whose duration equals that of the highest-rated segment can be reserved in the template for that segment.
In another embodiment, the template may be divided according to its camera movement, shot size, scene and the like.
In an embodiment, determining the video segment corresponding to the video pit according to the pit information of the video pit in the template includes: determining the video segment corresponding to the video pit according to the pit music of the video pit in the template.
The pit music of a video pit has its own sense of rhythm; for example, it may be a soothing rhythm or a bursting rhythm. A suitable video segment can therefore be matched to the video pit according to the rhythm of the pit music and the video content of the segments.
For example, if a passage of music is explosive and a video segment contains a burst of power such as a fountain, the two match well. A soothing rhythm suits slow-motion footage of people, while epic music suits time-lapse shots of large scenes.
In an embodiment, determining the video segment corresponding to the video pit according to the pit music of the video pit in the template includes: determining a degree of matching between the pit music of the video pit in the template and a video segment, and determining the video segment corresponding to the video pit according to the degree of matching.
The rhythm of the pit music is matched against the video segments to obtain a degree of matching between the pit music and each segment, and the video segment corresponding to the video pit in the template is determined accordingly.
Specifically, a matching threshold may be set; when the degree of matching exceeds the threshold, the video pit is considered to match the video segment. Alternatively, the video segment with the highest degree of matching with the video pit may be selected as the segment corresponding to that pit.
In an embodiment, the degree of matching between the pit music of the video pit in the template and a video segment is obtained with a pre-trained music matching model, which outputs a matching score between the pit music and the video segment.
A neural network is trained to learn the degree of matching between pit music and video segments, and the trained network is then used as the music matching model to output the matching score between the pit music of a video pit and a video segment.
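The text does not specify the architecture of the music matching model, so the sketch below assumes a simple two-tower network over pre-extracted audio and video features; the feature dimensions, layer sizes and selection rule are illustrative assumptions rather than the patented design.

```python
# Hedged sketch of a music matching model: a small two-tower network that
# scores how well a video segment fits a piece of pit music.
import torch
import torch.nn as nn

class MusicMatchModel(nn.Module):
    def __init__(self, audio_dim=128, video_dim=256, hidden=64):
        super().__init__()
        self.audio_tower = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.video_tower = nn.Sequential(nn.Linear(video_dim, hidden), nn.ReLU())
        self.head = nn.Linear(hidden * 2, 1)

    def forward(self, audio_feat, video_feat):
        joint = torch.cat([self.audio_tower(audio_feat), self.video_tower(video_feat)], dim=-1)
        return torch.sigmoid(self.head(joint)).squeeze(-1)  # match score in (0, 1)

# Selecting a segment for one pit: keep the clip with the highest score
# (or only clips whose score exceeds a preset threshold).
model = MusicMatchModel()
pit_audio = torch.randn(1, 128)        # features of the pit music (assumed precomputed)
clip_feats = torch.randn(5, 256)       # features of 5 candidate clips
scores = model(pit_audio.expand(5, -1), clip_feats)
best_clip = int(torch.argmax(scores))
```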
In an embodiment, after the video segments corresponding to the video pits are determined according to the pit information of the video pits in the template, the shooting quality of the plurality of video segments corresponding to a video pit is determined; the optimal video segment corresponding to the video pit is determined according to the shooting quality of those segments; and the matching relationship corresponding to the template is obtained according to the optimal video segment of each pit.
The shooting quality of a video segment is determined according to the image content of the segment and an evaluation of the segment. The image content includes lens stability, color saturation, whether there is a main subject, the amount of information in the shot, and so on; the evaluation includes an aesthetic score of the segment.
Specifically, the aesthetic score may take factors such as color, composition, camera movement and shot size into account. The shooting quality of a video segment is scored from its image content and its evaluation; the higher the score, the higher the shooting quality.
According to the shooting quality of the multiple video segments, the segment with the highest shooting quality is selected as the optimal segment for the video pit in the template, and the matching relationship corresponding to the template is obtained accordingly.
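As an illustration of this selection step, the sketch below combines the listed image-content factors and the aesthetic score into one shooting-quality score and keeps the best candidate per pit; the sub-scores and weights are assumptions, since the text only names the factors, not a formula.

```python
# Illustrative sketch: score shooting quality from image-content metrics and an
# aesthetic score, then keep the best candidate clip for a pit.
from dataclasses import dataclass

@dataclass
class ClipMetrics:
    stability: float      # 0..1, lens stability
    saturation: float     # 0..1, color saturation
    has_subject: float    # 0 or 1, main subject detected
    information: float    # 0..1, amount of information in the shot
    aesthetic: float      # 0..1, aesthetic score (color, composition, movement, shot size)

def shooting_quality(m: ClipMetrics) -> float:
    content = 0.25 * (m.stability + m.saturation + m.has_subject + m.information)
    return 0.6 * content + 0.4 * m.aesthetic   # weights are illustrative

def best_clip_for_pit(candidates: dict[str, ClipMetrics]) -> str:
    return max(candidates, key=lambda name: shooting_quality(candidates[name]))
```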
In an embodiment, after the video segments corresponding to the video pits are determined according to the pit information of the video pits in the template, the degree of matching between the video segments corresponding to two adjacent video pits in the template is determined; the optimal video segment corresponding to a video pit is determined according to this degree of matching; and the matching relationship corresponding to the template is obtained according to the optimal video segments of the pits.
The degree of matching between the video segments of two adjacent video pits is determined according to the continuity of the camera movement direction of the segments, the progressive increase or decrease of shot size, and match cuts.
Continuity of camera movement direction means ensuring that the segments in two adjacent pits move the camera in the same direction, so that segments with opposite camera movements are not joined together. The progressive relation of shot size includes, for example, going from a long shot to a medium shot to a close shot, from a close shot to a medium shot to a long shot, or directly from a long shot to a close shot. A match cut joins two shots through similar motion, graphics, colors and the like, achieving a coherent, fluid narrative for the transition between the two video segments.
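A possible scoring of this adjacency, under the assumption that direction continuity, shot-size progression and match-cut similarity are simply weighted and summed, is sketched below; the scales and weights are illustrative, not taken from the text.

```python
# Illustrative sketch of an adjacency score between the clips of two neighbouring
# pits: reward same camera-movement direction, gentle shot-size progression and a
# match-cut similarity term.
SHOT_ORDER = {"long": 0, "medium": 1, "close": 2}

def adjacency_score(dir_a: str, dir_b: str, shot_a: str, shot_b: str, cut_similarity: float) -> float:
    direction = 1.0 if dir_a == dir_b else 0.0                      # same camera movement direction
    step = abs(SHOT_ORDER[shot_a] - SHOT_ORDER[shot_b])
    progression = 1.0 - 0.5 * step                                  # smaller shot-size jumps score higher
    return 0.4 * direction + 0.3 * progression + 0.3 * cut_similarity

print(adjacency_score("left", "left", "long", "medium", cut_similarity=0.8))
```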
In an embodiment, determining the video segment corresponding to the video pit according to the pit information of the video pit in the template includes: determining the video segment corresponding to the video pit according to the pit label of the video pit in the template.
Each video pit in the template is provided with a pit label. Labels can be matched between the pit labels of the video pits and the labels of the video segments, and a successfully matched segment is taken as the segment corresponding to that pit.
In an embodiment, determining the video segment corresponding to the video pit according to the pit label includes: determining the video labels of the video segments, and taking the segment whose video label matches the pit label of the video pit as the segment to be filled into that pit.
Label extraction is performed on the video segments to determine their video labels, and the segment corresponding to a video pit is then determined according to the pit label of that pit in the template.
S103: Fill the video segments into the corresponding video pits of the template according to the matching relationship corresponding to the template to obtain a recommended video.
After the matching relationship is obtained, the video segments are filled into the corresponding video pits of the template according to that relationship, video synthesis is performed to obtain the recommended video, and the recommended video is recommended to the user.
In an embodiment, step S103 specifically includes: determining whether the video duration of a video segment is greater than the duration of its video pit; if so, performing segment extraction on the video segment to obtain a selected segment.
When filling a video segment into the corresponding video pit according to the matching relationship, it is determined whether the duration of the segment exceeds the duration of the pit. The duration of a video pit determines the maximum duration of the segment that can be filled into it; therefore, when the segment is longer than the pit, it cannot be filled in directly, and a sub-segment of the appropriate duration must be extracted from it and filled into the pit.
The video duration of the selected segment is less than or equal to the duration of the video pit. In practice, to fill the selected segment into the pit while keeping the recommended video complete, the selected segment may be made exactly as long as the pit.
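One simple way to realize this extraction, assuming a per-second "interest" score for the clip (smiles, laughter, motion, aesthetics) has been computed elsewhere, is to slide a window of the pit's duration over the clip and keep the best-scoring window; the function and its inputs are illustrative.

```python
# Illustrative sketch: when a clip is longer than its pit, keep the window of the
# pit's duration with the highest total interest score.
def extract_selected_segment(interest_per_second: list[float], pit_seconds: int) -> tuple[int, int]:
    """Return (start, end) in seconds of the best window of length pit_seconds."""
    n = len(interest_per_second)
    if n <= pit_seconds:
        return 0, n
    best_start = 0
    best_sum = window = sum(interest_per_second[:pit_seconds])
    for start in range(1, n - pit_seconds + 1):
        window += interest_per_second[start + pit_seconds - 1] - interest_per_second[start - 1]
        if window > best_sum:
            best_start, best_sum = start, window
    return best_start, best_start + pit_seconds

print(extract_selected_segment([0.1, 0.9, 0.8, 0.2, 0.3], pit_seconds=2))  # -> (1, 3)
```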
In an embodiment, performing segment extraction on the video segment to obtain the selected segment includes: performing segment extraction on the video segment according to video elements of the segment to obtain the selected segment.
When segment extraction is performed, it can be guided by the video elements of the segment.
The video elements include at least one of smiling faces, laughter in the audio, character actions, clear human voices, picture composition and aesthetic score. When extracting the selected segment, the more engaging parts of the video segment can be extracted according to these elements, for example a part containing a smiling face or a part with a high aesthetic score.
In an embodiment, step S103 specifically includes: filling the video segments into the corresponding video pits of the template according to the matching relationship corresponding to the template to obtain an initial video, and performing image optimization on the initial video based on the template requirements of the template to obtain the recommended video.
After the video segments are filled into the corresponding video pits according to the matching relationship, the initial video is obtained; image optimization is then performed on the initial video according to the template requirements, and the optimized video is recommended to the user as the recommended video. The template requirements include at least one of transition settings, acceleration and deceleration settings, and sticker or special-effect settings.
In an embodiment, since the video material to be processed may come from aerial footage shot by a UAV, where the camera is far from the subject and the picture changes little, the speed of the picture change may first be identified automatically when filling the aerial footage into the corresponding video pit; the aerial footage is then automatically speed-adjusted according to that speed, and the adjusted footage is filled into the pit. The speed of the picture change can be obtained by analysing a number of consecutive frames within a preset time.
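A minimal sketch of estimating the picture-change speed from consecutive grayscale frames and mapping it to a speed-up factor follows; the thresholds and factors are assumptions, not values given in the text.

```python
# Illustrative sketch: estimate how fast the picture changes in aerial footage
# from mean absolute frame differences, then choose a speed-up factor.
import numpy as np

def picture_change_speed(gray_frames: list[np.ndarray]) -> float:
    """Mean absolute difference between consecutive grayscale frames (0..255)."""
    diffs = [np.mean(np.abs(a.astype(np.int16) - b.astype(np.int16)))
             for a, b in zip(gray_frames, gray_frames[1:])]
    return float(np.mean(diffs)) if diffs else 0.0

def speed_factor(change_speed: float) -> float:
    # Slowly changing aerial shots are sped up more aggressively.
    if change_speed < 2.0:
        return 4.0
    if change_speed < 8.0:
        return 2.0
    return 1.0
```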
In an embodiment, when determining the matching relationship between the video pits of the template and the video segments, identified aerial footage may be placed in the first few and/or last few video pits of the template, thereby improving the quality of the resulting recommended video.
In the video processing method provided by the above embodiment, the video material to be processed is segmented into a plurality of video segments; the video segments to be filled into each video pit of the template are then determined according to the pit information of the video pits, yielding the matching relationship corresponding to the template; finally, the video segments are filled into the corresponding pits according to that relationship to obtain the recommended video. By dividing long video material into multiple shorter segments, the segments can be filled into the video pits smoothly and synthesized into the recommended video.
Please refer to FIG. 6, which is a flowchart of the steps of another video processing method provided by an embodiment of the present application.
Specifically, as shown in FIG. 6, the video processing method includes steps S201 to S203.
S201: Construct a flow network graph according to the video segments of the video material to be processed and the video pits of the template.
The video material to be processed includes a plurality of video segments, the template includes at least one video pit, and each video pit is to be filled with one video segment.
In an embodiment, the video segmentation method described above may also be used to segment the video material to be processed into a plurality of video segments.
The flow network graph includes a plurality of nodes, and each node corresponds to a matching relationship between one video segment and one video pit.
A flow network graph is constructed from the video segments and the template pits, as shown in FIG. 7, which is a schematic diagram of the constructed graph. The left vertical axis C_m represents the video segments and the upper horizontal axis S_n represents the video pits of the template. A source node and a sink are added to the graph, where node S is the source and node T is the sink. In the graph, the coordinates (x, y) denote the node at position x on the horizontal axis and y on the vertical axis; for example, N(1,1) is the node at the upper left (S_1, C_1) and N(n,m) is the node at the lower right (S_n, C_m).
It should be noted that there may be one template or multiple templates. When there is one template, one flow network graph is constructed; when there are multiple templates, one flow network graph is constructed for each template.
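A minimal sketch of how such a pit-by-clip graph could be represented is shown below, with a source S before the first pit and a sink T after the last; the data structures, and the simplifying rule that adjacent pits use different clips, are illustrative assumptions.

```python
# Illustrative sketch of the flow network graph: one node per (pit, clip) pair,
# plus source "S" and sink "T", with edges from each node of pit i to the nodes
# of pit i + 1.
from itertools import product

def build_flow_graph(num_pits: int, num_clips: int):
    nodes = list(product(range(num_pits), range(num_clips)))   # (pit index, clip index)
    edges = []
    for clip in range(num_clips):
        edges.append(("S", (0, clip)))                          # source to every first-pit node
        edges.append(((num_pits - 1, clip), "T"))               # every last-pit node to sink
    for pit in range(num_pits - 1):
        for a, b in product(range(num_clips), range(num_clips)):
            if a != b:                                          # avoid repeating a clip in adjacent pits
                edges.append(((pit, a), (pit + 1, b)))
    return nodes, edges

nodes, edges = build_flow_graph(num_pits=3, num_clips=4)
```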
S202: Determine the matching relationship between the video segments and the video pits based on the flow network graph.
After the flow network graph is drawn, the matching relationship between the video segments and the video pits of the template can be determined from its nodes. Taking the source node of the graph as the start of a path, the path passes through exactly one node for each video pit and ends at the sink of the graph; this path gives the matching relationship between the video segments and the video pits of the template.
In an embodiment, determining the matching relationship between the video segments and the video pits based on the flow network graph includes: matching suitable video segments to the video pits of the template based on a maximum flow algorithm to obtain an optimal path, where the correspondence between video segments and video pits on the optimal path is taken as the matching relationship between the video pits of the template and the video segments.
Please refer to FIG. 7; the arrows in FIG. 7 represent paths between adjacent nodes in the flow network graph. A path from the source node S to the sink T corresponds to one matching relationship between video segments and video pits.
Here, the maximum flow algorithm means that, for a template, the n video pits S_1 to S_n are filled with suitable video segments in turn so that the total energy of the entire path from the source node S to the sink T is maximized. Matching suitable video segments to the video pits of the template with the maximum flow algorithm turns the segment selection problem into the problem of finding the maximum energy from source S to sink T.
In an embodiment, matching suitable video segments to the video pits of the template based on the maximum flow algorithm to obtain the optimal path includes: determining the optimal path corresponding to the template according to the energy values between adjacent nodes in the flow network graph.
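The text frames the search as a maximum-flow / maximum-energy-path problem. As one simple way to realize the path search for a single template, the sketch below uses dynamic programming over the pit-by-clip grid; this is an illustrative simplification and not necessarily the algorithm used in the patent.

```python
# Illustrative sketch: find the path of maximum total energy from source to sink.
# energy[(pit, a, b)] is the energy of the edge from node (pit, clip a) to node
# (pit + 1, clip b).
def best_path(num_pits, num_clips, energy):
    best = {c: 0.0 for c in range(num_clips)}        # best energy ending at (pit 0, clip c)
    parent = [{} for _ in range(num_pits)]
    for pit in range(num_pits - 1):
        nxt = {}
        for b in range(num_clips):
            choices = [(best[a] + energy.get((pit, a, b), 0.0), a)
                       for a in range(num_clips) if a != b]
            nxt[b], parent[pit + 1][b] = max(choices)
        best = nxt
    clip = max(best, key=best.get)                   # backtrack from the best final clip
    path = [clip]
    for pit in range(num_pits - 1, 0, -1):
        clip = parent[pit][clip]
        path.append(clip)
    return list(reversed(path))                      # clip index chosen for each pit
```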
In an embodiment, the energy value between two adjacent nodes may be determined according to the energy value influence factors of each node.
The energy value influence factors include at least one of: the shooting quality of the video segment corresponding to each video pit, the degree of matching between each video pit and its corresponding video segment, and the degree of matching between the video segments of two adjacent video pits.
Specifically, the shooting quality of the video segment corresponding to each video pit is determined according to the image content of the segment and an evaluation of the segment.
The image content includes lens stability, color saturation, whether there is a main subject, the amount of information in the shot, and so on; the evaluation includes an aesthetic score of the segment, which may take into account color, composition, camera movement and shot size. The shooting quality is scored from the image content and the evaluation; the higher the score, the higher the shooting quality.
Specifically, the degree of matching between each video pit and its corresponding video segment is determined according to the degree of matching between the pit music of that pit and the segment.
Template music is preset in the template, and the pit music of a video pit may be a piece of the template music of the whole template. Since the template includes a plurality of video pits combined in sequence, the template music can be split according to the order and duration of the video pits to obtain the pit music corresponding to each pit.
Music has its own sense of rhythm, and so does the piece of pit music of each video pit in the template, for example a soothing rhythm or a bursting rhythm. Template pits and video segments can therefore be matched according to the rhythm of the pit music and the video content of the segments.
For example, if a passage of music is explosive and a video segment contains a burst of power such as a fountain, the two match well. A soothing rhythm suits slow-motion footage of people, while epic music suits time-lapse shots of large scenes.
In an embodiment, the degree of matching between each video pit and its corresponding video segment is obtained with a pre-trained music matching model that outputs a matching score between the pit music of the video pit and the video segment.
A neural network may be trained to learn the degree of matching between pit music and video segments, and the trained network is then used as the music matching model to output the matching score between the pit music of a video pit and a video segment.
The degree of matching between the video segments of two adjacent video pits is determined according to the continuity of the camera movement direction of the segments, the progressive increase or decrease of shot size, and match cuts.
Continuity of camera movement direction means ensuring that the segments in two adjacent pits move the camera in the same direction, so that segments with opposite camera movements are not joined together. The progressive relation of shot size includes, for example, going from a long shot to a medium shot to a close shot, from a close shot to a medium shot to a long shot, or directly from a long shot to a close shot. A match cut joins two shots through similar motion, graphics, colors and the like, achieving a coherent, fluid narrative for the transition between the two video segments.
In an embodiment, the degree of matching between the video segments of two adjacent video pits is obtained with a pre-trained segment matching model that outputs the degree of matching of the video segments filled into the two adjacent pits.
A neural network may be trained to learn the degree of matching between the video segments of two adjacent video pits, and the trained network is then used as the segment matching model to output the matching score between the segments of the two adjacent pits.
In an embodiment, determining the energy value between two adjacent nodes according to the energy value influence factors of each node includes: obtaining the evaluation scores and preset weights of the energy value influence factors, and determining the energy value between the two adjacent nodes according to those scores and weights.
For each node in the flow network graph, the evaluation scores of its energy value influence factors and the corresponding preset weights are obtained, and the energy value between two adjacent nodes is determined from them. The preset weights are the weights of the different influence factors and may be set in advance based on empirical values.
For example, the energy value V of any path between two nodes in the flow network graph may be computed as:
V = a * E_clip + b * E_template + c * E_match
where E_clip is the shooting-quality score of the video segment corresponding to the video pit and a is its preset weight; E_template is the degree of matching between the video pit and its corresponding segment and b is its preset weight; and E_match is the sum of the degrees of matching between the video segments of pairwise adjacent video pits and c is its preset weight.
It should be noted that, when computing the energy value between two adjacent nodes, if the duration of a video segment is less than the duration of the pit it is filled into, the energy value of that node is set to 0.
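A small sketch of this edge-energy computation, including the zero rule for clips shorter than their pit, is given below; the weight values are assumptions, since the text states only that the weights are preset empirically.

```python
# Illustrative sketch of the edge energy V = a*E_clip + b*E_template + c*E_match.
def edge_energy(e_clip, e_template, e_match, clip_seconds, pit_seconds,
                a=0.4, b=0.3, c=0.3):
    if clip_seconds < pit_seconds:        # clip cannot fill the pit completely
        return 0.0
    return a * e_clip + b * e_template + c * e_match

print(edge_energy(e_clip=0.8, e_template=0.7, e_match=0.9, clip_seconds=12, pit_seconds=10))
```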
S203: Fill the video segments into the corresponding video pits of the template according to the matching relationship to obtain a recommended video.
After the matching relationship between the video segments and the video pits is determined, the segments are filled into the corresponding pits according to that relationship, the recommended video is synthesized, and the recommended video is recommended to the user.
In an embodiment, step S203 specifically includes: determining whether the video duration of a video segment is greater than the duration of its video pit; if so, performing segment extraction on the video segment to obtain a selected segment.
When filling a video segment into the corresponding video pit according to the matching relationship, it is determined whether the duration of the segment exceeds the duration of the pit. The duration of a video pit determines the maximum duration of the segment that can be filled into it; therefore, when the segment is longer than the pit, it cannot be filled in directly, and a sub-segment of the appropriate duration must be extracted from it and filled into the pit.
The video duration of the selected segment is less than or equal to the duration of the video pit. In practice, to fill the selected segment into the pit while keeping the recommended video complete, the selected segment may be made exactly as long as the pit.
In an embodiment, performing segment extraction on the video segment to obtain the selected segment includes: performing segment extraction on the video segment according to video elements of the segment to obtain the selected segment.
When segment extraction is performed, it can be guided by the video elements of the segment.
The video elements include at least one of smiling faces, laughter in the audio, character actions, clear human voices, picture composition and aesthetic score. When extracting the selected segment, the more engaging parts of the video segment can be extracted according to these elements, for example a part containing a smiling face or a part with a high aesthetic score.
In an embodiment, step S203 specifically includes: filling the video segments into the corresponding video pits of the template according to the matching relationship to obtain an initial video, and performing image optimization on the initial video based on the template requirements of the template to obtain the recommended video.
After the video segments are filled into the corresponding pits according to the matching relationship, the initial video is obtained; image optimization is then performed on the initial video according to the template requirements, and the optimized video is recommended to the user as the recommended video. The template requirements include at least one of transition settings, acceleration and deceleration settings, and sticker or special-effect settings.
In an embodiment, since the video material to be processed may come from aerial footage shot by a UAV, where the camera is far from the subject and the picture changes little, the speed of the picture change may first be identified automatically when filling the aerial footage into the corresponding video pit; the aerial footage is then automatically speed-adjusted according to that speed, and the adjusted footage is filled into the pit. The speed of the picture change can be obtained by analysing a number of consecutive frames within a preset time.
In an embodiment, when determining the matching relationship between the video pits of the template and the video segments, identified aerial footage may be placed in the first few and/or last few video pits of the template, thereby improving the quality of the resulting recommended video.
The above embodiment provides a video processing method that constructs a flow network graph from the video segments and the template pits, determines the matching relationship between segments and pits based on the graph, and finally fills the segments into the corresponding pits of the template according to that relationship to obtain the recommended video. By determining the matching relationship through a flow network graph, the segment selection problem is modeled as a maximum flow problem, which improves both the accuracy and the convenience of determining the matching relationship between video segments and video pits.
Please refer to FIG. 8, which is a flowchart of the steps of another video processing method provided by an embodiment of the present application.
Specifically, as shown in FIG. 8, the video processing method includes steps S301 to S303.
S301: Determine a template according to the video information of the video material to be processed.
The template includes at least one video pit, and the video information of the video material to be processed includes its video content; the corresponding template is determined according to that video content. In practice, there may be one template or multiple templates.
Specifically, as shown in FIG. 9, the video material to be processed may come from multiple sources, for example material shot with a handheld device, material shot with a movable platform, material obtained from a cloud server, and material obtained from a local server.
The handheld device may be, for example, a mobile phone, a tablet or an action camera, and the movable platform may be, for example, a UAV. The UAV may be a rotary-wing UAV, such as a quadrotor, hexarotor or octorotor UAV, or a fixed-wing UAV, and carries a camera device.
By aggregating video material from different sources and editing it together in a mixed cut, the video diversity of the resulting recommended video is increased.
In an embodiment, since the video material to be processed may include aerial footage shot by a UAV, when determining the template according to the video information it may first be identified whether the material is aerial footage; when it is, a template with an aerial theme is matched to the material, completing the template determination.
In an embodiment, when determining the template according to the video information of the material to be processed, template features of the templates and video features of the material may be extracted, and the template is determined for the material by matching the template features against the video features.
The template features may include the template theme, camera movement direction, shot size, the size and position of the target object in a single video frame of the material to be filled into the template, the size and position of the target object across consecutive video frames of that material, the similarity between adjacent video frames of that material, and so on.
The video features of the material to be processed may include the video theme, camera movement direction, shot size, the size and position of the target object in a single video frame of the material, the size and position of the target object across consecutive video frames, the similarity between adjacent video frames, and so on.
For example, a feature matching model may be trained in advance to match template features against video features, and the template for the material to be processed is determined according to the matching result output by the model.
In an embodiment, referring to FIG. 10, step S301 includes steps S3011 and S3012.
S3011: Determine the video labels of the video material to be processed according to its video information.
Video label extraction is performed on the video material to be processed, and its video labels are determined according to its video information.
The video labels include at least one of: the camera movement direction, the shot size, the size and position of the target object in a single video frame of the material, the size and position of the target object across consecutive video frames of the material, and the similarity between adjacent video frames of the material.
Specifically, for the camera movement direction, changes of direction in the material can be judged with algorithms that detect camera movement. The shot size differs with the video content: for people, the shot sizes include full shot, medium shot, close shot and close-up; for objects, they include long shot and close shot. The size and position of the target object in a single video frame are determined with an object detection algorithm or a saliency detection algorithm, and the size and position of the target object across consecutive video frames are determined with a pre-trained neural network model.
S3012: Determine, according to the video labels of the video material to be processed, a plurality of templates matching those labels.
A template includes a template label, which may be preset when the template is created. Once the video labels of the material to be processed are obtained, they can be matched against the template labels to find templates for the material.
In an embodiment, referring to FIG. 11, the step of determining the matching templates according to the video labels includes steps S3012a and S3012b. S3012a: Determine the video theme of the video material to be processed according to its video labels. S3012b: Determine a plurality of templates matching that video theme.
The video theme of the material to be processed is determined from the extracted video labels. It can be determined from the labels of single and/or consecutive video frames; for example, if the target object in consecutive frames is a tower, the video theme of the material is determined to be travel.
The video theme may be a broad category or a subcategory within one. For example, the theme may be travel, food, family and so on; within the travel category, the theme may further be travel – natural scenery, travel – city, travel – cultural sites, and so on.
The template label may be a template theme. After the theme of the material to be processed is determined, it is matched against the template themes to determine a plurality of templates matching the video theme.
In an embodiment, if the video theme of the material to be processed cannot be determined, a preset template is selected as the template for the material.
The preset template may be a universal template, that is, a template applicable in a variety of scenarios. Being unable to determine the video theme may mean either that no clear theme was recognized or that no template corresponds to the theme of the material.
In an embodiment, determining the plurality of templates matching the video theme of the material to be processed includes: determining, from the templates matching the video theme, the template corresponding to the material according to the template influence factors of the templates.
After a coarse screening by video theme yields a plurality of matching templates, a second screening is performed according to the template influence factors to determine the template corresponding to the material; the second screening may return one template or several.
The template influence factors include at least one of music matching degree, template popularity and user preference.
Specifically, each template is preset with template music; the template music is matched against the material to be processed to determine a matching score between them. The music matching degree is obtained with a pre-trained music recommendation network model that outputs the matching scores between the template music of the templates matching the video theme and the material to be processed.
The template popularity is determined according to how often the templates matching the video theme are used and/or how many likes they receive; the usage frequency and/or likes of each template across all users can be collected, and the popularity determined from how all users choose templates.
The user preference is determined according to how often the user selects the templates matching the video theme and/or the user's satisfaction scores; after repeated use, the user's usage frequency and/or satisfaction score for each template is obtained from the user's habits, and the preference determined from it.
In an embodiment, determining the template corresponding to the material to be processed from the templates matching the video theme according to their template influence factors includes: obtaining the evaluation scores and preset weights of the template influence factors; determining the template scores of the templates matching the video theme according to those evaluation scores and preset weights; and determining the template corresponding to the material according to the template scores.
The evaluation scores of the template influence factors of each template and the corresponding preset weights are obtained, the template score of each template is determined from them, and the template corresponding to the material is chosen from the candidates according to the template scores. The preset weights are the weights of the different influence factors and may be set in advance based on empirical values.
For example, the template score of each template may be computed as:
M = A * E_music + B * E_template + C * E_user
where E_music is the music matching score and A is its preset weight, E_template is the template popularity and B is its preset weight, and E_user is the user preference and C is its preset weight.
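A small sketch of this scoring and of picking the highest-scoring candidate template follows; the weight values and template names are assumptions used only for illustration.

```python
# Illustrative sketch of the template score M = A*E_music + B*E_template + C*E_user.
def template_score(e_music, e_popularity, e_user, A=0.5, B=0.3, C=0.2):
    return A * e_music + B * e_popularity + C * e_user

candidates = {
    "travel_city":   template_score(0.9, 0.6, 0.4),
    "travel_nature": template_score(0.7, 0.8, 0.9),
}
best_template = max(candidates, key=candidates.get)
```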
In an embodiment, material selection is performed on the video material to be processed.
Specifically, the material that needs to be edited is selected from the video material to be processed, the template is determined according to the selected material, and the recommended video is generated from it.
In one implementation, performing material selection on the video material to be processed includes: performing material selection according to material parameters of the material, where the material parameters include at least one of shooting time, shooting location and shooting subject.
Specifically, when material is selected by material parameters, the parameters may be set by the user. For example, the user may want to select, from the material to be processed, footage shot within the three days from May 1 to May 3, footage shot between 6 p.m. and 10 p.m., footage shot in Xishuangbanna, or footage in which cats were filmed.
When material is selected by shooting time, the shooting time recorded by the handheld device or the movable platform at the time of shooting may be read, and the material selected accordingly. The approximate shooting period may also be inferred from the content of the footage, for example the ambient lighting or whether lights or billboards are lit, and the material selected on that basis.
When material is selected by shooting location, for example with the GPS positioning service enabled, the GPS information recorded at the time of shooting can be used to filter the material to be processed.
When material is selected by material parameters, the user may enter a time or place and the selection is made according to that input; alternatively, the user may pick one or more items of material, the correlation between the picked items is analysed, and the selection is made according to that correlation.
In one implementation, performing material selection on the video material to be processed includes: clustering the material according to its material parameters to realize material selection, where the clustering includes at least one of time clustering, location clustering and subject clustering.
When material is selected according to its material parameters, the selection may be performed by clustering, for example by at least one of time, location and subject clustering, which makes it convenient for the user to quickly select large batches of material and saves the time spent selecting. A sketch of time clustering is given below.
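The sketch below groups clips whose shooting times lie within a gap threshold of each other into one cluster; the timestamps and the gap value are assumptions chosen for illustration.

```python
# Illustrative sketch of time clustering for material selection.
from datetime import datetime, timedelta

def cluster_by_time(shoot_times: list[datetime], gap: timedelta = timedelta(hours=2)):
    """Group indices of clips into clusters of temporally close shots."""
    order = sorted(range(len(shoot_times)), key=lambda i: shoot_times[i])
    clusters, current = [], [order[0]] if order else []
    for prev, idx in zip(order, order[1:]):
        if shoot_times[idx] - shoot_times[prev] <= gap:
            current.append(idx)
        else:
            clusters.append(current)
            current = [idx]
    if current:
        clusters.append(current)
    return clusters

times = [datetime(2021, 5, 1, 18, 0), datetime(2021, 5, 1, 19, 0), datetime(2021, 5, 3, 9, 0)]
print(cluster_by_time(times))   # -> [[0, 1], [2]]
```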
In one implementation, performing material selection on the video material to be processed includes: performing material selection according to a selection operation of the user.
Material selection may also be made according to the user's own custom choice of the material to be edited; in practice, all of the material to be processed may be presented to the user, who then chooses from it the material to be edited.
In addition, after the material to be processed has been selected or clustered according to its material parameters, the result can be presented to the user, who then makes a further custom selection of the material to be edited from that result.
Furthermore, in one implementation, the video material may also be selected or clustered according to the user's historical selection preferences, providing the user with a personalized experience.
在一实施例中,还可以根据待处理视频素材的视频信息,对所述待处理视频素材进行分割生成多个视频片段。可例如采用上述实施例提供的视频分割方式。In an embodiment, the to-be-processed video material may also be segmented according to the video information of the to-be-processed video material to generate multiple video segments. For example, the video segmentation method provided by the above embodiments may be adopted.
In an embodiment, the image quality of the video material to be processed may also be acquired, and discards removed from the material according to that image quality.
The image quality includes at least one of picture shake, picture blur, overexposure, underexposure, no clear scene in the image, or no clear subject in the image. The video material to be processed is acquired and its video images are quality-checked; if at least one of the above defects is found, that part is regarded as a discard and the corresponding material is removed.
It should be noted that, for the video segments within the material to be processed, discards may likewise be removed according to the image quality of the segments.
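A minimal sketch of such quality-based filtering is given below, assuming OpenCV is used to sample frames and that a Laplacian-variance blur measure together with mean brightness stands in for the full set of checks (shake, blur, over/under-exposure, missing scene or subject); the thresholds are hypothetical.

```python
import cv2
import numpy as np

def is_discard(video_path, blur_thresh=50.0, dark_thresh=30.0,
               bright_thresh=225.0, sample_every=30):
    """Return True if the sampled frames look blurred, underexposed,
    or overexposed; such material is treated as a discard."""
    cap = cv2.VideoCapture(video_path)
    blur_scores, brightness = [], []
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            blur_scores.append(cv2.Laplacian(gray, cv2.CV_64F).var())
            brightness.append(gray.mean())
        idx += 1
    cap.release()
    if not blur_scores:
        return True
    mean_blur = float(np.mean(blur_scores))
    mean_bright = float(np.mean(brightness))
    return (mean_blur < blur_thresh
            or mean_bright < dark_thresh
            or mean_bright > bright_thresh)
```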
In an embodiment, deduplication may also be performed on the video material to be processed, where the deduplication includes clustering of similar material.
When shooting video material, the same scene is usually captured several times in order to obtain a satisfactory take, so the material to be processed contains many duplicates. Similar-material clustering groups similar clips into one class, and from each class the clip with the longest duration is selected, or the clip with the best image quality is selected, for example one with a sharp image and high color saturation.
It should be noted that, for the video segments within the material to be processed, deduplication may likewise be performed on the segments.
S302. Determine the video segment corresponding to each video pit according to the pit information of the video pits in the template, obtaining the matching relationship corresponding to the template.
The video segment is a segment of the video material to be processed. After the template corresponding to the material has been determined, and since the template includes at least one video pit, the video segment corresponding to each video pit can be determined according to the pit information of the pits in the template, thereby obtaining the matching relationship corresponding to the template.
Specifically, the pit information includes at least one of pit music and a pit tag. In some implementations the pit tag is preset; each video pit may be preconfigured with one pit tag.
The pit music of a video pit may be a fragment of the template music of the whole template. Since the template includes multiple video pits that are combined in sequence into one template, the template music can be split according to the order and duration of the video pits, yielding the pit music corresponding to each pit.
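The splitting of the template music by pit order and duration can be pictured with the short sketch below, which only computes the (start, end) time range of the music assigned to each pit; the list of pit durations is assumed as input.

```python
def split_template_music(pit_durations):
    """Given the ordered pit durations (seconds) of a template, return the
    (start, end) time range of the template music assigned to each pit."""
    ranges, cursor = [], 0.0
    for d in pit_durations:
        ranges.append((cursor, cursor + d))
        cursor += d
    return ranges

# Example: a template with three pits of 2.5 s, 4 s and 3 s
# yields [(0.0, 2.5), (2.5, 6.5), (6.5, 9.5)].
```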
In an embodiment, determining the video segment corresponding to the video pit according to the pit information of the pits in the template includes: determining the video segment corresponding to the pit according to the pit music of the video pit in the template.
Since the pit music of a video pit itself carries a sense of rhythm, for example a soothing rhythm or a bursting rhythm, a suitable video segment can be matched to the pit according to the rhythm of the pit music and the video content of the segment.
For example, if the music passage is explosive and the segment contains a burst of power such as a fountain, the two match well. A soothing rhythm suits slow-motion footage of people, and an epic, blockbuster style suits large-scene time-lapse photography.
In an embodiment, determining the video segment corresponding to the video pit according to its pit music includes: determining the degree of matching between the pit music of the video pit in the template and the video segments, and determining the segment corresponding to the pit according to that matching degree.
The rhythm of the pit music is matched against the video segments to determine the degree of matching between the pit music and each segment, and the segment corresponding to the pit in the template is determined according to the matching degree.
Specifically, a matching-degree threshold may be set, and when the matching degree exceeds the threshold the video pit is considered to match the segment; alternatively, the segment with the highest matching degree for the pit may be selected as the segment corresponding to the pit.
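One way to read this selection step is the small helper below, which takes a scoring callback for how well a segment fits a pit's music and either ignores segments at or below a threshold or simply returns the best-scoring one; the score_fn signature is an assumption standing in for the music matching model described next.

```python
def pick_clip_for_pit(pit, clips, score_fn, threshold=None):
    """Return the clip whose pit-music match score is highest; if a
    threshold is given, clips scoring at or below it are ignored."""
    best_clip, best_score = None, float("-inf")
    for clip in clips:
        score = score_fn(pit, clip)       # e.g. output of a music matching model
        if threshold is not None and score <= threshold:
            continue
        if score > best_score:
            best_clip, best_score = clip, score
    return best_clip
```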
In an embodiment, the matching degree between the pit music of the video pits in the template and the video segments is obtained with a pre-trained music matching model, which can output a matching score between the pit music of a video pit in the template and a video segment.
A neural network is trained to learn the matching degree between pit music and video segments, and the trained network is then used as the music matching model to output the matching score between the pit music of a video pit and a video segment.
In an embodiment, after the video segments corresponding to a video pit have been determined according to the pit information of the pits in the template, the shooting quality of the multiple segments corresponding to the pit is determined, the optimal segment for the pit is determined according to the shooting quality of the segments, and the matching relationship corresponding to the template is obtained according to the optimal segment of each pit.
The shooting quality of a video segment is determined according to its image content and a segment evaluation. The image content includes lens stability, color saturation, whether there is a main subject, the amount of information in the shot, and so on; the segment evaluation includes an aesthetic score of the segment.
Specifically, the aesthetic score of a segment may take factors such as color, composition, camera movement, and shot scale into account. The shooting quality is scored based on the segment's image content and evaluation; the higher the score, the higher the shooting quality of the segment.
According to the shooting quality of the multiple segments, the segment with the highest shooting quality is selected as the optimal segment for the video pit in the template, thereby obtaining the matching relationship corresponding to the template.
In an embodiment, after the video segments corresponding to the video pits have been determined according to the pit information, the matching degree between the segments corresponding to two adjacent pits in the template is determined, the optimal segment for each pit is determined according to that matching degree, and the matching relationship corresponding to the template is obtained according to the optimal segments of the pits.
The matching degree between the segments corresponding to two adjacent video pits is determined according to the continuity of camera movement direction, the increasing or decreasing progression of shot scale, and match cutting.
Continuity of camera movement direction means ensuring that the segments of two adjacent pits move the camera in the same direction, so that segments with opposite camera movements are not joined together. The increasing or decreasing progression of shot scale includes, for example, going from long shot to medium shot to close-up, from close-up to medium shot to long shot, or directly from long shot to close-up, and so on. Match cutting joins two shots by matching similar motion, graphics, color, and the like, to achieve a coherent, fluid narrative, and is used for the transition between two segments.
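A toy compatibility check along these lines is sketched below, assuming each segment has been annotated with a camera-movement direction and a shot-scale rank (for example long = 0, medium = 1, close = 2); both the annotations and the scoring weights are hypothetical illustrations rather than values from the disclosure.

```python
def adjacency_score(prev_clip, next_clip):
    """Score how well two segments sit next to each other: reward the same
    camera-movement direction and a gradual change of shot scale."""
    score = 0.0
    if prev_clip["move_dir"] == next_clip["move_dir"]:
        score += 1.0                      # coherent camera movement
    if abs(prev_clip["scale"] - next_clip["scale"]) <= 1:
        score += 0.5                      # gradual shot-scale progression
    return score

# Example annotations (hypothetical):
# {"move_dir": "left", "scale": 0}  long shot panning left
# {"move_dir": "left", "scale": 1}  medium shot panning left -> score 1.5
```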
In an embodiment, determining the video segment corresponding to the video pit according to the pit information includes: determining the video segment corresponding to the pit according to the pit tag of the video pit in the template.
Each video pit in the template is provided with a pit tag; tag matching can be performed between the pit tags of the pits in the template and the tags of the video segments, and a successfully matched segment is taken as the segment corresponding to the pit.
In an embodiment, determining the video segment corresponding to the pit according to its pit tag includes: determining the video tag of each segment, and taking the segment whose video tag matches the pit tag of the video pit as the segment to be filled into that pit.
Tag extraction is performed on the video segments to determine their video tags, and the segment corresponding to each video pit is then determined according to the pit tag of the pit in the template.
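A bare-bones version of this tag matching could look like the sketch below; the string tags, the identifiers, and the first-match policy are illustrative assumptions (a later embodiment also ranks several matching segments by grade).

```python
def match_by_tag(pits, clips):
    """Assign to each pit the first unused clip whose video tag equals
    the pit tag; returns a pit -> clip mapping (the matching relationship)."""
    used = set()
    mapping = {}
    for pit in pits:                       # pits carry a preset "tag" field
        for i, clip in enumerate(clips):   # clips carry an extracted "tag" field
            if i not in used and clip["tag"] == pit["tag"]:
                mapping[pit["id"]] = clip
                used.add(i)
                break
    return mapping
```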
S303. Fill the video segments into the corresponding video pits of the template according to the matching relationship corresponding to the template, obtaining a recommended video.
After the matching relationship is obtained, the video segments are filled into the corresponding video pits of the template according to that relationship and video synthesis is performed to obtain the recommended video, which is then recommended to the user.
In an embodiment, referring to Fig. 12, the step of filling the video segments into the corresponding video pits specifically includes steps S3031 to S3032.
S3031. Determine whether the video duration of the segment is greater than the duration of the video pit.
When a segment is filled into its corresponding pit according to the matching relationship, it is determined whether the duration of the segment is greater than the duration of the pit. The duration of the pit determines the maximum duration of the segment that can be filled into it; therefore, when the segment's duration is greater than the pit's duration, the segment cannot be filled in directly, and a portion of the corresponding duration has to be extracted from the segment and filled into the pit.
S3032. If the video duration of the segment is greater than the duration of the video pit, perform segment extraction on the segment to obtain a selected segment.
The video duration of the selected segment is less than or equal to the duration of the pit. In a specific implementation, in order to fill the selected segment into the corresponding pit while keeping the resulting recommended video complete, the duration of the selected segment may be made equal to the duration of the pit.
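The duration check and extraction can be pictured as in the sketch below, which slides a window of the pit's duration over a longer segment and keeps the best-scoring sub-range; the highlight-scoring callback is an assumption standing in for the video-element analysis described next.

```python
def fit_clip_to_pit(clip_duration, pit_duration, highlight_score, step=0.5):
    """If the clip is longer than the pit, slide a window of the pit's
    duration over the clip and return the (start, end) range with the
    highest highlight score; otherwise use the whole clip."""
    if clip_duration <= pit_duration:
        return 0.0, clip_duration
    best_start, best_score = 0.0, float("-inf")
    start = 0.0
    while start + pit_duration <= clip_duration:
        score = highlight_score(start, start + pit_duration)
        if score > best_score:
            best_start, best_score = start, score
        start += step
    return best_start, best_start + pit_duration
```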
In an embodiment, performing segment extraction on the video segment to obtain the selected segment includes: performing segment extraction on the segment according to the video elements of the segment to obtain the selected segment.
When extracting from a video segment, the extraction may be performed according to the video elements of the segment to obtain the selected segment.
The video elements include at least one of a smiling-face picture, laughter audio, character actions, clear human voice, picture composition, and an aesthetic score. When extracting the selected segment, the more highlight-worthy portion of the segment can be taken according to the video elements, for example a portion containing a smiling face or a portion with a high aesthetic score.
In an embodiment, step S303 includes: filling the video segments into the corresponding video pits of the template according to the matching relationship corresponding to the template to obtain an initial video; and performing image optimization on the initial video based on the template requirements of the template to obtain the recommended video.
After the segments have been filled into the corresponding pits according to the matching relationship, an initial video is obtained; the initial video can then be image-optimized according to the template requirements, and the optimized video is recommended to the user as the recommended video. The template requirements include at least one of transition settings, acceleration/deceleration settings, and sticker/special-effect settings.
In an embodiment, since the video material to be processed may come from aerial footage shot by an unmanned aerial vehicle, where the camera is far from the subject and the picture changes little, when an aerial video is filled into its corresponding pit the speed of picture change can first be identified automatically, the aerial video can be speed-adjusted according to that speed, and the speed-adjusted aerial video is then filled into the corresponding pit. The speed of picture change can be obtained by analyzing multiple consecutive frames within a preset time.
In an embodiment, when determining the matching relationship between the video pits of the template and the video segments, an identified aerial video can be placed in the first several and/or last several pits of the template, thereby improving the quality of the resulting recommended video.
In the video processing method provided by the above embodiments, a template is determined according to the video information of the material to be processed, the video segment corresponding to each video pit is determined according to the pit information to obtain the matching relationship corresponding to the template, and the segments are finally filled into the corresponding pits of the template according to that relationship to obtain a recommended video. Determining the template from the video information improves the diversity of the generated recommended videos; determining the matching relationship between pits and segments and synthesizing the recommended video from it reduces the user's editing workload and lowers the threshold of video editing.
Referring to Fig. 13, Fig. 13 is a flowchart of the steps of another video processing method provided by an embodiment of the present application.
Specifically, as shown in Fig. 13, the video processing method includes steps S401 to S404.
S401. Acquire multiple templates, each template including at least one video pit.
Multiple templates are acquired. The templates are used to be synthesized with the video material to be processed to obtain recommended videos, and each template includes at least one video pit.
Specifically, multiple templates may be obtained arbitrarily from a preset template library, obtained according to the user's template selection operations in the library, or obtained according to the templates the user has frequently used when synthesizing videos in the past.
In an embodiment, the material to be processed is segmented to generate multiple video segments.
The material to be processed is segmented to generate multiple video segments, and the generated segments are used to fill the video pits in the templates so as to synthesize the recommended videos.
In an embodiment, segmenting the material to be processed to generate multiple video segments includes: segmenting the video material to be processed into multiple segments according to its video information, for example using the video segmentation approach provided in the above embodiments.
S402. Match video segments to the video pits of each template, obtain the matching relationship corresponding to each template, and determine the matching score of the matching relationship corresponding to each template.
The video segments are segments of the video material to be processed; in some implementations they are the segments obtained by segmenting that material.
For the video pits of a template, a video segment is matched to each pit of each template, yielding the matching relationship between pits and segments; this relationship is taken as the matching relationship corresponding to the template, and its matching score is calculated.
In an embodiment, referring to Fig. 14, the step of matching segments to video pits to obtain the matching relationship specifically includes steps S4021 and S4022.
S4021. Construct multiple flow network graphs according to the video segments and the video pits of each template.
The flow network graph includes multiple nodes, each node corresponding to the matching of one video segment with one video pit. A flow network graph is constructed from the video pits and segments of each template, where the left vertical axis Cm of the graph represents the video segments and the upper horizontal axis Sn represents the video pits of the template.
S4022. Determine the matching relationship between the video pits of each template and the video segments based on the multiple flow network graphs.
Each template determines the matching relationship between its own video pits and the video segments based on its own flow network graph.
In an embodiment, determining the matching relationship between the video pits of each template and the video segments based on the multiple flow network graphs includes: matching suitable segments to the pits of each template based on a maximum-flow algorithm to obtain an optimal path, where the correspondence between segments and pits along the optimal path is taken as the matching relationship between the video pits of each template and the segments. For example, the manner provided in the above embodiments may be adopted.
In an embodiment, determining the matching score of the matching relationship between the video pits of each template and the segments includes: determining the matching score of each template's matching relationship according to the energy value between every two adjacent nodes along the optimal path.
The energy values between each pair of adjacent nodes along the optimal path are added together, and the sum of the energy values along the optimal path is taken as the matching score of that matching relationship.
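As a hedged illustration, the sketch below solves the same one-to-one pit-to-segment matching with SciPy's Hungarian solver instead of an explicit max-flow formulation (for this bipartite case the two are interchangeable), and returns the total of the selected pairwise energies as the matching score, mirroring the sum of edge energies along the optimal path; the per-pair energy function and the record fields are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_pits_to_segments(pits, segments, energy_fn):
    """Assign one segment to each pit so that the total pairwise energy is
    maximal (assumes len(segments) >= len(pits)); returns the pit -> segment
    mapping and its matching score, i.e. the sum of the chosen energies."""
    energy = np.array([[energy_fn(p, s) for s in segments] for p in pits])
    rows, cols = linear_sum_assignment(energy, maximize=True)
    mapping = {pits[r]["id"]: segments[c] for r, c in zip(rows, cols)}
    score = float(energy[rows, cols].sum())
    return mapping, score
```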
In an embodiment, matching video segments to the video pits of each template to obtain the matching relationship corresponding to each template includes: classifying the video segments according to the pit tags of the template's video pits or the template tag of the template to obtain classified segments; and determining the matching relationship corresponding to the template according to the classified segments.
Specifically, each video segment has a corresponding video tag. The pit tag of each video pit of the template is matched against the video tags of the segments, and the segments are classified into multiple categories according to the degree of matching between the video tags and the pit tags.
In addition, the template tag of the template can also be matched against the video tags of the segments, and the segments are classified into multiple categories according to the degree of matching between the video tags and the template tag.
The segment corresponding to each video pit is then determined according to the category of the segments, yielding the matching relationship corresponding to the template.
In an embodiment, classifying the video segments according to the pit tags of the template's video pits or the template tag of the template includes: grading the multiple segments according to the pit tags or the template tag to obtain video segments of multiple grade categories.
The video segments of the multiple grade categories include at least segments of a first category, segments of a second category, and segments of a third category, where the highlight grade of the first-category segments is greater than that of the second-category segments, and the highlight grade of the second-category segments is greater than that of the third-category segments.
Specifically, when the segments are classified, they may be classified by highlight grade, so that when matching segments to the video pits, the most highlight-worthy segment can be selected as the segment corresponding to the pit, yielding the matching relationship of the template.
The highlight grade is determined according to the picture content and the audio content of the segment. For example, the picture content includes the composition, whether there is a clear subject, the camera movement direction, the shot scale, and so on; the audio content includes whether there is clear human voice, laughter, cheering, and so on.
In an embodiment, determining the matching relationship between the video pits of each template and the segments according to the classified segments includes: sorting the classified segments according to the pit tags of the video pits or the template tag of the template; and determining the matching relationship between the video pits of each template and the segments according to the sorting result.
Specifically, the classified segments are sorted according to the pit tags of the video pits and the grade of the segments. For each pit tag, the segments may be sorted from high to low by grade. For each video pit of each template, the top-ranked segment is selected as the segment to be filled into that pit, yielding the matching relationship between the template's video pits and the segments.
For example, suppose there are three kinds of pit tags, A, B and C, and the video tags of the segments with their grades are A1, A2, A3, B1, B3, C1 and C2. The classified segments are then: tag-A category A1, A2, A3; tag-B category B1, B3; tag-C category C1, C2.
When selecting the segment to be filled into a video pit, if the pit tag of the pit is A, segment A1 is selected; if the pit tag is B, segment B1 is selected; if the pit tag is C, segment C1 is selected.
Likewise, the classified segments may be sorted according to the template tag of the template and the grade of the segments. For each template tag, the segments may be sorted from high to low by grade, and for each video pit of each template the top-ranked segment is selected as the segment to be filled into that pit, yielding the matching relationship between the template's video pits and the segments.
In an embodiment, determining the matching relationship between the video pits of each template and the segments according to the classified segments includes: allocating segments to the video pits of the template according to the sorting result, and determining the matching relationship between the video pits of each template and the segments.
For each video pit of each template, the top-ranked segment is selected as the segment to be filled into that pit, yielding the matching relationship between the template's video pits and the segments.
S403. Determine recommended templates from the multiple templates according to the matching scores.
After the matching scores corresponding to the templates' matching relationships have been obtained, recommended templates can be determined from the acquired templates, where a recommended template is a template used to synthesize a recommended video.
In an embodiment, referring to Fig. 15, the step of determining the recommended templates according to the matching scores specifically includes steps S4031 to S4033.
S4031. Determine a preset number of templates from the multiple templates according to the matching scores.
First, a preset number of templates is determined from the multiple templates according to the matching scores, where the preset number may be set according to empirical values.
In a specific implementation, the preset number of templates may be selected from the multiple templates in descending order of matching score.
S4032. Select a target number of templates from the preset number of templates and combine them, obtaining multiple template groups, each template group including the target number of templates.
The target number is the number of recommended templates to be recommended to the user. A target number of templates is selected from the preset number of templates in any combination, obtaining multiple template groups, each of which includes the target number of templates.
In a specific implementation, the number of combinations may be used to select the target number of templates and obtain the template groups. For example, if the preset number is n and the target number is k, the number of template groups that can be selected is:
C(n, k) = n! / (k! (n - k)!)
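This enumeration can be done directly with the Python standard library, as in the short sketch below; itertools.combinations produces the C(n, k) template groups and math.comb gives their count (the template names are placeholders).

```python
import math
from itertools import combinations

templates = ["T1", "T2", "T3", "T4", "T5"]   # preset number n = 5 (illustrative)
k = 3                                        # target number of recommended templates

groups = list(combinations(templates, k))    # all template groups of size k
assert len(groups) == math.comb(len(templates), k)   # C(5, 3) = 10
```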
S4033. Determine a recommended template group from the multiple template groups according to the template types of the target number of templates in each group, and take the target number of templates in the recommended template group as the recommended templates.
The template types of the target number of templates included in each template group are determined, and the recommended template group is determined from the multiple groups according to those template types. Template richness is taken into account so that the templates recommended to the user include multiple styles for the user to choose from.
In an embodiment, referring to Fig. 16, step S4033 includes steps S4033a and S4033b.
S4033a. Acquire the template types of the target number of templates in each of the multiple template groups, and determine the combination score corresponding to each template group according to the template types and the matching scores.
For each template group, the template types of its target number of templates are acquired, the template richness score of the group is calculated from those types, and the matching scores of the group's templates are acquired; the combination score of each group is then determined from the matching scores and the richness score.
In an embodiment, determining the combination scores corresponding to the multiple template groups according to the template types and the matching scores includes: determining the template richness among the target number of templates in each group according to the template types; and determining the combination score of each group according to the template richness among the group's templates and the sum of the matching scores of the group's templates.
The template type ID of each template in the group is acquired; different template types correspond to different IDs, and the template richness among the templates in the group is judged on this basis.
For example, the template richness of a template group may be determined using the following formula:
E1 = Σ f(ai, aj), summed over all pairs of templates (i, j) in the group with i < j
where E1 denotes the template richness of the template group, ai denotes the template type of the i-th template, aj denotes the template type of the j-th template, and the value of f(ai, aj) indicates whether the template type of the i-th template is the same as that of the j-th template.
When the template type of the i-th template is the same as that of the j-th template, the value of f(ai, aj) is 0; when they are different, the value of f(ai, aj) is 1. The larger the value of E1, the richer the template types in the template group.
After the template richness of a template group has been obtained, the combination score of the group can be determined according to the matching scores of the templates in the group.
For example, the following formula may be used:
E = a * E1 + E2
where E is the combination score of the template group, E1 is the template richness of the group, a is a preset weight for the template richness, and E2 is the sum of the matching scores of the templates in the group.
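These two formulas translate almost directly into code; the sketch below counts differently-typed template pairs to obtain E1 and combines it with the summed matching scores as E = a * E1 + E2, where the weight a and the group representation are illustrative.

```python
from itertools import combinations

def template_richness(types):
    """E1: number of template pairs in the group whose types differ
    (f(ai, aj) = 1 when the types differ, 0 when they are the same)."""
    return sum(1 for a_i, a_j in combinations(types, 2) if a_i != a_j)

def combination_score(group, a=1.0):
    """E = a * E1 + E2, where E2 is the sum of the templates' matching scores.
    Each template is a (type, matching_score) pair; a is a preset weight."""
    types = [t for t, _ in group]
    e1 = template_richness(types)
    e2 = sum(score for _, score in group)
    return a * e1 + e2

# Example: a group of three templates of types A, A, B with scores 99, 98, 97
# gives E1 = 2, E2 = 294, E = 296 (with a = 1.0).
```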
S4033b. Determine the recommended template group from the multiple template groups according to the combination scores.
After the combination score of each template group has been obtained, the recommended template group can be selected from the multiple groups according to their combination scores, and the templates in the recommended group are the recommended templates recommended to the user.
In a specific implementation, the template group with the highest combination score may be selected from the multiple groups as the recommended template group.
In an embodiment, there are multiple recommended videos, and the multiple recommended videos are obtained according to the target number of templates in the recommended template group.
In an embodiment, the multiple recommended videos are recommended to the user so that the user can choose among them.
For the target number of templates in the recommended template group, the video segments are filled into the video pits of each template according to the matching relationship corresponding to that template, thereby generating multiple recommended videos; these are recommended to the user together so that the user can select the video to be used in the end.
In an embodiment, referring to Fig. 17, step S403 includes steps S4031’ and S4032’.
S4031’. Determine a preset number of templates from the multiple templates according to the matching scores to form a template group.
According to the matching scores of the multiple templates, a preset number of templates is determined from them, and the preset number of templates forms a template group.
Specifically, the preset number of templates may be selected in descending order of matching score to form the template group. For example, five templates may be selected in descending order of matching score as one group, constituting a template group.
S4032’. Determine the template types of the templates in the template group, and determine the recommended templates from the group according to the template types.
The template types of the templates in the group are determined, so that the recommended templates can be determined from the group according to the template types.
In an embodiment, referring to Fig. 18, the step of determining the recommended templates according to the template types specifically includes steps S4032’a to S4032’c.
S4032’a. Determine whether the number of template types is greater than a preset type threshold.
It is determined whether the number of template types in the template group is greater than a preset type threshold, where the preset type threshold is the number of template types expected to be recommended to the user and may be set in advance.
S4032’b. If the number of template types is greater than the preset type threshold, determine the recommended templates according to the template types and the matching scores of templates of the same template type.
When the number of template types in the group is greater than the preset type threshold, the recommended templates are determined according to the template types and the matching scores of same-type templates, that is, the template with the highest matching score is selected from the templates of each template type as a recommended template.
In an embodiment, determining the recommended templates according to the template types and the matching scores of same-type templates includes: dividing the templates in the template group by template type to obtain templates of multiple types; determining multiple type-optimal templates according to the matching scores, a type-optimal template being the template with the highest matching score within each template type; and selecting the templates with the highest matching scores from the multiple type-optimal templates as the recommended templates.
The templates in the group are divided by template type, with each template placed under its corresponding type. Then, for each template type, the template with the highest matching score among the templates of that type is selected as the optimal template of that type, i.e. the type-optimal template. From the type-optimal templates of the multiple template types, the templates with the highest matching scores are then selected as the recommended templates.
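A compact sketch of this "best per type, then top types" selection is given below; the template records and the type threshold are illustrative placeholders.

```python
def recommend_by_type(templates, type_threshold):
    """templates: list of dicts with "name", "type" and "score".
    Pick the best template of each type, then keep the type_threshold
    highest-scoring of these type-optimal templates."""
    best_per_type = {}
    for t in templates:
        current = best_per_type.get(t["type"])
        if current is None or t["score"] > current["score"]:
            best_per_type[t["type"]] = t
    ranked = sorted(best_per_type.values(), key=lambda t: t["score"], reverse=True)
    return ranked[:type_threshold]

# With the Fig. 19 example (A1: 99, B1: 97, C1: 99, D1: 95 as the per-type
# bests) and type_threshold = 3, this returns A1, C1 and B1.
```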
Referring to Fig. 19, Fig. 19 is a schematic diagram of selecting recommended templates from a template group. A, B, C and D are four template types in the group, and the preset type threshold is 3, that is, three types of templates need to be selected from the group and recommended to the user.
As shown in Fig. 19, there are n templates A1 to An under template type A, with matching scores of A1: 99, A2: 98, A3: 96, and so on. There are m templates B1 to Bm under type B, with matching scores of B1: 97, B2: 94, B3: 89, and so on. There are x templates C1 to Cx under type C, with matching scores of C1: 99, C2: 95, C3: 90, and so on. There are y templates D1 to Dy under type D, with matching scores of D1: 95, D2: 94, D3: 93, and so on.
For each template type, the type-optimal template is selected according to the matching scores of its templates: the type-optimal template of type A is A1, that of type B is B1, that of type C is C1, and that of type D is D1.
Since only three types of templates need to be selected and recommended to the user, the matching scores of template A1, template B1, template C1 and template D1 are compared, where A1 scores 99, B1 scores 97, C1 scores 99 and D1 scores 95. Three templates are selected from A1, B1, C1 and D1 according to their matching scores, namely template A1, template B1 and template C1, and these are recommended to the user as the recommended templates.
S4032’c. If the number of template types is less than or equal to the preset type threshold, select the template with the highest matching score from the multiple templates of each template type in the group as a recommended template.
If the number of template types is less than the preset type threshold, the template types in the group do not meet the requirement of the preset type threshold. In that case, for each template type present in the group, the template with the highest matching score can be selected directly as a recommended template.
For example, if the preset type threshold is 3 but the group contains only the two template types A and B, with template A1 having the highest matching score under type A and template B1 having the highest matching score under type B, then template A1 and template B1 may be taken as the recommended templates.
If the number of template types is equal to the preset type threshold, the template types in the group exactly meet the requirement of the threshold, so for each template type present in the group, the template with the highest matching score can likewise be selected directly as a recommended template.
For example, if the preset type threshold is 3 and the group contains the three template types A, B and C, with the highest-scoring templates under types A, B and C being template A1, template B1 and template C1 respectively, then template A1, template B1 and template C1 may be taken as the recommended templates.
In an embodiment, step S403 specifically includes: acquiring the template types of the multiple templates; and determining the recommended templates according to the template types and the matching scores of the multiple templates.
For the multiple templates, the template type of each template is acquired, so that the recommended templates can be determined according to the template types and the matching scores of the multiple templates, enabling recommended templates of multiple template types to be recommended to the user.
In an embodiment, determining the recommended templates according to the template types and matching scores of the multiple templates includes: dividing the multiple templates into multiple type template groups according to the template types, each type template group including at least one template; determining, from the multiple type template groups according to the templates' matching scores, templates that satisfy the required number of types; and selecting templates from the remaining templates among the multiple templates according to their matching scores, until the number of selected templates satisfies the required number of templates.
For the multiple templates, when the number of template types among them is greater than a preset first threshold, a first-threshold number of templates of different types is acquired as the first recommended templates, where a first recommended template is the template with the highest matching score within its template type and the first threshold is the required number of types.
Second recommended templates are then acquired from the templates other than the first recommended templates, in descending order of matching score, where the total number of first and second recommended templates is a second threshold and the second threshold is the required number of templates.
The multiple templates are classified according to the template type of each template to obtain multiple type template groups, each corresponding to one template type and each including at least one template.
Then, templates satisfying the required number of types are selected from the type template groups according to the templates' matching scores, after which templates are selected from the remaining templates according to their matching scores until the number of selected templates satisfies the required number of templates.
For example, as shown in Fig. 19, suppose there are 24 templates in total, the required number of templates is 5, and the required number of types is 3.
The template types of the 24 templates are acquired, and the 24 templates are classified according to their template types into four type template groups A, B, C and D; each type template group corresponds to one template type and contains six templates.
Under type A there are six templates A1 to A6, with matching scores of A1: 99, A2: 98, A3: 96, and so on. Under type B there are six templates B1 to B6, with matching scores of B1: 97, B2: 94, B3: 89, and so on. Under type C there are six templates C1 to C6, with matching scores of C1: 99, C2: 95, C3: 90, and so on. Under type D there are six templates D1 to D6, with matching scores of D1: 95, D2: 94, D3: 93, and so on.
The highest matching score among the templates of type A is A1 with 99, that of type B is B1 with 97, that of type C is C1 with 99, and that of type D is D1 with 95.
According to the highest matching scores under the four template types A, B, C and D, the three templates with the highest scores are selected from the four types, namely template A1, template B1 and template C1, so that the three selected templates satisfy the required number of types.
At this point three templates have been selected while the required number of templates is 5; therefore, two more templates can be selected according to the matching scores of the remaining 21 templates, so that the number of selected templates satisfies the required number of templates.
After the templates satisfying the required number of types have been selected, the remaining 21 templates comprise templates A2 to A6, B2 to B6, C2 to C6 and D1 to D6. According to the matching scores of these 21 templates, template A2 and template A3 are selected; the number of selected templates then satisfies the required number of templates, and the selected templates satisfying that number are taken as the recommended templates, that is, the five templates A1, A2, A3, B1 and C1 are taken as the recommended templates.
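A sketch of this two-stage selection (first satisfy the type requirement with the best template of each type, then top up to the required template count from the remaining templates) is given below; the template records are assumed to carry a name, a type and a matching score.

```python
def select_recommended(templates, type_need, template_need):
    """templates: list of dicts with "name", "type", "score".
    Stage 1: take the highest-scoring template of the type_need best types.
    Stage 2: fill the remaining slots with the highest-scoring leftovers."""
    best_per_type = {}
    for t in templates:
        cur = best_per_type.get(t["type"])
        if cur is None or t["score"] > cur["score"]:
            best_per_type[t["type"]] = t
    stage1 = sorted(best_per_type.values(),
                    key=lambda t: t["score"], reverse=True)[:type_need]
    chosen = {t["name"] for t in stage1}
    rest = sorted((t for t in templates if t["name"] not in chosen),
                  key=lambda t: t["score"], reverse=True)
    stage2 = rest[:max(0, template_need - len(stage1))]
    return stage1 + stage2

# With the 24-template example (A1: 99, A2: 98, A3: 96, B1: 97, C1: 99, ...),
# type_need = 3 and template_need = 5, this yields A1, C1, B1, A2 and A3.
```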
另外,当有至少两个模板的匹配得分相同时,可以考虑模板的多样性,选择与已选模板的模板类型不同的模板。In addition, when there are at least two templates with the same matching score, the diversity of templates may be considered, and a template with a different template type from the selected template may be selected.
In an embodiment, determining the recommended templates according to the template types and matching scores of the multiple templates includes: selecting templates from the multiple templates in descending order of matching score until the selected template types satisfy the required number of types; and then selecting templates from the remaining templates according to their matching scores until the number of selected templates satisfies the required number of templates.
Templates are selected from the multiple templates in descending order of matching score, and the type of each selected template is determined so that a different template type is selected each time, until the selected template types satisfy the required number of types. Templates are then selected from the remaining templates according to their matching scores until the number of selected templates satisfies the required number of templates.
For example, as shown in FIG. 19, suppose there are 24 templates in total, the required number of templates is five, and the required number of types is three.
As before, under template type A there are six templates, A1 to A6, whose matching scores are 99 for A1, 98 for A2, 96 for A3, and so on. Under template type B there are six templates, B1 to B6, whose matching scores are 97 for B1, 94 for B2, 89 for B3, and so on. Under template type C there are six templates, C1 to C6, whose matching scores are 99 for C1, 95 for C2, 90 for C3, and so on. Under template type D there are six templates, D1 to D6, whose matching scores are 95 for D1, 94 for D2, 93 for D3, and so on.
When selecting templates, the template with the highest matching score is first selected from the 24 templates. Since template A1 and template C1 have the same matching score, which is also the highest score, either of them may be taken as the first selected template; for example, the first selected template is template A1, whose template type is A.
The template with the highest matching score is then selected from the 23 templates remaining after template A1 is removed; the second selected template is template C1, whose template type is C. At this point two template types have been selected, which does not yet satisfy the required number of types, so selection continues from the 22 templates remaining after templates A1 and C1 are removed. The template with the highest matching score among them is template A2, whose template type is A; however, a template of the same type, template A1, has already been selected. Therefore, from those 22 remaining templates, the template whose matching score differs least from that of template A2 is selected, namely template B1, whose template type is B. Since the type of template B1 differs from the types of the already selected templates A1 and C1, template B1 is taken as the third selected template. The selected templates are now A1, C1, and B1, covering three template types, which satisfies the required number of types.
Then, according to the matching scores, the template with the highest matching score is selected from the 21 templates remaining after templates A1, C1, and B1 are removed; the selected template is template A2. The number of selected templates is now four, which does not yet satisfy the required number of templates.
The template with the highest matching score is then selected from the remaining 20 templates; the selected template is template A3. The number of selected templates is now five, which satisfies the required number of templates. The selected templates are used as the recommended templates; that is, the five templates A1, A2, A3, B1, and C1 are used as the recommended templates.
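The sequential selection in this embodiment can likewise be sketched in Python (an illustration only, not the disclosed implementation), using the same template records as above; when the best-scoring remaining template repeats an already chosen type, the sketch takes the unseen-type template whose score is closest to it, as in the example.

def select_with_type_diversity(templates, type_need, template_need):
    """Greedy selection: cover `type_need` distinct types first, then fill by score."""
    remaining = sorted(templates, key=lambda t: t["score"], reverse=True)
    chosen, chosen_types = [], set()

    # Phase 1: pick by score without repeating a template type.
    while len(chosen_types) < type_need and remaining:
        best = remaining[0]
        if best["type"] in chosen_types:
            candidates = [t for t in remaining if t["type"] not in chosen_types]
            if not candidates:          # no new type left to cover
                break
            # Closest-scoring template of a type not yet chosen.
            best = min(candidates,
                       key=lambda t: abs(t["score"] - remaining[0]["score"]))
        chosen.append(best)
        chosen_types.add(best["type"])
        remaining.remove(best)

    # Phase 2: fill the remaining slots purely by matching score.
    while len(chosen) < template_need and remaining:
        chosen.append(remaining.pop(0))
    return [t["name"] for t in chosen]

Run on the example above (24 templates, three required types, five required templates), this selects A1, C1, B1, A2, and A3, matching the walkthrough.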
S404. Fill the video clips into the corresponding video pits of the recommended template according to the matching relationship corresponding to the recommended template, to obtain a recommended video.
According to the matching relationship corresponding to the recommended template, that is, the matching relationship between the video pits of the recommended template and the video clips, the video clips are filled into the corresponding video pits to obtain the recommended video.
In an embodiment, step S404 specifically includes: determining whether the video duration of a video clip is greater than the duration of its video pit; and if the video duration of the video clip is greater than the duration of the video pit, performing segment extraction on the video clip to obtain a selected segment.
When filling video clips into the corresponding video pits according to the matching relationship, it is determined whether the video duration of each clip is greater than the duration of its video pit. The duration of a video pit determines the maximum duration of the video clip that can be filled into it; therefore, when the video duration of a clip is greater than the duration of the pit, the clip cannot be filled into the pit directly, and a segment of the corresponding duration needs to be extracted from the clip and filled into the pit.
The video duration of the selected segment is less than or equal to the duration of the video pit. In a specific implementation, in order to fill the selected segment into the corresponding video pit while preserving the completeness of the resulting recommended video, the video duration of the selected segment may be made equal to the duration of the video pit.
In an embodiment, performing segment extraction on the video clip to obtain the selected segment includes: performing segment extraction on the video clip according to video elements of the video clip to obtain the selected segment.
When performing segment extraction on a video clip, the extraction may be carried out according to the video elements of the clip to obtain the selected segment.
The video elements include at least one of smiling-face frames, laughter audio, human motion, clear human voice, frame composition, and aesthetic score. When extracting the selected segment, a relatively highlight-worthy part of the clip may be extracted according to the video elements, for example a part containing smiling faces or a part with a high aesthetic score.
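A minimal sketch of this duration check and element-based extraction (an assumption, not the disclosed implementation): the hypothetical element_score(start, end) callback stands in for whatever scorer rates a window of the clip by the video elements listed above, and the best-scoring window of exactly the pit duration is taken as the selected segment.

def extract_selected_segment(clip_duration, pit_duration, element_score, step=0.5):
    """Return (start, end), in seconds, of the segment to fill into the pit.

    `element_score(start, end)` is a hypothetical scorer rating the window
    [start, end) by video elements such as smiling faces, laughter audio,
    human motion, clear voice, composition, and aesthetic score.
    """
    # A clip that already fits is used as-is.
    if clip_duration <= pit_duration:
        return 0.0, clip_duration

    # Slide a window of exactly the pit duration over the clip and keep
    # the window with the highest element score.
    best_start, best_score = 0.0, float("-inf")
    start = 0.0
    while start + pit_duration <= clip_duration:
        score = element_score(start, start + pit_duration)
        if score > best_score:
            best_start, best_score = start, score
        start += step
    return best_start, best_start + pit_duration

Because the window length equals the pit duration, the selected segment can be filled into its pit without further trimming, consistent with making the selected segment's duration equal to the pit duration.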
In an embodiment, step S404 specifically includes: filling the video clips into the corresponding video pits of the recommended template according to the matching relationship corresponding to the recommended template, to obtain an initial video; and performing image optimization on the initial video based on the template requirements of the recommended template, to obtain the recommended video.
After the video clips are filled into the corresponding video pits according to the matching relationship corresponding to the recommended template, an initial video is obtained. The initial video can then be image-optimized according to the template requirements, and the optimized video is recommended to the user as the recommended video. The template requirements include at least one of transition settings, acceleration/deceleration settings, and sticker effect settings.
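As a sketch of how the template requirements might drive this optimization step (the helper callables are assumptions standing in for a rendering back end, not an API from the disclosure):

from typing import Callable, Dict

def optimize_initial_video(initial_video, requirements: Dict[str, dict],
                           operations: Dict[str, Callable]):
    """Apply template requirements (e.g. transitions, speed ramps, sticker
    effects) to the assembled initial video, in the order listed.

    `operations` maps a requirement name to a callable of the form
    op(video, settings) -> video; unknown requirements are skipped.
    """
    video = initial_video
    for name, settings in requirements.items():
        op = operations.get(name)
        if op is not None:
            video = op(video, settings)
    return video

For example, a recommended template might carry requirements such as {"transition": {"type": "crossfade"}, "speed": {"factor": 1.25}}, with the corresponding callables supplied by whatever editing engine is in use.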
In an embodiment, since the video material to be processed may come from aerial footage shot by a drone, in which the camera is far from the subject and the picture content changes little, when filling aerial footage into its corresponding video pit, the speed of the picture change may first be identified automatically, the aerial footage may be speed-adjusted according to that speed, and the speed-adjusted footage may then be filled into the corresponding video pit. The speed of the picture change can be obtained by analyzing multiple consecutive frames within a preset time.
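The picture-change speed estimation mentioned here can be sketched as follows (thresholds and factors are illustrative assumptions, not values from the disclosure): the change speed is taken as the mean absolute difference between consecutive sampled frames, and slowly changing aerial footage is sped up before it is filled into its pit.

import numpy as np

def picture_change_speed(frames):
    """Mean absolute per-pixel difference between consecutive grayscale
    frames (2-D numpy arrays) sampled within a preset time window; a higher
    value means the picture changes faster."""
    diffs = [np.mean(np.abs(frames[i + 1].astype(np.int16) -
                            frames[i].astype(np.int16)))
             for i in range(len(frames) - 1)]
    return float(np.mean(diffs)) if diffs else 0.0

def aerial_speed_factor(change_speed, slow=2.0, moderate=8.0):
    """Map the estimated change speed to a playback-speed factor; the
    thresholds `slow` and `moderate` are hypothetical."""
    if change_speed < slow:
        return 2.0    # distant, slowly changing aerial shots are sped up
    if change_speed < moderate:
        return 1.5
    return 1.0        # already dynamic enough, leave the speed unchanged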
In an embodiment, when determining the matching relationship between the video pits of a template and the video clips, identified aerial footage may be placed in the first several and/or last several video pits of the template, thereby improving the quality of the resulting recommended video.
In the video processing method provided by the above embodiments, multiple templates are obtained, video clips are matched to the video pits of each template to obtain the matching relationship corresponding to each template, the matching score of each matching relationship is determined, a recommended template is determined from the multiple templates according to the matching scores, and a recommended video is finally synthesized according to the matching relationship corresponding to the recommended template. Determining the recommended template according to the matching scores and synthesizing the recommended video based on it can automatically determine a suitable template for the video clips, which reduces the user's workload in video editing and increases the diversity of the synthesized recommended videos.
It should be noted that the above embodiments may be executed individually or in combination according to actual needs, and the specific execution order and combination are not limited; the individual steps may likewise be executed individually or in combination, and their specific execution order and combination are not limited.
Please refer to FIG. 20, which is a schematic block diagram of a video processing apparatus provided by an embodiment of the present application. As shown in FIG. 20, the video processing apparatus 500 includes at least one or more processors 501 and a memory 502.
The processor 501 may be, for example, a micro-controller unit (MCU), a central processing unit (CPU), or a digital signal processor (DSP).
The memory 502 may be a flash chip, a read-only memory (ROM), a magnetic disk, an optical disc, a USB flash drive, a removable hard disk, or the like.
The memory 502 is configured to store a computer program, and the processor 501 is configured to execute the computer program and, when executing it, to perform any of the video processing methods provided in the embodiments of the present application, so as to reduce the user's workload in video editing and provide diverse recommended videos.
Please refer to FIG. 21, which is a schematic block diagram of a terminal device provided by an embodiment of the present application. As shown in FIG. 21, the terminal device 600 includes at least one or more processors 601 and a memory 602.
The terminal device includes terminals such as a mobile phone, a remote controller, a PC, and a tablet computer.
The processor 601 may be, for example, a micro-controller unit (MCU), a central processing unit (CPU), or a digital signal processor (DSP).
The memory 602 may be a flash chip, a read-only memory (ROM), a magnetic disk, an optical disc, a USB flash drive, a removable hard disk, or the like.
The memory 602 is configured to store a computer program, and the processor 601 is configured to execute the computer program and, when executing it, to perform any of the video processing methods provided in the embodiments of the present application, so as to reduce the user's workload in video editing and provide diverse recommended videos.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program. The computer program includes program instructions, and a processor executes the program instructions to implement the steps of any of the video processing methods provided in the above embodiments.
The computer-readable storage medium may be an internal storage unit of the terminal device described in any of the foregoing embodiments, such as the memory of the terminal device. The computer-readable storage medium may also be an external storage device of the terminal device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed in the present application, and such modifications or substitutions shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (144)

1. A video processing method, comprising:
    determining a template according to video information of video material to be processed, the template comprising at least one video pit;
    determining, according to pit information of the video pit in the template, a video clip corresponding to the video pit, to obtain a matching relationship corresponding to the template, wherein the video clip is a clip of the video material to be processed;
    filling the video clip into the corresponding video pit of the template according to the matching relationship corresponding to the template, to obtain a recommended video.
2. The method according to claim 1, wherein the determining a template according to video information of the video material to be processed comprises:
    determining a video tag of the video material to be processed according to the video information of the video material to be processed;
    determining, according to the video tag of the video material to be processed, a plurality of templates matching the video tag.
3. The method according to claim 2, wherein the determining, according to the video tag of the video material to be processed, a plurality of templates matching the video tag comprises:
    determining a video theme corresponding to the video material to be processed according to the video tag;
    determining, according to the video theme of the video material to be processed, a plurality of templates matching the video theme.
4. The method according to claim 2, wherein the video tag comprises at least one of a camera movement direction, a shot scale, a size and position of a target object in a single video frame of the video material to be processed, a size and position of a target object in consecutive video frames of the video material to be processed, and a similarity between adjacent video frames of the video material to be processed.
5. The method according to claim 4, wherein the size and position of the target object in a single video frame of the video material to be processed are determined by using an object detection algorithm or a saliency detection algorithm.
6. The method according to claim 4, wherein the size and position of the target object in consecutive video frames of the video material to be processed are determined based on a pre-trained neural network model.
7. The method according to claim 3, further comprising:
    if the video theme corresponding to the video material to be processed cannot be determined, selecting a preset template as the template corresponding to the video material to be processed.
8. The method according to claim 3, wherein the determining, according to the video theme of the video material to be processed, a plurality of templates matching the video theme comprises:
    determining, according to template influence factors of the templates, the template corresponding to the video material to be processed from the plurality of templates matching the video theme.
9. The method according to claim 8, wherein the determining, according to template influence factors of the templates, the template corresponding to the video material to be processed from the plurality of templates matching the video theme comprises:
    obtaining evaluation scores and preset weights of the template influence factors;
    determining template scores of the plurality of templates matching the video theme according to the evaluation scores and the preset weights of the template influence factors;
    determining the template corresponding to the video material to be processed according to the template scores.
10. The method according to claim 8, wherein the template influence factors comprise at least one of a music matching degree, a template popularity, and a user preference.
11. The method according to claim 10, wherein the music matching degree is obtained according to a pre-trained music recommendation network model, and the music recommendation network model is capable of outputting matching scores between template music of the plurality of templates matching the video theme and the video material to be processed.
12. The method according to claim 10, wherein the template popularity is determined according to a frequency of use and/or a number of likes of the plurality of templates matching the video theme.
13. The method according to claim 10, wherein the user preference is determined according to the user's selection frequency and/or satisfaction scores for the plurality of templates matching the video theme.
14. The method according to claim 1, wherein the pit information comprises at least one of pit music and a pit tag.
15. The method according to claim 1, wherein the determining, according to pit information of the video pit in the template, a video clip corresponding to the video pit comprises:
    determining the video clip corresponding to the video pit according to pit music of the video pit in the template.
16. The method according to claim 15, wherein the determining the video clip corresponding to the video pit according to pit music of the video pit in the template comprises:
    determining a degree of matching between the pit music of the video pit in the template and video clips;
    determining the video clip corresponding to the video pit in the template according to the degree of matching.
17. The method according to claim 16, wherein the degree of matching between the pit music of the video pit in the template and a video clip is obtained by using a pre-trained music matching model, and the music matching model is capable of outputting a matching score between the pit music of the video pit in the template and the video clip.
18. The method according to claim 1, wherein after the determining, according to pit information of the video pit in the template, a video clip corresponding to the video pit, the method further comprises:
    determining shooting quality of a plurality of video clips corresponding to the video pit in the template;
    determining an optimal video clip corresponding to the video pit in the template according to the shooting quality of the plurality of video clips;
    obtaining the matching relationship corresponding to the template according to the optimal video clip corresponding to the video pit in the template.
19. The method according to claim 18, wherein the shooting quality of a video clip is determined according to image content of the video clip and a video clip evaluation.
20. The method according to claim 19, wherein the image content comprises at least one of whether there is a main photographed object, an amount of information in the shot, shot stability, and color saturation.
21. The method according to claim 19, wherein the video clip evaluation comprises an aesthetic score of the video clip.
22. The method according to claim 1, wherein after the determining, according to pit information of the video pit in the template, a video clip corresponding to the video pit, the method further comprises:
    determining a matching degree between video clips corresponding to two adjacent video pits in the template;
    determining an optimal video clip corresponding to the video pit according to the matching degree;
    obtaining the matching relationship corresponding to the template according to the optimal video clip corresponding to the video pit.
23. The method according to claim 22, wherein the matching degree between the video clips corresponding to the two adjacent video pits is determined according to camera movement continuity, an increasing/decreasing relationship of shot scale, and match cuts of the video clips.
24. The method according to claim 23, wherein the matching degree between the video clips corresponding to the two adjacent video pits is obtained by using a pre-trained clip matching model, and the clip matching model is capable of outputting a matching degree between the video clips filled into two adjacent video pits.
25. The method according to claim 1, wherein the determining, according to pit information of the video pit in the template, a video clip corresponding to the video pit comprises:
    determining the video clip corresponding to the video pit according to a pit tag of the video pit in the template.
26. The method according to claim 25, wherein the determining the video clip corresponding to the video pit according to a pit tag of the video pit in the template comprises:
    determining video tags of video clips, and taking a video clip whose video tag matches the pit tag of the video pit as the video clip to be filled into the video pit.
27. The method according to any one of claims 1 to 26, wherein the filling the video clip into the corresponding video pit of the template according to the matching relationship corresponding to the template comprises:
    determining whether a video duration of the video clip is greater than a duration of the video pit;
    if the video duration of the video clip is greater than the duration of the video pit, performing segment extraction on the video clip to obtain a selected segment;
    wherein a video duration of the selected segment is less than or equal to the duration of the video pit.
28. The method according to claim 27, wherein the performing segment extraction on the video clip to obtain a selected segment comprises:
    performing segment extraction on the video clip according to video elements of the video clip to obtain the selected segment.
29. The method according to claim 28, wherein the video elements comprise at least one of smiling-face frames, laughter audio, human motion, clear human voice, frame composition, and an aesthetic score.
30. The method according to any one of claims 1 to 26, wherein the filling the video clip into the corresponding video pit of the template according to the matching relationship corresponding to the template, to obtain a recommended video comprises:
    filling the video clip into the corresponding video pit of the template according to the matching relationship corresponding to the template, to obtain an initial video;
    performing image optimization on the initial video based on template requirements of the template, to obtain the recommended video.
31. The method according to claim 30, wherein the template requirements comprise at least one of transition settings, acceleration/deceleration settings, and sticker effect settings.
32. The method according to any one of claims 1 to 26, comprising:
    performing de-duplication processing on the video material to be processed.
33. The method according to claim 32, wherein the de-duplication processing comprises clustering of similar material.
34. The method according to any one of claims 1 to 26, comprising:
    obtaining image quality of the video material to be processed;
    removing unusable footage from the video material to be processed according to the image quality of the video material to be processed.
35. The method according to claim 34, wherein the image quality comprises at least one of picture shake, picture blur, picture overexposure, picture underexposure, absence of a clear scene in the image, and absence of a clear subject in the image.
36. The method according to any one of claims 1 to 26, wherein the video material to be processed comprises at least one of video material shot by a handheld device, video material shot by a movable platform, video material obtained from a cloud server, and video material obtained from a local server.
37. The method according to any one of claims 1 to 26, comprising:
    performing material selection on the video material to be processed.
38. The method according to claim 37, wherein the performing material selection on the video material to be processed comprises:
    performing material selection according to material parameters of the video material to be processed;
    wherein the material parameters comprise at least one of a shooting time, a shooting location, and a shooting target.
39. The method according to claim 37, wherein the performing material selection on the video material to be processed comprises:
    performing material selection on the video material to be processed according to a selection operation of a user.
40. The method according to claim 37, wherein the performing material selection on the video material to be processed comprises:
    clustering the video material to be processed according to material parameters of the video material to be processed, to implement material selection;
    wherein the clustering comprises at least one of time clustering, location clustering, and target-object clustering.
41. The method according to any one of claims 1 to 26, further comprising:
    segmenting the video material to be processed according to video information of the video material to be processed, to generate a plurality of video clips.
42. The method according to claim 41, wherein the video information comprises at least one of a camera movement direction and scene information.
43. The method according to claim 41, wherein the segmenting the video material to be processed to generate a plurality of video clips comprises:
    segmenting the video material to be processed according to the video information of the video material to be processed, to obtain a plurality of first video clips;
    performing clustering segmentation on the first video clips to obtain a plurality of second video clips;
    wherein the second video clips are used as the video clips to be filled into the video pits of the template.
44. The method according to claim 43, wherein before the performing clustering segmentation on the first video clips, the method further comprises:
    determining whether a first video clip whose video duration is greater than a preset duration exists among the plurality of first video clips;
    if a first video clip whose video duration is greater than the preset duration exists, performing the step of performing clustering segmentation on the first video clip.
45. The method according to claim 43, wherein the performing clustering segmentation on the first video clip comprises:
    determining a sliding window and a cluster center, wherein the sliding window is used to determine a current video frame to be processed, and the cluster center is used to determine a video segmentation point of the first video clip;
    based on the cluster center, performing cluster analysis on video frames of the first video clip according to the sliding window, to determine a video segmentation point;
    performing video segmentation on the first video clip according to the video segmentation point.
46. The method according to claim 45, wherein the cluster center comprises an image feature of the first video frame of the first video clip.
47. The method according to claim 46, wherein the image feature of the first video frame of the first video clip is obtained according to a pre-trained image feature network model, and the image feature network model is capable of outputting image features of each video frame in the first video clip.
48. The method according to claim 45, wherein the size of the sliding window is related to the duration of the first video clip; or the size of the sliding window is related to an expected segmentation speed set by a user.
49. The method according to claim 45, wherein the size of the sliding window is equal to 1.
50. The method according to claim 45, wherein the based on the cluster center, performing cluster analysis on video frames of the first video clip according to the sliding window, to determine a video segmentation point comprises:
    determining a current video frame according to the sliding window, and determining a similarity between an image feature of the current video frame and the cluster center;
    if the similarity is less than a preset threshold, taking the current video frame as a video segmentation point, and re-determining the cluster center;
    continuing to determine video segmentation points according to the re-determined cluster center, until the last video frame of the first video clip.
51. The method according to claim 50, wherein the determining a similarity between an image feature of the current video frame and the cluster center comprises:
    determining a cosine similarity between the image feature of the current video frame and the cluster center.
52. The method according to claim 50, wherein the re-determining the cluster center comprises:
    taking the image feature of the current video frame as the re-determined cluster center.
53. The method according to claim 50, wherein after the determining a similarity between an image feature of the current video frame and the cluster center, the method further comprises:
    if the similarity is greater than or equal to a preset threshold, updating the cluster center;
    continuing to determine, according to the updated cluster center, a similarity between an image feature of the current video frame and the updated cluster center.
54. The method according to claim 53, wherein the updating the cluster center comprises:
    obtaining the image feature of the current video frame;
    determining the updated cluster center according to the image feature of the current video frame and the cluster center.
55. A video processing method, comprising:
    constructing a flow network graph according to video clips of video material to be processed and video pits of a template;
    determining, based on the flow network graph, a matching relationship between the video clips and the video pits;
    filling the video clips into the corresponding video pits of the template according to the matching relationship, to obtain a recommended video;
    wherein the flow network graph comprises a plurality of nodes, and each node corresponds to a matching relationship between one video clip and one video pit.
56. The video processing method according to claim 55, wherein the determining, based on the flow network graph, a matching relationship between the video clips and the video pits comprises:
    matching suitable video clips to the video pits of the template based on a maximum flow algorithm, to obtain an optimal path;
    wherein the correspondence between the video clips and the video pits in the optimal path is used as the matching relationship between the video pits of the template and the video clips.
57. The method according to claim 56, wherein the matching suitable video clips to the video pits of the template based on a maximum flow algorithm, to obtain an optimal path comprises:
    determining the optimal path corresponding to the template according to energy values between adjacent nodes in the flow network graph.
58. The method according to claim 57, comprising:
    determining the energy value between two adjacent nodes according to energy value influence factors of each node.
59. The method according to claim 58, wherein the determining the energy value between two adjacent nodes according to energy value influence factors of each node comprises:
    obtaining evaluation scores and preset weights of the energy value influence factors;
    determining the energy value between two adjacent nodes according to the evaluation scores and the preset weights of the energy value influence factors.
60. The method according to claim 58, wherein the energy value influence factors comprise at least one of a shooting quality of the video clip corresponding to each video pit, a degree of matching between each video pit and its corresponding video clip, and a matching degree between the video clips corresponding to two adjacent video pits.
61. The method according to claim 60, wherein the shooting quality of the video clip corresponding to each video pit is determined according to image content of the video clip and a video clip evaluation.
62. The method according to claim 60, wherein the degree of matching between each video pit and its corresponding video clip is determined according to a matching degree between pit music of the video pit and the video clip.
63. The method according to claim 62, wherein the degree of matching between each video pit and its corresponding video clip is obtained by using a pre-trained music matching model, and the music matching model is capable of outputting a matching score between the pit music of the video pit and the video clip.
64. The method according to claim 60, wherein the matching degree between the video clips corresponding to two adjacent video pits is determined according to camera movement continuity, an increasing/decreasing relationship of shot scale, and match cuts of the video clips.
65. The method according to claim 64, wherein the matching degree between the video clips corresponding to two adjacent video pits is obtained by using a pre-trained clip matching model, and the clip matching model is capable of outputting a matching degree between the video clips filled into two adjacent video pits.
66. A video processing method, comprising:
    obtaining a plurality of templates, each template comprising at least one video pit;
    matching video clips to the video pits of each template, to obtain a matching relationship corresponding to each template, and determining a matching score of the matching relationship corresponding to each template, wherein the video clips are clips of video material to be processed;
    determining a recommended template from the plurality of templates according to the matching scores;
    filling the video clips into the corresponding video pits of the recommended template according to the matching relationship corresponding to the recommended template, to obtain a recommended video.
67. The method according to claim 66, further comprising:
    segmenting material to be processed to generate a plurality of video clips.
68. The method according to claim 67, wherein the segmenting material to be processed to generate a plurality of video clips comprises:
    segmenting the video material to be processed according to video information of the video material to be processed, to generate a plurality of video clips.
69. The method according to claim 68, wherein the video information comprises at least one of a camera movement direction and scene information.
70. The method according to claim 68, wherein the segmenting the video material to be processed to generate a plurality of video clips comprises:
    segmenting the video material to be processed according to the video information of the video material to be processed, to obtain a plurality of first video clips;
    performing clustering segmentation on the first video clips to obtain a plurality of second video clips;
    wherein the second video clips are used as the video clips to be filled into the video pits of the template.
71. The method according to claim 70, wherein before the performing clustering segmentation on the first video clips, the method further comprises:
    determining whether a first video clip whose video duration is greater than a preset duration exists among the plurality of first video clips;
    if a first video clip whose video duration is greater than the preset duration exists, performing the step of performing clustering segmentation on the first video clip.
72. The method according to claim 70, wherein the performing clustering segmentation on the first video clip comprises:
    determining a sliding window and a cluster center, wherein the sliding window is used to determine a current video frame to be processed, and the cluster center is used to determine a video segmentation point of the first video clip;
    based on the cluster center, performing cluster analysis on video frames of the first video clip according to the sliding window, to determine a video segmentation point;
    performing video segmentation on the first video clip according to the video segmentation point.
73. The method according to claim 72, wherein the cluster center comprises an image feature of the first video frame of the first video clip.
74. The method according to claim 73, wherein the image feature of the first video frame of the first video clip is obtained according to a pre-trained image feature network model, and the image feature network model is capable of outputting image features of each video frame in the first video clip.
75. The method according to claim 72, wherein the size of the sliding window is related to the duration of the first video clip; or the size of the sliding window is related to an expected segmentation speed set by a user.
76. The method according to claim 72, wherein the size of the sliding window is equal to 1.
77. The method according to claim 72, wherein the based on the cluster center, performing cluster analysis on video frames of the first video clip according to the sliding window, to determine a video segmentation point comprises:
    determining a current video frame according to the sliding window, and determining a similarity between an image feature of the current video frame and the cluster center;
    if the similarity is less than a preset threshold, taking the current video frame as a video segmentation point, and re-determining the cluster center;
    continuing to determine video segmentation points according to the re-determined cluster center, until the last video frame of the first video clip.
78. The method according to claim 77, wherein the determining a similarity between an image feature of the current video frame and the cluster center comprises:
    determining a cosine similarity between the image feature of the current video frame and the cluster center.
79. The method according to claim 77, wherein the re-determining the cluster center comprises:
    taking the image feature of the current video frame as the re-determined cluster center.
80. The method according to claim 77, wherein after the determining a similarity between an image feature of the current video frame and the cluster center, the method further comprises:
    if the similarity is greater than or equal to a preset threshold, updating the cluster center;
    continuing to determine, according to the updated cluster center, a similarity between an image feature of the current video frame and the updated cluster center.
81. The method according to claim 80, wherein the updating the cluster center comprises:
    obtaining the image feature of the current video frame;
    determining the updated cluster center according to the image feature of the current video frame and the cluster center.
  82. 根据权利要求66所述的方法,其特征在于,所述为每个所述模板的视频坑位匹配视频片段,得到每个所述模板对应的匹配关系,包括:The method according to claim 66, wherein the matching of video clips for the video pits of each of the templates to obtain a matching relationship corresponding to each of the templates, comprising:
    根据所述视频片段和每个所述模板的视频坑位构建多个流网络图,所述流网络图包括多个节点,每个所述节点对应一个所述视频片段和一个所述视频坑位的匹配关系;A plurality of stream network graphs are constructed according to the video clips and the video pits of each of the templates, the stream network graph includes a plurality of nodes, and each of the nodes corresponds to one of the video clips and one of the video pits matching relationship;
    基于多个所述流网络图确定每个所述模板的视频坑位与所述视频片段的匹配关系。The matching relationship between the video pits of each template and the video segment is determined based on the plurality of stream network graphs.
  83. 根据权利要求82所述的方法,其特征在于,所述基于多个所述流网络图确定每个所述模板的视频坑位与所述视频片段的匹配关系,包括:The method according to claim 82, wherein the determining the matching relationship between the video pits of each of the templates and the video clips based on a plurality of the stream network graphs comprises:
    基于最大流算法为每个所述模板的视频坑位匹配合适的视频片段,得到最优路径;Based on the maximum flow algorithm, a suitable video segment is matched for the video pit of each described template to obtain the optimal path;
    其中,将所述最优路径中所述视频片段与所述视频坑位之间的对应关系作为每个所述模板的视频坑位与所述视频片段的匹配关系。Wherein, the corresponding relationship between the video clips and the video pits in the optimal path is taken as the matching relationship between the video pits of each template and the video clips.
  84. 根据权利要求83所述的方法,其特征在于,所述基于最大流算法为每个所述模板的视频坑位匹配合适的视频片段,得到最优路径,包括:The method according to claim 83, wherein the matching of a suitable video segment for each of the video pits of the template based on a maximum flow algorithm to obtain an optimal path, comprising:
    根据所述流网络图中相邻两个节点之间的能量值,确定每个所述模板对应的最优路径。According to the energy value between two adjacent nodes in the flow network graph, the optimal path corresponding to each template is determined.
  85. 根据权利要求84所述的方法,其特征在于,所述确定每个所述模板的视频坑位与所述视频片段的匹配关系的匹配得分,包括:The method according to claim 84, wherein the determining the matching score of the matching relationship between the video pits of each of the templates and the video clips comprises:
    根据所述最优路径中每相邻两个节点之间的能量值,确定每个所述模板的视频坑位与所述视频片段的匹配关系的匹配得分。According to the energy value between every two adjacent nodes in the optimal path, a matching score of the matching relationship between the video pit of each template and the video segment is determined.
  86. 根据权利要求84所述的方法,其特征在于,所述方法包括:The method of claim 84, wherein the method comprises:
    根据每个所述节点的能量值影响因子确定相邻两个节点之间的能量值。The energy value between two adjacent nodes is determined according to the energy value influence factor of each of the nodes.
  87. 根据权利要求86所述的方法,其特征在于,所述根据每个所述节点的能量值影响因子确定相邻两个节点之间的能量值,包括:The method according to claim 86, wherein the determining the energy value between two adjacent nodes according to the energy value influence factor of each of the nodes comprises:
    获取所述能量值影响因子的评价分数和预设权重;obtaining the evaluation score and preset weight of the energy value influencing factor;
    根据所述能量值影响因子的评价分数和预设权重确定相邻两个节点之间 的能量值。The energy value between two adjacent nodes is determined according to the evaluation score of the energy value influence factor and the preset weight.
  88. 根据权利要求86所述的方法,其特征在于,所述能量值影响因子包括每个所述视频坑位对应的视频片段的拍摄质量、每个所述视频坑位与对应的视频片段的匹配程度和相邻两个所述视频坑位对应的视频片段的匹配度中的至少一种。The method according to claim 86, wherein the energy value influence factor comprises the shooting quality of the video clip corresponding to each video pit, the degree of matching between each video pit and the corresponding video clip At least one of the matching degrees of video clips corresponding to two adjacent video pits.
  89. 根据权利要求88所述的方法,其特征在于,所述每个所述视频坑位对应的视频片段的拍摄质量为根据所述视频片段的图像内容和视频片段评价确定的。The method according to claim 88, wherein the shooting quality of the video clip corresponding to each of the video pits is determined according to the image content of the video clip and the evaluation of the video clip.
  90. 根据权利要求88所述的方法,其特征在于,所述每个所述视频坑位与对应的视频片段的匹配程度为根据所述视频坑位的坑位音乐与所述视频片段的匹配度确定的。The method according to claim 88, wherein the degree of matching between each of the video pits and the corresponding video clip is determined according to the matching degree of the pit music of the video pit and the video clip of.
  91. 根据权利要求90所述的方法,其特征在于,所述每个所述视频坑位与对应的视频片段的匹配程度为利用预先训练的音乐匹配模型得到的,所述音乐匹配模型能够输出所述视频坑位的坑位音乐与所述视频片段的匹配度得分。The method according to claim 90, wherein the degree of matching between each of the video pits and the corresponding video segment is obtained by using a pre-trained music matching model, and the music matching model can output the The matching degree score between the pit music of the video pit and the video clip.
  92. 根据权利要求88所述的方法，其特征在于，所述相邻两个所述视频坑位对应的视频片段的匹配度为根据所述视频片段的运镜方向连贯性、景别的递增递减关系和匹配剪辑确定的。The method according to claim 88, wherein the degree of matching between the video clips corresponding to two adjacent video pits is determined according to the continuity of the camera moving directions of the video clips, the increasing or decreasing relationship of their shot scales, and match cuts.
  93. 根据权利要求92所述的方法，其特征在于，所述相邻两个所述视频坑位对应的视频片段的匹配度为利用预先训练的片段匹配模型得到的，所述片段匹配模型能够输出相邻两个所述视频坑位填入的视频片段的匹配度。The method according to claim 92, wherein the degree of matching between the video clips corresponding to two adjacent video pits is obtained by using a pre-trained clip matching model, and the clip matching model is capable of outputting the degree of matching between the video clips filled into two adjacent video pits.
  94. 根据权利要求66所述的方法,其特征在于,所述为每个所述模板的视频坑位匹配视频片段,得到每个所述模板对应的匹配关系,包括:The method according to claim 66, wherein the matching of video clips for the video pits of each of the templates to obtain a matching relationship corresponding to each of the templates, comprising:
    根据所述模板的视频坑位的坑位标签或所述模板的模板标签对所述视频片段进行分类,得到分类后的视频片段;According to the pit label of the video pit of the template or the template label of the template, the video clip is classified to obtain the video clip after classification;
    根据所述分类后的视频片段确定所述模板对应的匹配关系。The matching relationship corresponding to the template is determined according to the classified video segment.
  95. 根据权利要求94所述的方法,其特征在于,所述根据所述模板的视频坑位的坑位标签或所述模板的模板标签对所述视频片段进行分类,包括:The method according to claim 94, wherein the classifying the video clip according to the pit tag of the video pit of the template or the template tag of the template, comprising:
    根据所述视频坑位的坑位标签或所述模板的模板标签对多个所述视频片段进行等级划分,得到多个等级类别的视频片段。According to the pit tag of the video pit or the template tag of the template, a plurality of the video clips are graded to obtain video clips of multiple grade categories.
  96. 根据权利要求95所述的方法,其特征在于,所述多个等级类别的视频片段至少包括第一类别的视频片段、第二类别的视频片段和第三类别的视频片段;The method of claim 95, wherein the video clips of the plurality of hierarchical categories at least include video clips of a first category, video clips of a second category, and video clips of a third category;
    其中,所述第一类别的视频片段的精彩等级大于所述第二类别的视频片段的精彩等级,所述第二类别的视频片段的精彩等级大于所述第三类别的视频片段的精彩等级。The highlight level of the video clips of the first category is greater than the highlight level of the video clips of the second category, and the highlight level of the video clips of the second category is greater than the highlight level of the video clips of the third category.
  97. 根据权利要求96所述的方法,其特征在于,所述精彩等级根据所述视频片段的画面内容和音频内容确定的。The method of claim 96, wherein the highlight level is determined according to picture content and audio content of the video clip.
  98. 根据权利要求94所述的方法,其特征在于,所述根据所述分类后的视频片段确定每个所述模板的视频坑位与所述视频片段的匹配关系,包括:The method according to claim 94, wherein determining the matching relationship between the video pits of each template and the video clips according to the classified video clips comprises:
    根据所述视频坑位的坑位标签或所述模板的模板标签,对所述分类后的视频片段进行排序;以及Sorting the classified video clips according to the pit tag of the video pit or the template tag of the template; and
    根据所述排序结果确定每个所述模板的视频坑位与所述视频片段的匹配关系。The matching relationship between the video pits of each template and the video segment is determined according to the sorting result.
  99. 根据权利要求98所述的方法,其特征在于,所述根据所述分类后的视频片段确定每个所述模板的视频坑位与所述视频片段的匹配关系,包括:The method according to claim 98, wherein determining the matching relationship between the video pits of each template and the video clips according to the classified video clips comprises:
    根据所述排序结果为所述模板的视频坑位分配视频片段,确定每个所述模板的视频坑位与所述视频片段的匹配关系。Allocate video clips to the video pits of the template according to the sorting result, and determine the matching relationship between the video pits of each template and the video clips.
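Editor's note (illustrative only): claims 94 to 99 grade the candidate clips into highlight tiers, sort them, and assign them to pits in order. The sketch below assumes each clip already carries a precomputed highlight score and each pit a priority; those fields, the tier thresholds, and the one-to-one zip assignment are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Clip:
    name: str
    highlight: float  # assumed precomputed from picture and audio content

@dataclass
class Pit:
    index: int
    priority: int  # assumed: lower value = more prominent pit in the template

def tier(clip: Clip) -> int:
    """Grade clips into three highlight tiers (1 is best)."""
    if clip.highlight >= 0.8:
        return 1
    if clip.highlight >= 0.5:
        return 2
    return 3

def assign(pits: List[Pit], clips: List[Clip]) -> Dict[int, str]:
    """Give the most prominent pits the highest-tier clips."""
    ranked_clips = sorted(clips, key=lambda c: (tier(c), -c.highlight))
    ranked_pits = sorted(pits, key=lambda p: p.priority)
    return {p.index: c.name for p, c in zip(ranked_pits, ranked_clips)}

clips = [Clip("a", 0.91), Clip("b", 0.42), Clip("c", 0.77)]
pits = [Pit(0, priority=2), Pit(1, priority=1), Pit(2, priority=3)]
print(assign(pits, clips))  # {1: 'a', 0: 'c', 2: 'b'}
```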
  100. 根据权利要求66所述的方法,其特征在于,所述根据所述匹配得分从所述多个模板中确定推荐模板,包括:The method according to claim 66, wherein the determining a recommended template from the plurality of templates according to the matching score comprises:
    根据所述匹配得分从所述多个模板中确定预设数量个模板;Determine a preset number of templates from the plurality of templates according to the matching score;
    从所述预设数量个模板中任选目标数量个进行组合,得到多个模板组,所述模板组包括目标数量个模板;Selecting a target number of templates from the preset number of templates to combine to obtain a plurality of template groups, and the template group includes a target number of templates;
    根据所述模板组中目标数量个模板的模板类型从多个模板组中确定推荐模板组，将所述推荐模板组中的目标数量个模板作为推荐模板。A recommended template group is determined from the plurality of template groups according to the template types of the target number of templates in each template group, and the target number of templates in the recommended template group are used as the recommended templates.
  101. 根据权利要求100所述的方法,其特征在于,所述根据所述模板组中目标数量个模板的模板类型从多个模板组中确定推荐模板组,包括:The method according to claim 100, wherein the determining a recommended template group from a plurality of template groups according to the template types of the target number of templates in the template group comprises:
    获取多个所述模板组中目标数量个模板的模板类型,根据所述模板类型和所述匹配得分确定多个所述模板组对应的组合得分;Obtain template types of a target number of templates in a plurality of the template groups, and determine a combination score corresponding to a plurality of the template groups according to the template types and the matching scores;
    根据所述组合得分从多个所述模板组中确定推荐模板组。A recommended template group is determined from a plurality of the template groups according to the combined score.
  102. 根据权利要求101所述的方法,其特征在于,所述根据所述模板类型和所述匹配得分确定多个所述模板组对应的组合得分,包括:The method according to claim 101, wherein the determining, according to the template type and the matching score, the combined scores corresponding to a plurality of the template groups, comprising:
    根据所述模板类型确定多个所述模板组内目标数量个模板之间的模板丰富度;Determine the template richness among the target number of templates in the template groups according to the template type;
    根据所述模板组内目标数量个模板之间的模板丰富度和所述模板组内目标数量个模板的匹配得分之和,确定多个所述模板组的组合得分。A combined score of a plurality of the template groups is determined according to the sum of the template richness between the target number of templates in the template group and the matching scores of the target number of templates in the template group.
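Editor's note (illustrative only): claims 100 to 102 score each candidate template group by combining the richness of template types within the group with the sum of the templates' matching scores. The sketch below takes richness to be the number of distinct types and uses an arbitrary weight; the claims fix neither choice, and the template data is hypothetical.

```python
from itertools import combinations

# Hypothetical candidate templates: (name, template type, matching score).
templates = [
    ("T1", "travel", 0.92),
    ("T2", "travel", 0.88),
    ("T3", "family", 0.75),
    ("T4", "sport",  0.70),
]

TARGET = 2  # target number of templates per group

def combination_score(group, richness_weight=0.2):
    richness = len({ttype for _, ttype, _ in group})   # distinct template types
    total_match = sum(score for _, _, score in group)  # sum of matching scores
    return total_match + richness_weight * richness

best_group = max(combinations(templates, TARGET), key=combination_score)
print([name for name, _, _ in best_group])  # ['T1', 'T3']
```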
  103. 根据权利要求100所述的方法，其特征在于，所述推荐视频的数量包括多个，多个所述推荐视频为根据所述推荐模板组中的目标数量个模板得到的。The method according to claim 100, wherein there are a plurality of the recommended videos, and the plurality of recommended videos are obtained according to the target number of templates in the recommended template group.
  104. 根据权利要求103所述的方法,其特征在于,所述方法还包括:The method of claim 103, wherein the method further comprises:
    将多个所述推荐视频推荐给用户,以便用户选择。A plurality of the recommended videos are recommended to the user for selection by the user.
  105. 根据权利要求66所述的方法,其特征在于,所述根据所述匹配得分从所述多个模板中确定推荐模板,包括:The method according to claim 66, wherein the determining a recommended template from the plurality of templates according to the matching score comprises:
    获取所述多个模板的模板类型;obtaining the template types of the multiple templates;
    根据所述多个模板的模板类型和匹配得分确定推荐模板。A recommended template is determined according to template types and matching scores of the plurality of templates.
  106. 根据权利要求105所述的方法,其特征在于,所述根据所述多个模板的模板类型和匹配得分确定推荐模板,包括:The method according to claim 105, wherein the determining a recommended template according to template types and matching scores of the multiple templates comprises:
    根据所述模板类型将所述多个模板划分为多个类型模板组,每个所述类型模板组至少包括一个所述模板;dividing the plurality of templates into a plurality of type template groups according to the template type, each of the type template groups including at least one of the templates;
    根据所述模板的匹配得分从所述多个类型模板组中确定满足类型需求数量的模板;以及determining templates from the plurality of type template groups that satisfy the required number of types according to the matching scores of the templates; and
    根据所述模板的匹配得分从所述多个模板中剩余的模板选择模板,直至选择的模板数量满足模板需求数量。Templates are selected from the remaining templates in the plurality of templates according to the matching scores of the templates, until the number of selected templates meets the required number of templates.
  107. 根据权利要求106所述的方法,其特征在于,所述根据所述多个模板的模板类型和匹配得分确定推荐模板,包括:The method according to claim 106, wherein the determining a recommended template according to template types and matching scores of the multiple templates comprises:
    根据所述模板的匹配得分的高低依次从所述多个模板中选择模板，直至选择的模板类型满足类型需求数量；Selecting templates from the plurality of templates in descending order of matching score, until the types of the selected templates meet the required number of types;
    根据所述模板的匹配得分从所述多个模板中剩余的模板选择模板,直至选 择的模板数量满足模板需求数量。Templates are selected from the remaining templates in the plurality of templates according to the matching scores of the templates, until the number of selected templates meets the required number of templates.
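Editor's note (illustrative only): claims 105 to 107 first make sure the required number of template types is covered and then fill the remaining recommendation slots purely by matching score. One hedged greedy reading, with hypothetical data:

```python
def recommend(templates, type_quota, total_needed):
    """templates: list of (name, type, score).
    type_quota: how many distinct types must be covered first.
    total_needed: how many templates to recommend in total."""
    by_score = sorted(templates, key=lambda t: t[2], reverse=True)
    chosen, covered = [], set()

    # Pass 1: walk down by score until enough distinct types are covered.
    for t in by_score:
        if len(covered) >= type_quota:
            break
        if t[1] not in covered:
            chosen.append(t)
            covered.add(t[1])

    # Pass 2: top up with the best remaining templates regardless of type.
    for t in by_score:
        if len(chosen) >= total_needed:
            break
        if t not in chosen:
            chosen.append(t)
    return [name for name, _, _ in chosen]

templates = [("T1", "travel", 0.92), ("T2", "travel", 0.88),
             ("T3", "family", 0.75), ("T4", "sport", 0.70)]
print(recommend(templates, type_quota=2, total_needed=3))  # ['T1', 'T3', 'T2']
```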
  108. 根据权利要求66至107任一项所述的方法，其特征在于，所述待处理视频素材包括通过手持端拍摄的视频素材、通过可移动平台拍摄的视频素材、从云端服务器获取的视频素材和从本地服务器获取的视频素材中的至少一种。The method according to any one of claims 66 to 107, wherein the video material to be processed comprises at least one of: video material shot by a handheld terminal, video material shot by a movable platform, video material obtained from a cloud server, and video material obtained from a local server.
  109. 根据权利要求67所述的方法,其特征在于,所述对待处理视频素材进行视频分割,包括:The method according to claim 67, wherein the performing video segmentation on the video material to be processed comprises:
    对待处理视频素材进行素材选择,对选择的待处理视频素材进行视频分割。Material selection is performed on the video material to be processed, and video segmentation is performed on the selected video material to be processed.
  110. 根据权利要求109所述的方法,其特征在于,所述对待处理视频素材进行素材选择,包括:The method according to claim 109, wherein the material selection for the video material to be processed comprises:
    根据所述待处理视频素材的素材参数进行素材选择;Material selection is performed according to the material parameters of the video material to be processed;
    其中,所述素材参数包括拍摄时间、拍摄地点、拍摄目标物中的至少一种。Wherein, the material parameters include at least one of shooting time, shooting location, and shooting target.
  111. 根据权利要求109所述的方法,其特征在于,所述对待处理视频素材进行素材选择,包括:The method according to claim 109, wherein the material selection for the video material to be processed comprises:
    根据用户的选择操作对待处理视频素材进行素材选择。Material selection is performed on the video material to be processed according to the user's selection operation.
  112. 根据权利要求109所述的方法,其特征在于,所述对待处理视频素材进行素材选择,包括:The method according to claim 109, wherein the material selection for the video material to be processed comprises:
    根据所述待处理视频素材的素材参数,对所述待处理视频素材进行聚类以实现素材选择;According to the material parameters of the to-be-processed video material, the to-be-processed video material is clustered to realize material selection;
    其中,所述聚类包括时间聚类、地点聚类、目标物聚类中的至少一种。Wherein, the clustering includes at least one of time clustering, location clustering, and target object clustering.
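Editor's note (illustrative only): claims 109 to 112 narrow the material by clustering it, for example by shooting time, location, or subject. The sketch below clusters clips whose shooting times lie close together; the two-hour gap threshold and the clip tuples are assumptions.

```python
from datetime import datetime, timedelta

def cluster_by_time(clips, max_gap=timedelta(hours=2)):
    """clips: list of (name, shooting_time).  Groups clips whose consecutive
    shooting times differ by no more than max_gap."""
    if not clips:
        return []
    ordered = sorted(clips, key=lambda c: c[1])
    clusters, current = [], [ordered[0]]
    for clip in ordered[1:]:
        if clip[1] - current[-1][1] <= max_gap:
            current.append(clip)
        else:
            clusters.append(current)
            current = [clip]
    clusters.append(current)
    return clusters

clips = [("beach.mp4", datetime(2020, 12, 30, 9, 0)),
         ("lunch.mp4", datetime(2020, 12, 30, 12, 30)),
         ("sunset.mp4", datetime(2020, 12, 30, 17, 45))]
print([len(c) for c in cluster_by_time(clips)])  # [1, 1, 1]
```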
  113. 根据权利要求66至107任一项所述的方法,其特征在于,所述方法包括:The method according to any one of claims 66 to 107, wherein the method comprises:
    获取所述待处理视频素材的图像质量;obtaining the image quality of the video material to be processed;
    根据所述待处理视频素材的图像质量对所述待处理视频素材进行废片去除。Removing unusable footage from the video material to be processed according to the image quality of the video material to be processed.
  114. 根据权利要求113所述的方法,其特征在于,所述图像质量包括画面抖动、画面模糊、画面过曝、画面欠曝、图像中无明确场景或图像中无明确主体中的至少一种。The method according to claim 113, wherein the image quality comprises at least one of picture jitter, picture blur, picture overexposure, picture underexposure, no clear scene in the image, or no clear subject in the image.
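Editor's note (illustrative only): claims 113 and 114 drop unusable footage based on image-quality indicators such as blur and exposure. A minimal per-frame check using OpenCV is sketched below; the thresholds are arbitrary assumptions, and the jitter, scene, and subject checks of claim 114 are not covered.

```python
import cv2            # pip install opencv-python
import numpy as np

def frame_is_usable(frame_bgr, blur_thresh=100.0, dark_thresh=30, bright_thresh=225):
    """Reject frames that look blurry, underexposed, or overexposed."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()  # low variance suggests blur
    brightness = gray.mean()
    return sharpness >= blur_thresh and dark_thresh <= brightness <= bright_thresh

# A flat black frame fails both the sharpness and the exposure check.
print(frame_is_usable(np.zeros((720, 1280, 3), dtype=np.uint8)))  # False
```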
  115. 根据权利要求66至107任一项所述的方法,其特征在于,所述根据所述推荐模板对应的匹配关系将所述视频片段填入所述推荐模板的对应视频 坑位,包括:The method according to any one of claims 66 to 107, wherein the video clip is filled into the corresponding video pit of the recommended template according to the corresponding matching relationship of the recommended template, comprising:
    确定填入所述视频坑位的视频片段的视频时长是否大于所述视频坑位的时长;Determine whether the video duration of the video clip filled in the video pit is greater than the duration of the video pit;
    若填入所述视频坑位的视频片段的视频时长大于所述视频坑位的时长,对所述视频片段进行片段提取,得到挑选片段;If the video duration of the video segment filled in the video pit is greater than the duration of the video pit, extract the segment from the video segment to obtain a selected segment;
    其中,所述挑选片段的视频时长小于或等于所述视频坑位的时长。Wherein, the video duration of the selected segment is less than or equal to the duration of the video pit.
  116. 根据权利要求115所述的方法,其特征在于,所述对所述视频片段进行片段提取,得到挑选片段,包括:The method according to claim 115, wherein the performing segment extraction on the video segment to obtain the selected segment, comprising:
    根据所述视频片段的视频元素,对所述视频片段进行片段提取,得到挑选片段。Segment extraction is performed on the video segment according to the video element of the video segment to obtain a selected segment.
  117. 根据权利要求116所述的方法,其特征在于,所述视频元素包括笑脸画面、笑声音频、人物动作、清晰人声、画面构图和美学打分中的至少一种。The method of claim 116, wherein the video elements include at least one of smiley images, laughter audio, character movements, clear human voices, image composition, and aesthetic scores.
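Editor's note (illustrative only): claims 115 to 117 shorten a clip that is longer than its pit by extracting the sub-segment whose video elements (smiles, laughter, motion, composition, and so on) score highest. Assuming per-second element scores are already available, a sliding-window selection could look like this; the scoring itself is outside this sketch.

```python
def pick_subclip(element_scores, pit_len):
    """element_scores: per-second scores of the clip (length = clip duration in s).
    Returns (start, end) of the pit_len-second window with the highest total score."""
    if pit_len >= len(element_scores):
        return 0, len(element_scores)
    window = sum(element_scores[:pit_len])
    best, best_start = window, 0
    for start in range(1, len(element_scores) - pit_len + 1):
        window += element_scores[start + pit_len - 1] - element_scores[start - 1]
        if window > best:
            best, best_start = window, start
    return best_start, best_start + pit_len

scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1]  # hypothetical per-second element scores
print(pick_subclip(scores, pit_len=3))    # (2, 5)
```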
  118. 根据权利要求66至107任一项所述的方法,其特征在于,所述根据所述推荐模板对应的匹配关系将所述视频片段填入所述推荐模板的对应视频坑位,包括:The method according to any one of claims 66 to 107, wherein filling the video clip into the corresponding video slot of the recommended template according to the matching relationship corresponding to the recommended template, comprising:
    根据所述推荐模板的视频坑位与所述视频片段的匹配关系将所述视频片段填入所述推荐模型的对应视频坑位，得到初始视频；According to the matching relationship between the video pits of the recommended template and the video clips, the video clips are filled into the corresponding video pits of the recommended template to obtain an initial video;
    基于所述推荐模板的模板要求对所述初始视频进行图像优化,得到推荐视频。Based on the template requirements of the recommended template, image optimization is performed on the initial video to obtain a recommended video.
  119. 根据权利要求118所述的方法,其特征在于,所述模板要求包括转场设置、加减速设置、贴图特效设置中的至少一种。The method according to claim 118, wherein the template requirements include at least one of transition settings, acceleration and deceleration settings, and texture special effects settings.
  120. 根据权利要求66至107任一项所述的方法,其特征在于,所述方法包括:The method according to any one of claims 66 to 107, wherein the method comprises:
    对所述待处理视频素材进行去重处理。Perform de-duplication processing on the to-be-processed video material.
  121. 根据权利要求120所述的方法，其特征在于，所述去重处理包括相似素材聚类。The method according to claim 120, wherein the de-duplication processing comprises clustering of similar materials.
  122. 一种视频处理方法,其特征在于,用于将待处理视频素材和预设的模板进行合成,包括:A video processing method, characterized in that it is used for synthesizing a to-be-processed video material and a preset template, comprising:
    根据待处理视频素材的视频信息,对所述待处理视频素材进行分割生成多 个视频片段;According to the video information of the video material to be processed, the to-be-processed video material is divided to generate a plurality of video segments;
    根据所述模板的视频坑位的坑位信息,确定待填入所述模板中各个视频坑位的视频片段,得到所述模板对应的匹配关系;According to the pit information of the video pits of the template, determine the video clips to be filled in each video pit in the template, and obtain the matching relationship corresponding to the template;
    根据所述模板对应的匹配关系将所述视频片段填入所述模板的对应视频坑位,得到推荐视频。According to the matching relationship corresponding to the template, the video clip is filled in the corresponding video slot of the template to obtain a recommended video.
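Editor's note (illustrative only): claim 122 summarises the overall flow of this second method: segment the material using its video information, match segments to the template's pits using the pit information, and fill the pits to obtain the recommended video. The glue sketch below only shows that orchestration; the three helper callables are placeholders for the steps claimed elsewhere.

```python
def make_recommended_video(material, template, segment_fn, match_fn, fill_fn):
    """Glue code only; segment_fn/match_fn/fill_fn stand in for the
    segmentation, pit matching, and composition steps described above."""
    clips = segment_fn(material)            # split by camera movement / scene info
    matching = match_fn(template, clips)    # pit -> clip mapping per pit info
    return fill_fn(template, matching)      # render clips into their pits

demo = make_recommended_video(
    material="trip_footage.mp4",
    template={"pits": ["opening", "closing"]},
    segment_fn=lambda m: [f"{m}#part{i}" for i in range(2)],
    match_fn=lambda t, clips: dict(zip(t["pits"], clips)),
    fill_fn=lambda t, matching: matching,
)
print(demo)  # {'opening': 'trip_footage.mp4#part0', 'closing': 'trip_footage.mp4#part1'}
```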
  123. 根据权利要求122所述的方法,其特征在于,所述视频信息包括运镜方向和场景信息中的至少一项。The method according to claim 122, wherein the video information includes at least one of a moving direction and scene information.
  124. 根据权利要求122所述的方法,其特征在于,所述对所述待处理视频素材进行分割生成多个视频片段,包括:The method according to claim 122, wherein the dividing the to-be-processed video material to generate multiple video segments comprises:
    根据所述待处理视频素材的视频信息对所述待处理视频素材进行分割,得到多个第一视频片段;dividing the to-be-processed video material according to the video information of the to-be-processed video material to obtain a plurality of first video segments;
    对所述第一视频片段进行聚类分割,得到多个第二视频片段;Clustering and segmenting the first video clips to obtain a plurality of second video clips;
    其中,将所述第二视频片段作为待填入所述模板的视频坑位的视频片段。Wherein, the second video segment is used as the video segment to be filled in the video pit of the template.
  125. 根据权利要求124所述的方法,其特征在于,所述对所述第一视频片段进行聚类分割之前,所述方法还包括:The method according to claim 124, wherein before the cluster segmentation of the first video segment, the method further comprises:
    确定多个所述第一视频片段中是否存在有视频时长大于预设时长的第一视频片段;determining whether there is a first video clip with a video duration greater than a preset duration in the plurality of first video clips;
    若存在有视频时长大于所述预设时长的第一视频片段,执行所述对所述第一视频片段进行聚类分割的步骤。If there is a first video segment with a video duration greater than the preset duration, the step of clustering and segmenting the first video segment is performed.
  126. 根据权利要求124所述的方法,其特征在于,所述对所述第一视频片段进行聚类分割,包括:The method according to claim 124, wherein the performing cluster segmentation on the first video segment comprises:
    确定滑动窗口和聚类中心,其中,所述滑动窗口用于确定待处理的当前视频帧,所述聚类中心用于确定所述第一视频片段的视频分割点;determining a sliding window and a clustering center, wherein the sliding window is used to determine the current video frame to be processed, and the clustering center is used to determine the video segmentation point of the first video segment;
    基于所述聚类中心,根据所述滑动窗口对所述第一视频片段的视频帧进行聚类分析,确定视频分割点;Based on the cluster center, cluster analysis is performed on the video frames of the first video segment according to the sliding window, and a video segmentation point is determined;
    根据所述视频分割点对所述第一视频片段进行视频分割。Video segmentation is performed on the first video segment according to the video segmentation point.
  127. 根据权利要求126所述的方法,其特征在于,所述聚类中心包括所述第一视频片段的第一帧视频帧的图像特征。The method of claim 126, wherein the cluster centers comprise image features of a first video frame of the first video segment.
  128. 根据权利要求127所述的方法，其特征在于，所述第一视频片段的第一帧视频帧的图像特征是根据预先训练好的图像特征网络模型得到的，所述图像特征网络模型能够输出所述第一视频片段中各个视频帧的图像特征。The method according to claim 127, wherein the image feature of the first video frame of the first video segment is obtained according to a pre-trained image feature network model, and the image feature network model is capable of outputting the image feature of each video frame in the first video segment.
  129. 根据权利要求126所述的方法,其特征在于,所述滑动窗口的大小与所述第一视频片段的时长相关;或者,所述滑动窗口的大小与用户设置的期望分割速度相关。The method according to claim 126, wherein the size of the sliding window is related to the duration of the first video clip; or, the size of the sliding window is related to a desired segmentation speed set by a user.
  130. 根据权利要求126所述的方法，其特征在于，所述滑动窗口的大小等于1。The method according to claim 126, wherein the size of the sliding window is equal to 1.
  131. 根据权利要求126所述的方法,其特征在于,所述基于所述聚类中心,根据所述滑动窗口对所述第一视频片段的视频帧进行聚类分析,确定视频分割点,包括:The method according to claim 126, wherein, based on the cluster center, performing a cluster analysis on the video frames of the first video segment according to the sliding window to determine a video segmentation point, comprising:
    根据所述滑动窗口确定当前视频帧,并确定所述当前视频帧的图像特征与所述聚类中心的相似度;Determine the current video frame according to the sliding window, and determine the similarity between the image feature of the current video frame and the cluster center;
    若所述相似度小于预设阈值,则将所述当前视频帧作为视频分割点,并重新确定聚类中心;If the similarity is less than the preset threshold, the current video frame is used as the video segmentation point, and the cluster center is re-determined;
    根据重新确定的聚类中心,继续确定视频分割点,直至所述第一视频片段的最后一个视频帧。According to the re-determined cluster center, the video segmentation point is continued to be determined until the last video frame of the first video segment.
  132. 根据权利要求131所述的方法,其特征在于,所述确定所述当前视频帧的图像特征与所述聚类中心的相似度,包括:The method according to claim 131, wherein the determining the similarity between the image feature of the current video frame and the cluster center comprises:
    确定所述当前视频帧的图像特征与所述聚类中心之间的余弦相似度。A cosine similarity between the image feature of the current video frame and the cluster center is determined.
  133. 根据权利要求131所述的方法,其特征在于,所述重新确定聚类中心,包括:The method of claim 131, wherein the re-determining the cluster center comprises:
    将所述当前视频帧的图像特征作为重新确定的聚类中心。The image feature of the current video frame is used as the re-determined cluster center.
  134. 根据权利要求131所述的方法,其特征在于,所述确定所述当前视频帧的图像特征与所述聚类中心的相似度之后,所述方法还包括:The method according to claim 131, wherein after determining the similarity between the image feature of the current video frame and the cluster center, the method further comprises:
    若所述相似度大于或等于预设阈值,则更新所述聚类中心;If the similarity is greater than or equal to a preset threshold, update the cluster center;
    根据更新后的所述聚类中心继续确定当前视频帧的图像特征与更新后的所述聚类中心的相似度。Continue to determine the similarity between the image feature of the current video frame and the updated cluster center according to the updated cluster center.
  135. 根据权利要求134所述的方法,其特征在于,所述更新所述聚类中心,包括:The method of claim 134, wherein the updating the cluster centers comprises:
    获取所述当前视频帧的图像特征;obtaining the image feature of the current video frame;
    根据所述当前视频帧的图像特征和所述聚类中心,确定更新后的聚类中心。The updated cluster center is determined according to the image feature of the current video frame and the cluster center.
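Editor's note (illustrative only): claims 124 to 135 split a long segment by sliding over its frames, comparing each frame's image feature with a running cluster centre by cosine similarity, cutting where the similarity falls below a preset threshold, and otherwise folding the frame into the centre. The sketch below substitutes random feature vectors for the claimed image-feature network; the 0.85 threshold and the running-mean update are assumptions.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def split_points(frame_features, threshold=0.85):
    """frame_features: (n_frames, dim) array, one feature per frame
    (the claims obtain these from a pre-trained image-feature network model).
    Returns the indices where a new sub-segment starts."""
    center = frame_features[0].astype(float)  # claim 127: first frame seeds the centre
    count = 1
    cuts = []
    for i, feat in enumerate(frame_features[1:], start=1):
        if cosine(feat, center) < threshold:
            cuts.append(i)                          # claim 131: frame becomes a split point
            center, count = feat.astype(float), 1   # and seeds a new cluster centre
        else:
            # claims 134-135: fold the frame into the running centre (mean update)
            center = (center * count + feat) / (count + 1)
            count += 1
    return cuts

rng = np.random.default_rng(0)
scene_a = rng.normal(0, 1, (30, 128)) + 5   # 30 mutually similar frames
scene_b = rng.normal(0, 1, (30, 128)) - 5   # a clearly different scene
features = np.vstack([scene_a, scene_b])
print(split_points(features))  # a single cut at the scene change, e.g. [30]
```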
  136. 一种视频处理装置,其特征在于,所述视频处理装置包括处理器和存储器;A video processing device, characterized in that the video processing device includes a processor and a memory;
    所述存储器用于存储计算机程序;the memory is used to store computer programs;
    所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:如权利要求1至54任一项所述的视频处理方法。The processor is configured to execute the computer program and implement the video processing method according to any one of claims 1 to 54 when the computer program is executed.
  137. 一种视频处理装置,其特征在于,所述视频处理装置包括处理器和存储器;A video processing device, characterized in that the video processing device comprises a processor and a memory;
    所述存储器用于存储计算机程序;the memory is used to store computer programs;
    所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:如权利要求55至65任一项所述的视频处理方法。The processor is configured to execute the computer program and implement the video processing method according to any one of claims 55 to 65 when the computer program is executed.
  138. 一种视频处理装置,其特征在于,所述视频处理装置包括处理器和存储器;A video processing device, characterized in that the video processing device includes a processor and a memory;
    所述存储器用于存储计算机程序;the memory is used to store computer programs;
    所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:如权利要求66至121任一项所述的视频处理方法。The processor is configured to execute the computer program and implement the video processing method according to any one of claims 66 to 121 when the computer program is executed.
  139. 一种视频处理装置,其特征在于,所述视频处理装置包括处理器和存储器;A video processing device, characterized in that the video processing device includes a processor and a memory;
    所述存储器用于存储计算机程序;the memory is used to store computer programs;
    所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:如权利要求122至135任一项所述的视频处理方法。The processor is configured to execute the computer program, and when executing the computer program, implement the video processing method according to any one of claims 122 to 135.
  140. 一种终端设备,其特征在于,所述终端设备包括处理器和存储器;A terminal device, characterized in that the terminal device includes a processor and a memory;
    所述存储器用于存储计算机程序;the memory is used to store computer programs;
    所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:如权利要求1至54任一项所述的视频处理方法。The processor is configured to execute the computer program and implement the video processing method according to any one of claims 1 to 54 when the computer program is executed.
  141. 一种终端设备,其特征在于,所述终端设备包括处理器和存储器;A terminal device, characterized in that the terminal device includes a processor and a memory;
    所述存储器用于存储计算机程序;the memory is used to store computer programs;
    所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现: 如权利要求55至65任一项所述的视频处理方法。The processor is configured to execute the computer program, and when executing the computer program, implement: the video processing method according to any one of claims 55 to 65.
  142. 一种终端设备,其特征在于,所述终端设备包括处理器和存储器;A terminal device, characterized in that the terminal device includes a processor and a memory;
    所述存储器用于存储计算机程序;the memory is used to store computer programs;
    所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:The processor is configured to execute the computer program and when executing the computer program, realize:
    如权利要求66至121任一项所述的视频处理方法。The video processing method according to any one of claims 66 to 121.
  143. 一种终端设备,其特征在于,所述终端设备包括处理器和存储器;A terminal device, characterized in that the terminal device includes a processor and a memory;
    所述存储器用于存储计算机程序;the memory is used to store computer programs;
    所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现:如权利要求122至135任一项所述的视频处理方法。The processor is configured to execute the computer program, and when executing the computer program, implement the video processing method according to any one of claims 122 to 135.
  144. 一种计算机可读存储介质，其特征在于，所述计算机可读存储介质存储有计算机程序，所述计算机程序被处理器执行时使所述处理器实现如权利要求1至135任一项所述的视频处理方法的步骤。A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the computer program causes the processor to implement the steps of the video processing method according to any one of claims 1 to 135.
PCT/CN2020/142432 2020-12-31 2020-12-31 Video processing method, video processing apparatus, terminal device, and storage medium WO2022141533A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080075426.7A CN114731458A (en) 2020-12-31 2020-12-31 Video processing method, video processing apparatus, terminal device, and storage medium
PCT/CN2020/142432 WO2022141533A1 (en) 2020-12-31 2020-12-31 Video processing method, video processing apparatus, terminal device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/142432 WO2022141533A1 (en) 2020-12-31 2020-12-31 Video processing method, video processing apparatus, terminal device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022141533A1 true WO2022141533A1 (en) 2022-07-07

Family

ID=82229974

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/142432 WO2022141533A1 (en) 2020-12-31 2020-12-31 Video processing method, video processing apparatus, terminal device, and storage medium

Country Status (2)

Country Link
CN (1) CN114731458A (en)
WO (1) WO2022141533A1 (en)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115134646B (en) * 2022-08-25 2023-02-10 荣耀终端有限公司 Video editing method and electronic equipment
CN118233712A (en) * 2022-12-19 2024-06-21 北京字跳网络技术有限公司 Video generation method, device, equipment and storage medium
CN115695944B (en) * 2022-12-30 2023-03-28 北京远特科技股份有限公司 Vehicle-mounted image processing method and device, electronic equipment and medium


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8103150B2 (en) * 2007-06-07 2012-01-24 Cyberlink Corp. System and method for video editing based on semantic data
CN110532426A (en) * 2019-08-27 2019-12-03 新华智云科技有限公司 It is a kind of to extract the method and system that Multi-media Material generates video based on template

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120317598A1 (en) * 2011-06-09 2012-12-13 Comcast Cable Communications, Llc Multiple Video Content in a Composite Video Stream
CN104735468A (en) * 2015-04-03 2015-06-24 北京威扬科技有限公司 Method and system for synthesizing images into new video based on semantic analysis
CN110324676A (en) * 2018-03-28 2019-10-11 腾讯科技(深圳)有限公司 Data processing method, media content put-on method, device and storage medium
CN111357277A (en) * 2018-11-28 2020-06-30 深圳市大疆创新科技有限公司 Video clip control method, terminal device and system
CN110730381A (en) * 2019-07-12 2020-01-24 北京达佳互联信息技术有限公司 Method, device, terminal and storage medium for synthesizing video based on video template

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115100725A (en) * 2022-08-23 2022-09-23 浙江大华技术股份有限公司 Object recognition method, object recognition apparatus, and computer storage medium
CN115100725B (en) * 2022-08-23 2022-11-22 浙江大华技术股份有限公司 Object recognition method, object recognition apparatus, and computer storage medium
CN116866498A (en) * 2023-06-15 2023-10-10 天翼爱音乐文化科技有限公司 Video template generation method and device, electronic equipment and storage medium
CN116866498B (en) * 2023-06-15 2024-04-05 天翼爱音乐文化科技有限公司 Video template generation method and device, electronic equipment and storage medium
CN116980717A (en) * 2023-09-22 2023-10-31 北京小糖科技有限责任公司 Interaction method, device, equipment and storage medium based on video decomposition processing
CN116980717B (en) * 2023-09-22 2024-01-23 北京小糖科技有限责任公司 Interaction method, device, equipment and storage medium based on video decomposition processing
CN117278801A (en) * 2023-10-11 2023-12-22 广州智威智能科技有限公司 AI algorithm-based student activity highlight instant shooting and analyzing method
CN117278801B (en) * 2023-10-11 2024-03-22 广州智威智能科技有限公司 AI algorithm-based student activity highlight instant shooting and analyzing method

Also Published As

Publication number Publication date
CN114731458A (en) 2022-07-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20967871

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20967871

Country of ref document: EP

Kind code of ref document: A1