CN114025232B - Video material cutting method, device, terminal equipment and readable storage medium - Google Patents


Info

Publication number
CN114025232B
CN114025232B (application CN202111232378.0A)
Authority
CN
China
Prior art keywords
video, target video, target, clips, segment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111232378.0A
Other languages
Chinese (zh)
Other versions
CN114025232A (en)
Inventor
王传鹏
张昕玥
张婷
孙尔威
李腾飞
周惠存
陈春梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hard Link Network Technology Co ltd
Original Assignee
Shanghai Hard Link Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hard Link Network Technology Co ltd
Priority to CN202111232378.0A
Publication of CN114025232A
Application granted
Publication of CN114025232B
Legal status: Active


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/44008 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/44016 Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N21/4781 Games
    • H04N21/8549 Creating video summaries, e.g. movie trailer

Landscapes

  • Engineering & Computer Science
  • Multimedia
  • Signal Processing
  • Databases & Information Systems
  • Computer Security & Cryptography
  • Television Signal Processing For Recording

Abstract

The invention discloses a video material cutting method and apparatus, a terminal device, and a readable storage medium. The method comprises the following steps: acquiring a plurality of video clips to be clipped; identifying whether each video clip to be clipped contains a target video element; determining clipping points of the video clips to be clipped according to the target video element, thereby obtaining a plurality of first target video clips containing the target video element and a plurality of second target video clips not containing it; matching the first target video clips with the second target video clips, and taking those that satisfy a preset matching relationship as candidate clips to be spliced; and splicing the candidate clips to obtain a third target video clip. The invention can quickly find, among many video clips, the clips that satisfy a preset splicing requirement (for example, clips with the same characters, scenario clips with the same scene, or gameplay clips), cut them out, and make the splicing of the cut-out clips more coherent.

Description

Video material cutting method, device, terminal equipment and readable storage medium
Technical Field
The present invention relates to the field of video processing technologies, and in particular, to a method and apparatus for cutting video material, a terminal device, and a readable storage medium.
Background
Video editing selects, decomposes, and splices the large amount of footage shot during film production into a coherent, smooth work with a clear meaning, a distinct theme, and artistic appeal. At present, the cutting before video splicing is performed manually: a user must identify similar elements in the video material (such as characters, scenario clips with the same scene, and gameplay clips) and then make the corresponding cuts. As video material multiplies over time, manually cutting the material required for splicing demands a very large workload and is time-consuming and labor-intensive.
Disclosure of Invention
The embodiments of the invention provide a video material cutting method and apparatus, a terminal device, and a readable storage medium, which can automatically and quickly find, among a plurality of video clips, the clips that satisfy a preset splicing requirement and cut them out, make the splicing of the cut-out clips more coherent, and improve the efficiency of cutting and splicing video material.
An embodiment of the present invention provides a method for cutting video material, including:
acquiring a plurality of video clips to be clipped;
identifying whether a video clip to be clipped has a target video element;
Determining clipping points of video clips to be clipped according to the target video elements, and obtaining a plurality of first target video clips containing the target video elements and a plurality of second target video clips not containing the target video elements;
matching the first target video segment with the second target video segment, and taking the first target video segment and the second target video segment which meet the preset matching relation as candidate segments to be spliced;
And splicing the candidate segments to be spliced to obtain a third target video segment.
As an improvement of the above solution, the determining, according to the target video element, clipping points of the video clips to be clipped, and obtaining a plurality of first target video clips containing the target video element and a plurality of second target video clips not containing the target video element, includes:
If the target video element exists, consecutive runs of a preset number of frames containing the target video element form a plurality of first target video clips, and the remaining consecutive video frames form a plurality of second target video clips.
As an improvement of the above solution, the matching the first target video segment with the second target video segment, and using the first target video segment and the second target video segment that satisfy a preset matching relationship as candidate segments to be spliced includes:
Identifying characters and scenes of the first target video clips and the second target video clips to obtain character labels and scene labels corresponding to the first target video clips and the second target video clips;
And taking the first target video segment and the second target video segment with the same character tag and scene tag as candidate segments to be spliced.
As an improvement of the above solution, the splicing the candidate segments to be spliced to obtain a third target video segment includes:
judging whether the number of the candidate fragments to be spliced is larger than a preset number threshold;
if yes, selecting a first target video segment and a second target video segment with highest matching degree from the candidate segments to be spliced to splice, and obtaining a third target video segment;
and if not, splicing the candidate segments to be spliced to obtain a third target video segment.
As an improvement of the above solution, the obtaining a plurality of video clips to be clipped includes:
Determining scene conversion points corresponding to different scenes of the video material to be cut according to the color characteristics and the structural characteristics of each video frame in the video material to be cut; wherein, the lens angles corresponding to different scenes are different;
and segmenting the video material to be cut according to the scene transition point to obtain a plurality of video clips to be cut.
As an improvement of the above solution, the determining, according to the color feature and the structural feature of each video frame in the video material to be cut, a scene transition point corresponding to a different scene of the video material to be cut includes:
respectively extracting color characteristics and structural characteristics of each video frame in the video material to be cut;
Calculating the feature similarity of any two adjacent video frames in the video material to be cut according to the color feature and the structural feature of each video frame;
And determining a frame node when the feature similarity of two adjacent video frames in the video material to be cut meets a preset condition as a scene transition point.
As an improvement of the above solution, the splicing the candidate segments to be spliced to obtain a third target video segment includes:
Carrying out overall optimization on the color indexes of the candidate segments to be spliced;
And splicing the candidate segments to be spliced after the color index optimization to obtain a third target video segment.
Another embodiment of the present invention correspondingly provides a video material clipping apparatus, including:
The video clip acquisition module is used for acquiring a plurality of video clips to be clipped;
the video element identification module is used for identifying whether a video clip to be clipped has a target video element or not;
The video clip cutting module is used for determining clipping points of video clips to be clipped according to the target video elements to obtain a plurality of first target video clips containing the target video elements and a plurality of second target video clips not containing the target video elements;
The video segment matching module is used for matching the first target video segment with the second target video segment, and taking the first target video segment and the second target video segment which meet the preset matching relation as candidate segments to be spliced;
and the video segment splicing module is used for splicing the candidate segments to be spliced to obtain a third target video segment.
Another embodiment of the present invention provides a terminal device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the video material cutting method according to the embodiment of the present invention when executing the computer program.
Another embodiment of the present invention provides a computer readable storage medium, which includes a stored computer program, wherein when the computer program runs, the device on which the computer readable storage medium is located is controlled to execute the video material cutting method described in the embodiment of the present invention.
Compared with the prior art, the video material cutting method, the video material cutting device, the terminal equipment and the computer readable storage medium disclosed by the embodiment of the invention have the following advantages:
Target video element recognition is performed on a plurality of video clips to be clipped, and clipping points are determined according to the target video element, yielding a plurality of first target video clips containing the target video element and a plurality of second target video clips not containing it; the first target video clips are then matched with the second target video clips, and those satisfying a preset matching relationship are taken as candidate clips to be spliced; finally, the candidate clips are spliced to obtain a third target video clip. On this basis, the video clips satisfying a preset splicing requirement (for example, clips with the same characters, scenario clips with the same scene, or gameplay clips) can be automatically found among a massive number of video clips and cut out, the splicing of the cut-out clips is more coherent, and the efficiency of cutting and splicing video material is higher.
Drawings
Fig. 1 is a flow chart of a method for cutting video material according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a clipping process for 25 frames of video material to be clipped according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a video material clipping apparatus according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, fig. 1 is a flowchart of a method for cutting video material according to an embodiment of the present invention, including the following steps:
s1, acquiring a plurality of video clips to be clipped.
In this embodiment, the plurality of video clips to be clipped are pre-cut from the same video material to be cut (they may, of course, also come from a plurality of video materials). The video material to be cut may be a complete video, or a plurality of video segments spliced into a complete video. If the material contains only one scene, no preliminary scene segmentation is required. If it contains a plurality of scenes, the material is first given a preliminary scene segmentation, and the expected cut positions are frame nodes where the shot direction changes or where the color features or structural features change substantially. It will be appreciated that a color feature is a global feature describing the surface properties of the scene to which an image or image region corresponds; color features can be represented by a color histogram, color moments, and the like. Structural features characterize the positional relations among the parts of an image and can be represented by pixel-matrix features. It should be noted that "a plurality of" means two, three, or more than three, so the invention is applicable to video materials with many scenes.
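As a minimal sketch of the color feature described above, the following computes a quantized color histogram per frame and compares two frames by histogram intersection. The pixel layout (a flat list of RGB tuples) and the bin count are illustrative assumptions, not part of the patent.

```python
from collections import Counter

def color_histogram(frame, bins=4, levels=256):
    """Quantize each RGB channel into `bins` buckets and count pixels.

    `frame` is a list of (r, g, b) tuples; the result is a normalized
    histogram usable as the global color feature described above.
    """
    step = levels // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in frame)
    total = len(frame)
    return {bucket: n / total for bucket, n in counts.items()}

def histogram_similarity(h1, h2):
    """Histogram intersection: 1.0 for identical color distributions."""
    return sum(min(h1.get(k, 0.0), h2.get(k, 0.0)) for k in set(h1) | set(h2))
```

In practice the structural (pixel-matrix) feature would be compared alongside this color feature; the histogram alone is shown here for brevity.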
S2, identifying whether the video clip to be clipped has a target video element.
In this embodiment, some of the video clips to be clipped contain the target video element (e.g., a target game character or a target game prop), and the others do not. As an example, the target video element is a tile of a match-three game. The frames of a video clip that contain the tiles belong to the match-three gameplay portion, and the frames that do not contain the tiles are other types of segments, such as game scenario segments.
Specifically, the content of the video elements in each video clip to be clipped is first recognized by a pre-trained network model, and it is then judged whether the recognized video element is a predetermined target video element (for example, a target game character or a target game prop).
S3, determining clipping points of the video clips to be clipped according to the target video elements, and obtaining a plurality of first target video clips containing the target video elements and a plurality of second target video clips not containing the target video elements.
Specifically, the step S3 includes:
s30, if the target video element exists, forming a plurality of first target video fragments by continuous preset number of frames of the video element, and forming a plurality of second target video fragments by the rest continuous video frames.
Specifically, video frames containing the same target video element have a high degree of feature similarity. The content of each video frame in each video clip to be clipped can therefore be recognized by an artificial intelligence algorithm (for example, a trained image classification model), the target video element judged from the recognition result, and the frames containing the same target video element identified, so that the corresponding clipping points are determined: the first and last frames of a consecutive run of frames containing the target video element are taken as clipping points, so that the run is cut out as a first target video clip, while the remaining consecutive video frames of the clip constitute second target video clips.
For ease of understanding, an example follows. Many current games combine match-three gameplay with RPG elements, and for such games users often care about advertising works composed of match-three gameplay clips and scenario clips. Then, as shown in fig. 2, take a video material of 10 video frames as an example, where the target video element is a tile of the match-three game. First it is determined whether each video frame contains the image content "match-three tile": frames 3-5 and 8-10 contain the target element, while frames 1-2 and 6-7 do not. The clipping points corresponding to the match-three tiles are therefore frames 3, 5, 8 and 10: the segment formed by frames 1-2 is a second target video clip, frames 3-5 form a first target video clip, frames 6-7 form a second target video clip, and frames 8-10 form a first target video clip.
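The grouping of frames into first and second target video clips can be sketched as follows, under the assumption that an upstream recognizer has already produced a per-frame flag indicating whether the target video element is present:

```python
from itertools import groupby

def split_by_element(contains_element):
    """Group consecutive frames by whether they contain the target element.

    `contains_element[i]` is True when frame i+1 (1-indexed, as in the
    example above) contains the target element. Returns (first_segments,
    second_segments, clip_points): segments are (start, end) frame ranges,
    clip_points the first/last frames of each element-containing run.
    """
    first, second, clip_points = [], [], []
    pos = 1
    for has_elem, run in groupby(contains_element):
        length = len(list(run))
        seg = (pos, pos + length - 1)
        if has_elem:
            first.append(seg)
            clip_points.extend(seg)
        else:
            second.append(seg)
        pos += length
    return first, second, clip_points
```

Applied to the 10-frame example above (element in frames 3-5 and 8-10), this yields clip points 3, 5, 8, 10 and the same first/second segment ranges.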
And S4, matching the first target video segment with the second target video segment, and taking the first target video segment and the second target video segment which meet the preset matching relation as candidate segments to be spliced.
As an example, if the preset matching relationship is that the labels of the video clips are identical (label matching can be understood as a matching degree of 100% between the first and second target video clips, i.e., identical labels give the highest matching degree), step S4 includes the following substeps:
S40, identifying characters and scenes of the first target video clips and the second target video clips to obtain character labels and scene labels corresponding to the first target video clips and the second target video clips;
S41, taking the first target video segment and the second target video segment with the same character tag and scene tag as candidate segments to be spliced.
Specifically, characters and scenes of the first target video segments and the second target video segments are identified through a pre-trained network model, so that character labels and scene labels (for example, target game scene labels or target game character labels) corresponding to the first target video segments and the second target video segments are obtained.
As another example, the preset matching relationship is that the matching degree of the first target video segment and the second target video segment is greater than a preset matching threshold. Wherein, the matching degree of the two can be calculated through SIFT feature matching and other algorithms.
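The patent names SIFT feature matching as one way to compute the matching degree. The sketch below uses the same idea with plain numeric vectors standing in for SIFT descriptors (which would in practice come from an image library); the Euclidean distance threshold is an assumed parameter:

```python
def matching_degree(desc_a, desc_b, max_dist=1.0):
    """Fraction of descriptors in `desc_a` that find a close match in `desc_b`.

    A stand-in for SIFT feature matching: descriptors are numeric tuples
    and `max_dist` is an illustrative Euclidean distance threshold.
    """
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

    if not desc_a:
        return 0.0
    matched = sum(1 for u in desc_a if any(dist(u, v) <= max_dist for v in desc_b))
    return matched / len(desc_a)
```

A pair of clips is then a candidate for splicing when this degree exceeds the preset matching threshold.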
And S5, splicing the candidate segments to be spliced to obtain a third target video segment.
In summary, in the video material cutting method provided by the embodiment of the invention, target video element recognition is first performed on a plurality of video clips to be clipped, and clipping points are determined according to the target video element, yielding a plurality of first target video clips containing the target video element and a plurality of second target video clips not containing it; the first target video clips are then matched with the second target video clips, and those satisfying a preset matching relationship are taken as candidate clips to be spliced; finally, the candidate clips are spliced to obtain a third target video clip. On this basis, the video clips satisfying a preset splicing requirement (for example, clips with the same characters, scenario clips with the same scene, or gameplay clips) can be automatically found among a massive number of clips and cut out, the splicing of the cut-out clips is more coherent, and the efficiency of cutting and splicing video material is higher.
In the above embodiment, the step S5 includes, as an example:
S50, judging whether the number of the candidate fragments to be spliced is larger than a preset number threshold;
s51, if so, selecting a first target video segment and a second target video segment with highest matching degree from the candidate segments to be spliced, and splicing to obtain a third target video segment;
And S52, if not, splicing the candidate segments to be spliced to obtain a third target video segment.
In this embodiment, when the number of candidate segments to be spliced in the same type is too large, the first target video segment and the second target video segment with the highest matching degree can be selected from the candidate segments to be spliced, so that the splicing content of the video material can be ensured not to be too redundant.
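Steps S50-S52 can be sketched directly; the candidate-pair representation and the threshold value are illustrative assumptions:

```python
def select_for_splicing(candidate_pairs, max_count=3):
    """If there are more candidate pairs than the preset number threshold,
    keep only the (first, second) pair with the highest matching degree;
    otherwise splice every candidate pair (steps S50-S52).

    `candidate_pairs` is a list of (first_seg, second_seg, degree) tuples.
    """
    if len(candidate_pairs) > max_count:
        best = max(candidate_pairs, key=lambda p: p[2])
        return [best]
    return list(candidate_pairs)
```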
In a specific embodiment, the step S1 includes the following substeps:
s11, determining scene conversion points corresponding to different scenes of the video material to be cut according to the color characteristics and the structural characteristics of each video frame in the video material to be cut; wherein, the lens angles corresponding to different scenes are different;
S12, segmenting the video material to be cut according to the scene transition point to obtain a plurality of video clips to be cut.
By detecting the video material to be cut frame by frame and extracting the color features and structural features of each frame, it can be judged whether the scene changes: if the color and structural features of two adjacent frames change substantially, the shot angle, and thus the scene, is judged to have changed, and the frame node between the two adjacent frames is taken as a scene transition point; if they do not change substantially, the shot angle and scene are unchanged, and no scene transition point is placed between the two frames. After the scene transition points of the material have been determined, the material can be given a preliminary scene segmentation according to those points, yielding a plurality of single-scene video clips to be clipped.
Illustratively, as shown in fig. 2 (a), the video material to be cut is a complete clip of 25 video frames. The material is first examined frame by frame to determine the color features and structural features of each frame. Suppose the color and structural features change substantially at 4 frame nodes: between frames 2 and 3, frames 8 and 9, frames 15 and 16, and frames 23 and 24. These are determined, in order, to be scene transition point ①, scene transition point ②, scene transition point ③, and scene transition point ④. According to these 4 scene transition points, the 25-frame material is automatically cut into 5 single-scene video clips to be clipped: video clip 1, video clip 2, video clip 3, video clip 4, and video clip 5.
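The segmentation step above reduces to splitting a frame range at the detected transition points; a minimal sketch, with transition points given as the frame number after which each cut falls:

```python
def split_at_transitions(total_frames, transition_after):
    """Cut a material of `total_frames` frames into single-scene clips.

    `transition_after` lists frame numbers after which a scene transition
    point was detected (a cut between frames 2 and 3 is recorded as 2).
    Returns (start, end) frame ranges, 1-indexed.
    """
    clips, start = [], 1
    for cut in sorted(transition_after):
        clips.append((start, cut))
        start = cut + 1
    clips.append((start, total_frames))
    return clips
```

For the 25-frame example with transitions after frames 2, 8, 15, and 23, this produces the 5 clips described above.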
Specifically, the step S11 includes the following substeps:
S110, respectively extracting color characteristics and structural characteristics of each video frame in the video material to be cut;
S111, calculating the feature similarity of any two adjacent video frames in the video material to be cut according to the color feature and the structural feature of each video frame;
And S112, determining a frame node at which the feature similarity of two adjacent video frames in the video material to be cut meets a preset condition as a scene transition point.
In this embodiment, frame-by-frame detection is performed on the content of the whole video material, color features and structural features of each frame are extracted, feature similarity between two adjacent video frames is calculated, and if the feature similarity is smaller than a preset threshold, it is determined that scenes of the two adjacent video frames are switched; if the feature similarity is larger than or equal to a preset threshold, judging that the scenes of the two adjacent video frames are not switched.
In a more specific embodiment, scene segmentation is performed at positions where the difference between two adjacent video frames shows a local peak and meets the rule requirement; that is, when the difference curve of adjacent video frames reaches a local maximum, the corresponding frame node is determined to be a scene transition point.
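The local-maximum rule can be sketched as a simple peak search over the difference curve; the significance threshold is an assumed parameter not specified in the patent:

```python
def transition_points(diffs, min_diff=0.3):
    """Find frame nodes where the adjacent-frame difference curve reaches a
    local maximum at or above `min_diff` (an assumed threshold).

    `diffs[i]` is the difference between frames i+1 and i+2 (1-indexed),
    so a peak at index i marks a scene transition after frame i+1.
    """
    points = []
    for i in range(1, len(diffs) - 1):
        if diffs[i] > diffs[i - 1] and diffs[i] > diffs[i + 1] and diffs[i] >= min_diff:
            points.append(i + 1)  # frame number after which the cut falls
    return points
```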
Specifically, the person recognition of the plurality of first target video clips and the plurality of second target video clips in step S40 may be performed by:
It can be appreciated that the number of characters in a target video segment may be 0, 1,2 or more than 2, for example, if all video frames in a certain target video segment do not include characters, then the number of characters in the target video segment is 0; for another example, if some of the video frames of a certain target video segment do not contain characters and other of the video frames contain 1,2 or more than 2 characters, then the characters of the target video segment correspond to 1,2 or more than 2 characters.
In this embodiment, the number of people contained in each target video clip and the role category of each person can be identified by a preset target recognition model. The target person of each target video clip is then determined according to a first preset rule. It should be appreciated that the target person may be a single character or a group of characters, such as a duo or a trio.
In one example, the first preset rule includes: (1) the person appearing most frequently in the corresponding target video clip, such as the main character, is determined as the target person of that clip; (2) one or more specific characters, selected by system default or by user definition, are determined as the target persons of the target video clip.
Correspondingly, the target characters in each target video segment are determined according to different first preset rules.
If the first preset rule (1) is adopted, the frequency of appearance of each person in the corresponding target video clip is counted to determine the main character and the main character's role category.
Specifically, in the recognition stage, the preset target recognition model extracts character sub-images from the video frames of each target video segment, classifies them into specified categories according to the features of the sub-images, and then applies overall category frequency statistics and rules to reach the final character-category decision.
More specifically, video frames near the clip point can be selected; a person is detected in the full image of each video frame, a sub-image is cropped out, and features are computed for category classification. For example, within a target video segment, the number of times a certain person appears, the number of frames in which the person appears, the percentage of those frames in the total frame count of the segment, and so on are summarized; combined with the target rule for finding the protagonist, the protagonist of the target video segment and its character category, such as uniformed man, Ada or zombie, are determined.
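The frequency statistics for rule (1) can be sketched as below. The helper name `pick_protagonist`, the per-frame detection format, and the minimum frame ratio are illustrative assumptions, not part of the patented model:

```python
from collections import Counter

def pick_protagonist(frame_detections, min_frame_ratio=0.2):
    """frame_detections: one set per frame of (person_id, category) pairs
    produced by the detection/classification stage."""
    appear_frames = Counter()
    for detections in frame_detections:
        for person in detections:
            appear_frames[person] += 1  # count frames in which this person appears
    if not appear_frames:
        return None  # segment contains no characters
    person, count = appear_frames.most_common(1)[0]
    # rule: the protagonist must appear in a minimum share of the segment's frames
    if count / len(frame_detections) >= min_frame_ratio:
        return person
    return None
```

The returned pair gives both the protagonist identity and its character category in one lookup.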
If first preset rule (2) is adopted, the specific characters in each target video segment are identified directly and determined to be the target persons.
After the target person of each target video segment is determined, continuity of the characters across consecutive target video segments is optimized as follows: judge whether the target persons of all target video segments are the same. If not, determine the final target person of all target video segments according to a second preset rule and keep only the video frames in which that final target person appears, so that a uniform final target person is obtained and the character appears continuously in every target video segment. If the target persons are already the same, no frame screening is required.
In one example, the second preset rule includes: (1) determining the target person with the highest frequency across all target video clips as the final target person of all target video clips; or (2) determining a target person selected by system default or by the user as the final target person of all target video clips.
Taking 3 target video clips as an example: the persons in each clip are identified by the preset target recognition model, and the target person of each clip is then determined by the first preset rule. Person 1 appears most frequently in target video clip 1, so its target person is person 1; person 2 appears most frequently in clip 2, so its target person is person 2; person 1 appears most frequently in clip 3, so its target person is person 1. Because the target persons of the 3 clips are not all the same, the second preset rule determines the final target person to be person 1, the person with the highest overall frequency. By selecting the video frames in which person 1 appears in all clips, target video clips 1 and 3, in which person 1 appears continuously, are obtained.
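The cross-segment decision in this example can be sketched as follows; the helper is hypothetical and assumes variant (1) of the second preset rule:

```python
from collections import Counter

def final_target_person(segment_targets):
    """segment_targets: the target person chosen for each segment.
    Returns (final target person, whether frame screening is needed)."""
    if len(set(segment_targets)) == 1:
        return segment_targets[0], False  # already consistent, no screening needed
    # second preset rule (1): the most frequent target person wins
    winner, _ = Counter(segment_targets).most_common(1)[0]
    return winner, True
```

With the three clips above, the most frequent target (person 1) is chosen and screening is flagged.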
In a specific embodiment, the target recognition model must first be trained on character sub-image extraction and classification using the original video to obtain the preset target recognition model. The corresponding training process is as follows:
In the training stage, the positions of character sub-images in a designated game are first extracted with an object detection model (such as a YOLO model) and the sub-images are cropped out; features are then selected and classified, and character categories are defined for the feature points expected to be matched, e.g., male characters include a police figure, an equipped-soldier figure, and so on. To increase the total size of the data set and to balance the amount of data across categories (the frequencies of the different roles in the material differ considerably), the data set is augmented, and its quality is improved by stretching, flipping, adding noise, and similar methods. An image classification model (such as a ResNet model) is then trained and optimized on the created data set.
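The augmentation step (stretching, flipping, adding noise) might look like the sketch below. It is a NumPy-only illustration under assumed parameters, not the training pipeline of the embodiment:

```python
import numpy as np

def augment(img, rng):
    """Return the image plus three augmented variants:
    horizontal flip, 1.2x width stretch, and Gaussian noise."""
    variants = [img, np.flip(img, axis=1)]           # original + horizontal flip
    w = img.shape[1]
    idx = (np.arange(int(w * 1.2)) / 1.2).astype(int)
    variants.append(img[:, idx])                      # nearest-neighbour width stretch
    noisy = img.astype(np.float32) + rng.normal(0.0, 10.0, img.shape)
    variants.append(np.clip(noisy, 0, 255).astype(np.uint8))  # additive noise
    return variants
```

In practice a library such as torchvision or OpenCV would supply these transforms; the point is that each source sub-image yields several balanced training samples.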
As a supplementary explanation of one case of the matching-degree calculation above: when most of the image foreground content of the frames of a first target video segment and a second target video segment is substantially the same but their image scenes (backgrounds) are not, and the two segments are matched not by the label method described above but by computing the matching degree of image foreground features, then after the video foreground content (e.g., the characters) of the two segments has been successfully feature-matched and candidate segments to be spliced have been obtained, it can further be determined whether the scene changes between two adjacent candidate segments to be spliced.
Specifically, the length of each candidate segment to be spliced is inferred from factors such as the judged duration of a transition, so that each candidate segment covers several frames near the clip point between two adjacent candidate segments, for example 5 seconds before and after the clip point. The scene change within those 5 seconds on either side of the clip point is then compared: if the change is judged to be large, a transition occurs between the two adjacent candidate segments; if the scenes are judged to be similar, there is no transition. The scene change before and after the clip point of the next target video segment is then examined, until the scene changes around all clip points have been judged.
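One simple way to judge "scene change around a clip point" is sketched below, using histogram intersection as the similarity measure; the embodiment does not fix a specific algorithm, so the measure and threshold are assumptions:

```python
import numpy as np

def scene_changed(frames_before, frames_after, threshold=0.5):
    """Compare mean grayscale histograms of the frames just before
    and just after a clip point (e.g. 5 seconds on each side)."""
    def mean_hist(frames):
        hists = [np.histogram(f, bins=32, range=(0, 256))[0] for f in frames]
        h = np.mean(hists, axis=0)
        return h / h.sum()
    h_before = mean_hist(frames_before)
    h_after = mean_hist(frames_after)
    overlap = np.minimum(h_before, h_after).sum()  # histogram intersection, in [0, 1]
    return bool(overlap < threshold)  # low overlap -> large scene change (transition)
```

An OpenCV background detection algorithm, as mentioned below, would be a more robust drop-in for this comparison.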
A preset video linking segment is then inserted between every two adjacent candidate segments whose scenes change, so that the candidate segments in which the final target person appears continuously also become continuous in scene, yielding candidate segments to be spliced (i.e., the target video segments to be spliced) with scene continuity. A video linking segment is a preset transition video that bridges two candidate segments with different scenes. It should be noted that the scene of a candidate segment can be recognized with an image background feature recognition algorithm (such as an OpenCV background detection algorithm) to obtain the candidate segment's scene tag.
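The insertion of linking segments can be sketched as follows; the segment and scene-tag representations are placeholders for illustration:

```python
def insert_transitions(segments, scene_labels, linking_segment):
    """Insert a preset linking segment between adjacent candidate
    segments whose scene labels differ."""
    out = [segments[0]]
    for prev_label, seg, label in zip(scene_labels, segments[1:], scene_labels[1:]):
        if label != prev_label:          # scene change -> bridge with a transition
            out.append(linking_segment)
        out.append(seg)
    return out
```

Segments sharing a scene tag are concatenated directly; a transition is spliced in only where the tag changes.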
In a specific embodiment, the step S5 includes the following steps:
S50, performing overall optimization on the color indexes of the candidate segments to be spliced;
And S51, splicing the candidate segments to be spliced after the color index optimization to obtain a third target video segment.
In this embodiment, the color indexes of the candidate segments are optimized as a whole, so that the display look and feel of the candidate segments is more consistent. The color indexes of a candidate segment include, but are not limited to, the brightness, color temperature and contrast of the image.
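As one possible reading of "overall optimization of color indexes", the sketch below matches each clip's brightness (mean) and contrast (standard deviation) to the global statistics; the concrete method is not fixed by the embodiment, so this is only an assumed approach:

```python
import numpy as np

def match_brightness_contrast(clips):
    """clips: list of uint8 frame arrays, one array per candidate segment.
    Normalizes each clip's mean (brightness) and std (contrast) to the
    global statistics so spliced segments look consistent."""
    pixels = np.concatenate([np.ravel(c) for c in clips]).astype(np.float32)
    g_mean, g_std = pixels.mean(), pixels.std() + 1e-6
    adjusted = []
    for clip in clips:
        c = clip.astype(np.float32)
        out = (c - c.mean()) / (c.std() + 1e-6) * g_std + g_mean
        adjusted.append(np.clip(out, 0, 255).astype(np.uint8))
    return adjusted
```

A dark clip and a bright clip are pulled toward the same global brightness, reducing the visible jump at the splice point.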
Referring to fig. 3, fig. 3 is a schematic structural diagram of a video material clipping apparatus according to another embodiment of the present invention, including:
The video clip acquisition module 1 is used for acquiring a plurality of video clips to be clipped;
A video element identification module 2, configured to identify whether a video clip to be clipped has a target video element;
a video clip module 3, configured to determine clip points of video clips to be clipped according to the target video elements, so as to obtain a plurality of first target video clips containing the target video elements and a plurality of second target video clips not containing the target video elements;
the video segment matching module 4 is used for matching the first target video segment with the second target video segment, and taking the first target video segment and the second target video segment which meet the preset matching relation as candidate segments to be spliced;
and the video segment splicing module 5 is used for splicing the candidate segments to be spliced to obtain a third target video segment.
In this embodiment, the target video elements of a plurality of video clips to be clipped are identified, and the clip points of the video clips are determined according to those elements, giving a plurality of first target video clips containing the target video elements and a plurality of second target video clips not containing them. The first target video segments are then matched with the second target video segments, and the first and second target video segments that satisfy the preset matching relation are taken as candidate segments to be spliced. Finally, the candidate segments are spliced to obtain a third target video segment. On this basis, video clips that meet preset splicing requirements (such as scenario clips and gameplay clips with the same characters and scenes) can be automatically found among a massive number of video clips and cut out, the cut-out clips splice together more coherently, and the efficiency of cutting and splicing video material is higher.
Another embodiment of the present invention provides a terminal device, including: a processor, a memory, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps of the above embodiments of the video material cutting method, such as steps S1 to S3 shown in fig. 1, or performs the functions of the modules in the above embodiment of the video material cutting device.
Another embodiment of the present invention provides a computer readable storage medium that includes a stored computer program. When the computer program runs, the device in which the computer readable storage medium is located is controlled to execute the steps of the above embodiments of the video material cutting method, for example steps S1 to S3 shown in fig. 1.
Illustratively, the computer program may be split into one or more modules that are stored in the memory and executed by the processor to perform the present invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program in the terminal device. For example, the computer program may be divided into a video clip acquisition module 1, a video element identification module 2, a video clip clipping module 3, a video clip matching module 4 and a video clip splicing module 5, where the specific functions of the modules are as follows:
The video clip acquisition module 1 is used for acquiring a plurality of video clips to be clipped; a video element identification module 2, configured to identify whether a video clip to be clipped has a target video element; a video clip module 3, configured to determine clip points of video clips to be clipped according to the target video elements, so as to obtain a plurality of first target video clips containing the target video elements and a plurality of second target video clips not containing the target video elements; the video segment matching module 4 is used for matching the first target video segment with the second target video segment, and taking the first target video segment and the second target video segment which meet the preset matching relation as candidate segments to be spliced; and the video segment splicing module 5 is used for splicing the candidate segments to be spliced to obtain a third target video segment.
The terminal device can be a computing device such as a desktop computer, a notebook computer, a palmtop computer or a cloud server. The terminal device may include, but is not limited to, a processor and a memory. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of a terminal device and does not constitute a limitation of the terminal device; it may include more or fewer components than illustrated, combine certain components, or use different components. For example, the terminal device may further include an input-output device, a network access device, a bus, and so on.
The processor may be a central processing unit (CPU), another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, or the like. The general purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the terminal device and connects the various parts of the entire terminal device using various interfaces and lines.
The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the terminal device by running or executing the computer program and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and the data storage area may store data created according to the use of the device (such as audio data, a phonebook, etc.). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory such as a hard disk, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The modules integrated in the terminal device may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as stand-alone products. Based on such understanding, the present invention may implement all or part of the flow of the method of the above embodiments through a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium, and when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium can be adjusted appropriately according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that the above-described apparatus embodiments are merely illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the invention, the connection relation between modules indicates that they have communication connections, which may be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, such changes and modifications are also intended to be within the scope of the invention.

Claims (7)

1. A video material cutting method, comprising:
respectively extracting color characteristics and structural characteristics of each video frame in the video material to be cut;
Calculating the feature similarity of any two adjacent video frames in the video material to be sheared according to the color feature and the structural feature of each video frame;
Determining frame nodes when the feature similarity of two adjacent video frames in the video material to be cut meets a preset condition as scene conversion points corresponding to different scenes; wherein, the lens angles corresponding to different scenes are different;
segmenting the video material to be cut according to the scene transition points to obtain a plurality of video clips to be cut;
Identifying whether a video clip to be clipped has a target video element, wherein the target video element is a target game prop;
Determining clipping points of video clips to be clipped according to target video elements, and obtaining a plurality of first target video clips containing the target video elements and a plurality of second target video clips not containing the target video elements, wherein the first target video clips are prop playing clips, and the second target video clips are game scenario clips;
Identifying characters and scenes of the first target video clips and the second target video clips to obtain character labels and scene labels corresponding to the first target video clips and the second target video clips;
Taking a first target video segment and a second target video segment with the same character tag and scene tag as candidate segments to be spliced;
And splicing the candidate segments to be spliced to obtain a third target video segment.
2. The video material cutting method according to claim 1, wherein the determining clipping points of the video clips to be clipped according to the target video elements to obtain a plurality of first target video clips containing the target video elements and a plurality of second target video clips not containing the target video elements comprises:
if the target video element exists, forming a plurality of first target video clips from the preset number of consecutive video frames containing the target video element, and forming a plurality of second target video clips from the remaining consecutive video frames.
3. The method for cutting video material according to claim 1, wherein the splicing the candidate segments to be spliced to obtain the third target video segment includes:
judging whether the number of the candidate fragments to be spliced is larger than a preset number threshold;
if yes, selecting a first target video segment and a second target video segment with highest matching degree from the candidate segments to be spliced to splice, and obtaining a third target video segment;
and if not, splicing the candidate segments to be spliced to obtain a third target video segment.
4. The video material cutting method according to any one of claims 1 to 3, wherein the splicing the candidate segments to be spliced to obtain a third target video segment comprises:
Carrying out overall optimization on the color indexes of the candidate segments to be spliced;
And splicing the candidate segments to be spliced after the color index optimization to obtain a third target video segment.
5. A video material cutting device, comprising:
the video segment acquisition module is used for respectively extracting the color characteristics and the structural characteristics of each video frame in the video material to be cut; calculating the feature similarity of any two adjacent video frames in the video material to be sheared according to the color feature and the structural feature of each video frame; determining frame nodes when the feature similarity of two adjacent video frames in the video material to be cut meets a preset condition as scene conversion points corresponding to different scenes; wherein, the lens angles corresponding to different scenes are different; segmenting the video material to be cut according to the scene transition points to obtain a plurality of video clips to be cut;
The video element identification module is used for identifying whether a video clip to be clipped has a target video element, wherein the target video element is a target game prop;
the video clip cutting module is used for determining clipping points of video clips to be clipped according to target video elements to obtain a plurality of first target video clips containing the target video elements and a plurality of second target video clips not containing the target video elements, wherein the first target video clips are prop playing clips, and the second target video clips are game scenario clips;
the video segment matching module is used for identifying characters and scenes of the first target video segments and the second target video segments to obtain character labels and scene labels corresponding to the first target video segments and the second target video segments; taking a first target video segment and a second target video segment with the same character tag and scene tag as candidate segments to be spliced;
and the video segment splicing module is used for splicing the candidate segments to be spliced to obtain a third target video segment.
6. A terminal device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the video material cutting method according to any one of claims 1 to 4 when executing the computer program.
7. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored computer program, wherein when the computer program runs, the device in which the computer readable storage medium is located is controlled to perform the video material cutting method according to any one of claims 1 to 4.
CN202111232378.0A 2021-10-22 2021-10-22 Video material cutting method, device, terminal equipment and readable storage medium Active CN114025232B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111232378.0A CN114025232B (en) 2021-10-22 2021-10-22 Video material cutting method, device, terminal equipment and readable storage medium


Publications (2)

Publication Number Publication Date
CN114025232A CN114025232A (en) 2022-02-08
CN114025232B true CN114025232B (en) 2024-06-21

Family

ID=80057206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111232378.0A Active CN114025232B (en) 2021-10-22 2021-10-22 Video material cutting method, device, terminal equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114025232B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115412764B (en) * 2022-08-30 2023-09-29 上海硬通网络科技有限公司 Video editing method, device, equipment and storage medium
CN115460459B (en) * 2022-09-02 2024-02-27 百度时代网络技术(北京)有限公司 Video generation method and device based on AI and electronic equipment
CN116600105B (en) * 2023-05-25 2023-10-17 广州盈风网络科技有限公司 Color label extraction method, device, equipment and medium for video material

Citations (3)

Publication number Priority date Publication date Assignee Title
CN1716415A (en) * 2004-06-30 2006-01-04 深圳市朗科科技有限公司 Digital video frequency playing device and its program backing method
CN106021496A (en) * 2016-05-19 2016-10-12 海信集团有限公司 Video search method and video search device
CN109740499A (en) * 2018-12-28 2019-05-10 北京旷视科技有限公司 Methods of video segmentation, video actions recognition methods, device, equipment and medium

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
US9502073B2 (en) * 2010-03-08 2016-11-22 Magisto Ltd. System and method for semi-automatic video editing
US10720182B2 (en) * 2017-03-02 2020-07-21 Ricoh Company, Ltd. Decomposition of a video stream into salient fragments
CN107517406B (en) * 2017-09-05 2020-02-14 语联网(武汉)信息技术有限公司 Video editing and translating method
CN109996011A (en) * 2017-12-29 2019-07-09 深圳市优必选科技有限公司 video editing device and method
CN110519655B (en) * 2018-05-21 2022-06-10 阿里巴巴(中国)有限公司 Video editing method, device and storage medium
CN110225369B (en) * 2019-07-16 2020-09-29 百度在线网络技术(北京)有限公司 Video selective playing method, device, equipment and readable storage medium
CN111131884B (en) * 2020-01-19 2021-11-23 腾讯科技(深圳)有限公司 Video clipping method, related device, equipment and storage medium
CN113329261B (en) * 2021-08-02 2021-12-07 北京达佳互联信息技术有限公司 Video processing method and device


Also Published As

Publication number Publication date
CN114025232A (en) 2022-02-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant