CN110856013A

CN110856013A - Method, system and storage medium for identifying key segments in video

Info

Publication number: CN110856013A
Application number: CN201911137199.1A
Authority: CN
Inventors: 刘辛倩
Original assignee: Gree Electric Appliances Inc of Zhuhai
Current assignee: Gree Electric Appliances Inc of Zhuhai
Priority date: 2019-11-19
Filing date: 2019-11-19
Publication date: 2020-02-28

Abstract

The invention provides a method, a system and a storage medium for identifying key segments in a video. The method comprises the following steps: capturing a frame of a video to obtain a picture corresponding to the frame; comparing the picture corresponding to the frame with the pictures in a preset picture library, wherein the pictures in the preset picture library are divided into categories and each category has its own category label; when the similarity between the picture corresponding to the frame and a picture in the preset picture library reaches a first preset threshold, determining the category label of that library picture and assigning the category label to the picture corresponding to the frame; and when the pictures corresponding to multiple consecutive frames in the video have all been assigned category labels, determining that the video segment formed by those consecutive frames is a first key segment, and setting a viewing reminder for the first key segment to prompt the viewer to watch at their discretion.

Description

Method, system and storage medium for identifying key segments in video

Technical Field

The present invention relates to video filtering technologies, and in particular, to a method, a system, and a storage medium for identifying key segments in a video.

Background

With the development of the film and television entertainment industry, new films and television dramas appear in an endless stream. As content becomes more diverse, users' preferences and demands become more personalized. A fan who follows celebrities may want to block segments featuring a celebrity they dislike while watching a drama; a timid user may want to block bloody or violent images, or to be warned in advance of an upcoming frightening segment; and parents may want to block pictures that are unsuitable for children. Accurately identifying the video content being played is therefore very important.

In the prior art, when a user watching a film or television drama encounters a segment that is uninteresting or unwanted, the only option is to drag the progress bar, which often cannot be set precisely to the target position. The user has to adjust back and forth several times to reach the right position. This mode of operation is limited, is neither convenient nor quick, and is error-prone. Secondly, some users need to be warned of uncomfortable pictures in advance so that they can prepare themselves mentally, and other users want to know the plot ahead of time. So far, users can learn such information only through reviews, bullet-screen comments, or dragging the progress bar. In addition, for watching only selected segments, current technology supports only recognizing a person in the video and playing just the segments in which the designated person appears, for example an "only watch him" function.

However, the above techniques cannot meet the diverse demands of current users. For example, a user may want to block certain specific types of segments while watching a video, or may need to be prompted before certain key segments begin.

Disclosure of Invention

In order to solve the above problems and meet viewers' diverse requirements, the invention identifies the video content, determines the video segments that contain sensitive content or highlight content, and reminds the user to watch them at their discretion.

In view of the above, the present invention is directed to a method, system and storage medium for identifying key segments in a video.

In a first aspect, the present invention provides a method for identifying key segments in a video, comprising the following steps: capturing a frame of a video to obtain a picture corresponding to the frame; comparing the picture corresponding to the frame with the pictures in a preset picture library, wherein the pictures in the preset picture library are divided into categories and each category has its own category label; when the similarity between the picture corresponding to the frame and a picture in the preset picture library reaches a first preset threshold, determining the category label of that library picture and assigning the category label to the picture corresponding to the frame; and when the pictures corresponding to multiple consecutive frames in the video have all been assigned category labels, determining that the video segment formed by those consecutive frames is a first key segment, and setting a viewing reminder for the first key segment to prompt the viewer to watch at their discretion.

Preferably, the method further comprises: extracting the audio from the video; analyzing the audio and determining the audio segments whose volume is greater than a second preset threshold; and taking the video segment corresponding to each such audio segment as a second key segment, and setting a viewing reminder for the second key segment.

Preferably, the method further comprises: extracting the audio from the video; extracting the background music according to the voiceprint features of the audio; comparing the background music with each music piece in a preset music library; and when the background music matches a music piece in the preset music library, determining that the video segment corresponding to the background music is a third key segment, and setting a viewing reminder for the third key segment.

Preferably, the method further comprises: analyzing the bullet-screen information of the video, and acquiring the video segments in which the number of bullet-screen comments exceeds a third preset threshold; and determining each such video segment as a fourth key segment, and setting a viewing reminder for the fourth key segment.

Preferably, setting the viewing reminder for the fourth key segment comprises: extracting all bullet-screen comments contained in the fourth key segment, and acquiring the term that occurs most frequently in them; and placing the term at the first frame of the fourth key segment, pausing playback at that first frame, and displaying the term in a pop-up box.

Preferably, setting the viewing reminder for the first key segment comprises: analyzing the category labels of the pictures corresponding to the frames of the first key segment, and determining the category label used most often; taking the most used category label as a keyword, and placing the keyword at the first frame of the first key segment; and pausing playback at the first frame of the first key segment and displaying the keyword in a pop-up box.

Preferably, setting the viewing reminder for the first key segment further comprises: providing, at the first frame of the first key segment, an option to skip the first key segment.

Preferably, when the first key segment contains a second key segment, the viewing reminder set for the second key segment is cancelled.

In a second aspect, the present invention further provides a system for identifying key segments in a video, including a memory and a processor, where the memory stores program instructions, and the program instructions, when executed by the processor, implement any one of the above methods for identifying key segments in a video.

In a third aspect, the present invention further provides a storage medium storing program instructions, which when executed by a processor, implement any one of the above methods for identifying key segments in a video.

Compared with the prior art, the invention has the following advantages or beneficial effects:

The invention determines the sensitive or highlight segments in a video from the video pictures, the audio and/or the bullet-screen information of the video, and reminds the user to watch at their discretion by pausing playback and popping up a prompt box before the segment starts. It can further offer the user an option to skip the sensitive segment with one click, which improves the user's viewing experience and satisfaction.

Drawings

The scope of the invention will be better understood from the following detailed description of exemplary embodiments when read in conjunction with the accompanying drawings, in which:

FIG. 1 is a flowchart of a method for identifying key segments in a video according to the present invention;

FIG. 2 is a flowchart of the steps for setting a viewing reminder for a first key segment;

FIG. 3 is a flowchart for determining key segments from the background music in the audio;

FIG. 4 is a flowchart for determining key segments from the bullet-screen information of a video.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings and examples, so that how to apply technical means to solve technical problems and achieve a technical effect can be fully understood and implemented.

The invention is based on the idea of analyzing the video pictures, audio and bullet-screen information of a video to determine its sensitive segments, so that the user can be reminded to watch at their discretion.

Example one

The method for identifying the key segments in the video, provided by the embodiment of the invention, can be applied to a terminal. Fig. 1 is a flowchart of a method for identifying key segments in a video according to the present invention, and each step is described in detail below with reference to fig. 1. As shown in fig. 1, the method mainly comprises the following steps:

S1, capturing a frame of the video to obtain a picture corresponding to the frame;

Each video comprises many frames, and the pictures corresponding to several consecutive frames may be similar.

S2, comparing the picture corresponding to the frame with each picture in a preset picture library, wherein the pictures in the preset picture library are divided into categories, and each category has its own category label;

The preset picture library contains a plurality of pictures, which are divided into categories, each with its own category label. For example, the category labels may include violence, gore, sentiment, and other labels associated with sensitive content.

S3, when the similarity between the picture corresponding to the frame and a picture in the preset picture library reaches a first preset threshold, determining the category label of that library picture, and assigning the category label to the picture corresponding to the frame;

The first preset threshold is a reference value for the similarity and may be set according to the actual situation; for example, the first preset threshold may be 85%. In step S3, each frame of the video that relates to sensitive content is assigned a category label. Some of these frames may be consecutive and others may be isolated. A run of consecutive labeled frames indicates that the video segment they form very likely contains sensitive content.

S4, when the pictures corresponding to multiple consecutive frames in the video have all been assigned category labels, determining that the video segment formed by those consecutive frames is a first key segment, and setting a viewing reminder for the first key segment;

The first key segment is the video segment determined to contain sensitive content. Fig. 2 is a flowchart of the steps for setting a viewing reminder for a first key segment. As shown in fig. 2, in this embodiment the viewing reminder for the first key segment may be implemented as a pop-up box, through the following steps:

S401, analyzing the category labels of the pictures corresponding to the frames of the first key segment, and determining the category label used most often;

Each labeled frame in the first key segment carries a category label associated with sensitive content, and the labels are tallied by category. For example, 100 frames may be labeled violence, 27 frames gore, 2 frames sentiment, and so on. In this example, the most used category label is violence.

S402, taking the most used category label as a keyword, and placing the keyword at the first frame of the first key segment;

S403, pausing playback at the first frame of the first key segment, and displaying the keyword in a pop-up box.

When the user reaches the first key segment, playback pauses at its first frame and a pop-up box containing the keyword appears to tell the user what kind of content comes next. For example, the pop-up box shows the word "violence" and reminds the user to watch at their discretion. The user can then either watch the upcoming sensitive content with advance warning or choose not to watch it.
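Steps S401 to S403 can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function names `most_used_label` and `build_reminder`, the dictionary layout of the reminder, and the frame index 1200 are all hypothetical. The label counts reuse the 100 / 27 / 2 example given above.

```python
from collections import Counter


def most_used_label(frame_labels):
    """S401: return the category label that occurs most often in the segment."""
    if not frame_labels:
        raise ValueError("a key segment must contain at least one labeled frame")
    return Counter(frame_labels).most_common(1)[0][0]


def build_reminder(first_frame, frame_labels):
    """S402-S403: describe the pause-and-pop-up reminder (hypothetical layout)."""
    return {
        "frame": first_frame,                         # first frame of the key segment
        "pause": True,                                # pause playback at that frame
        "popup_text": most_used_label(frame_labels),  # keyword shown in the pop-up box
    }


# 100 frames labeled violence, 27 gore, 2 sentiment, as in the example above
labels = ["violence"] * 100 + ["gore"] * 27 + ["sentiment"] * 2
reminder = build_reminder(1200, labels)
print(reminder["popup_text"])
```

A player that consumes such a record would pause at `frame` and render `popup_text`; how the pop-up is drawn is outside the scope of this sketch.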

In addition, the viewing reminder for the first key segment may combine the pop-up box with blocking of the segment, in which case the method further comprises the following step:

S404, providing, at the first frame of the first key segment, an option to skip the first key segment.

To simplify operation for users who do not want to view the sensitive content, step S404 provides an option that lets the user skip the sensitive segment with one click.
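Steps S1 to S4 can be sketched as follows, under several assumptions that the text does not specify: pictures are flat lists of grayscale pixel values, similarity is one minus the normalized mean absolute pixel difference, the best-scoring library picture supplies the label, and a run needs at least two labeled frames to count as a segment. The names `similarity`, `label_frame` and `first_key_segments` are hypothetical.

```python
def similarity(a, b):
    """Assumed similarity in [0, 1] between two same-sized grayscale pictures."""
    if len(a) != len(b) or not a:
        raise ValueError("pictures must be non-empty and the same size")
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 - mean_abs_diff / 255.0


def label_frame(picture, library, threshold=0.85):
    """S2-S3: label of the best library match reaching the threshold, else None."""
    best_label, best_score = None, threshold
    for label, pictures in library.items():
        for ref in pictures:
            score = similarity(picture, ref)
            if score >= best_score:
                best_label, best_score = label, score
    return best_label


def first_key_segments(frame_labels, min_frames=2):
    """S4: inclusive (start, end) frame indices of runs of consecutive labeled frames."""
    segments, start = [], None
    for i, label in enumerate(frame_labels):
        if label is not None and start is None:
            start = i
        elif label is None and start is not None:
            if i - start >= min_frames:
                segments.append((start, i - 1))
            start = None
    if start is not None and len(frame_labels) - start >= min_frames:
        segments.append((start, len(frame_labels) - 1))
    return segments


# Toy 4-pixel pictures; the library maps each category label to its pictures
library = {"violence": [[200, 200, 200, 200]], "gore": [[10, 10, 10, 10]]}
frames = [
    [100, 100, 100, 100],
    [198, 201, 200, 199],
    [205, 200, 195, 200],
    [12, 9, 10, 11],
    [100, 90, 110, 100],
]
labels = [label_frame(f, library) for f in frames]
segments = first_key_segments(labels)
print(labels, segments)
```

A real system would likely replace `similarity` with a perceptual measure or a learned classifier; the thresholding and run-grouping logic would stay the same.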

In the above, the process of determining the key segment by analyzing the video pictures in the video is described; the following will describe a process of determining key segments by audio in a video, including the following steps:

S5, extracting the audio from the video;

S6, analyzing the audio, and determining the audio segments whose volume is greater than a second preset threshold;

The second preset threshold is a reference value for the volume. In some videos a scream or other loud sound appears suddenly, which some users find unacceptable; reminding the user before the loud sound occurs increases user satisfaction. The second preset threshold is therefore the maximum volume generally acceptable to the human ear.

S7, taking the video segment corresponding to each such audio segment as a second key segment, and setting a viewing reminder for the second key segment.
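Steps S5 to S7 can be sketched as follows. The text does not say how volume is measured, so this sketch assumes the audio has already been decoded to samples in [-1.0, 1.0] and measures the RMS level of fixed-length windows in dB relative to full scale. The function names, the 0.5 s window and the -6 dB threshold are hypothetical choices.

```python
import math


def rms_db(samples):
    """RMS level of a window of samples in [-1.0, 1.0], in dB relative to full scale."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def loud_segments(samples, sample_rate, threshold_db, window_s=0.5):
    """S6: (start_s, end_s) spans whose windowed RMS level exceeds threshold_db."""
    window = max(1, int(sample_rate * window_s))
    spans, start = [], None
    for offset in range(0, len(samples), window):
        loud = rms_db(samples[offset:offset + window]) > threshold_db
        t = offset / sample_rate
        if loud and start is None:
            start = t                  # a loud run begins at this window
        elif not loud and start is not None:
            spans.append((start, t))   # the loud run ended at this window
            start = None
    if start is not None:
        spans.append((start, len(samples) / sample_rate))
    return spans


# Synthetic 3-second signal at 10 samples per second: quiet, loud, quiet
signal = [0.01] * 10 + [0.9] * 10 + [0.01] * 10
spans = loud_segments(signal, sample_rate=10, threshold_db=-6.0)
print(spans)
```

Each returned span maps directly onto the video timeline, so the video segment over the same interval would be taken as a second key segment (S7).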

When the first key segment contains the second key segment, the viewing reminder set for the second key segment can be cancelled. That is, when one of two determined key segments contains the other, the viewing reminder is set only at the first frame of the longer key segment, so that a viewer who has chosen to continue watching the sensitive content is not interrupted again. In this embodiment, the video pictures and the audio are analyzed together to determine the key segments, which makes the analysis of the video deeper and more accurate. Of course, in other embodiments the video pictures or the audio may each be used alone to determine key segments.

In this embodiment, the video pictures and the audio of the video are analyzed together to determine the sensitive segments; before the user watches a sensitive segment, the user is reminded and offered a one-click skip option, which improves the user's viewing experience and satisfaction.
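The cancellation rule can be sketched as follows, assuming that each key segment is represented as a `(start_s, end_s)` pair in seconds. The names `contains` and `reminders_to_keep` and the sample times are hypothetical. The sketch covers only the case in which a first key segment contains a second key segment.

```python
def contains(outer, inner):
    """True if the inner (start, end) span lies entirely within the outer span."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def reminders_to_keep(first_segments, second_segments):
    """Keep all first-key-segment reminders; drop contained second-key-segment ones."""
    kept_second = [
        s for s in second_segments
        if not any(contains(f, s) for f in first_segments)
    ]
    return sorted(first_segments + kept_second)


first = [(60.0, 95.0)]                     # picture-based key segment
second = [(70.0, 72.5), (130.0, 131.0)]    # volume-based key segments
kept = reminders_to_keep(first, second)
print(kept)
```

Here the loud span at 70.0 s falls inside the picture-based segment, so only one reminder is shown at 60.0 s, while the loud span at 130.0 s keeps its own reminder.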

Example two

Unlike the first embodiment, the second embodiment analyzes the audio through its background music rather than its volume; the two audio analysis methods may of course be combined. Fig. 3 is a flowchart for determining key segments from the background music in the audio. As shown in fig. 3, the process of determining key segments from the background music of a video is as follows:

S5', extracting the audio from the video;

S6', extracting the background music according to the voiceprint features of the audio;

S7', comparing the background music with each music piece in the preset music library;

The music pieces stored in the preset music library are background music with a frightening atmosphere that is commonly used in films and television dramas.

S8', when the background music matches a music piece in the preset music library, determining that the video segment corresponding to the background music is a third key segment, and setting a viewing reminder for the third key segment.

In step S8', several matching criteria may be used to decide whether two pieces of music match. For example, two pieces may be considered to match when 50% of their notes have the same pitch, which means that the atmospheres the two pieces create are relatively similar. It is also possible to test whether two pieces of music are similar using software designed to detect music plagiarism.
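The 50% pitch criterion can be sketched as follows. The text does not describe how notes are extracted or aligned, so this sketch assumes that both pieces are already available as equal-tempo sequences of MIDI note numbers and compares them position by position. The names `pitch_match_ratio` and `matches_library` and the note sequences are hypothetical.

```python
def pitch_match_ratio(notes_a, notes_b):
    """Fraction of aligned note positions with equal pitch (MIDI numbers assumed)."""
    n = min(len(notes_a), len(notes_b))
    if n == 0:
        return 0.0
    same = sum(1 for a, b in zip(notes_a, notes_b) if a == b)
    return same / n


def matches_library(background, library, min_ratio=0.5):
    """S7'-S8': name of the first library piece that matches, or None."""
    for name, notes in library.items():
        if pitch_match_ratio(background, notes) >= min_ratio:
            return name
    return None


library = {"suspense_theme": [57, 60, 63, 62, 60, 57, 56, 57]}
background = [57, 60, 63, 65, 60, 55, 56, 59]
ratio = pitch_match_ratio(background, library["suspense_theme"])
match = matches_library(background, library)
print(ratio, match)
```

Position-by-position comparison is brittle against tempo changes and transposition; a practical matcher would more likely use audio fingerprinting, as the remark about plagiarism-detection software suggests.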

In the second embodiment, a method for determining key segments through background music is provided, which can be used alone or in combination with the first embodiment, so that the means for identifying key segments in a video is more diversified, and the identification result is more accurate.

Example three

At present, videos are commonly provided with a bullet-screen function: viewers can leave bullet-screen comments while watching, which express their opinions of the video. Fig. 4 is a flowchart for determining key segments from the bullet-screen information of a video. As shown in fig. 4, the third embodiment provides a method for determining a key segment from bullet-screen information, which includes the following steps:

S11, analyzing the bullet-screen information of the video, and acquiring the video segments in which the number of bullet-screen comments exceeds a third preset threshold;

The third preset threshold is a reference limit for the number of bullet-screen comments and can be determined according to the actual situation, such as the total number of people who have watched the video; for example, the number of bullet-screen comments in a certain video segment exceeds 500. In this embodiment, if more than a specified number of bullet-screen comments, for example 10, appear within a specified time period, for example 5 seconds, the start of that time period may be taken as the start time of the key segment; the time after which the rate of more than 10 comments per 5 seconds is no longer maintained is taken as the end time of the key segment. Video segments with a high frequency of bullet-screen comments can also be determined in other ways.

S12, determining the video segment as a fourth key segment;

S13, extracting all bullet-screen comments contained in the fourth key segment, and acquiring the term that occurs most frequently in them;

The nature of the term is not fixed in advance: it may be the name of an actor, or it may be a word such as "horror". The purpose of the reminder in this embodiment is to alert the viewer to the upcoming highlight and hint at what it involves, so that the user is ready to watch.

S14, placing the term at the first frame of the fourth key segment;

S15, pausing playback at the first frame of the fourth key segment, and displaying the term in a pop-up box.
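Steps S11 to S13 can be sketched as follows, using the example figures of more than 10 comments per 5 seconds. The sketch assumes non-overlapping 5-second windows and treats each whole comment as one term; real bullet-screen text would normally need word segmentation first. The names `dense_segments` and `top_term` and the sample data are hypothetical.

```python
from collections import Counter


def dense_segments(timestamps, window_s=5.0, min_count=10):
    """S11-S12: (start_s, end_s) spans of consecutive windows with more than min_count comments."""
    if not timestamps:
        return []
    counts = Counter(int(t // window_s) for t in timestamps)
    spans, start = [], None
    for w in range(max(counts) + 2):   # one extra empty window closes a trailing run
        dense = counts.get(w, 0) > min_count
        if dense and start is None:
            start = w
        elif not dense and start is not None:
            spans.append((start * window_s, w * window_s))
            start = None
    return spans


def top_term(comments):
    """S13: most frequent term, treating each normalized comment as one term."""
    return Counter(c.strip().lower() for c in comments).most_common(1)[0][0]


# 11 comments in 10-15 s, 12 comments in 15-20 s, then only 3 in 20-25 s
timestamps = (
    [10 + 0.4 * i for i in range(11)]
    + [15 + 0.4 * i for i in range(12)]
    + [21.0, 22.0, 23.0]
)
spans = dense_segments(timestamps)
term = top_term(["Horror!", "so scary", "horror!", "horror!", "so scary"])
print(spans, term)
```

The returned span gives the start and end of the fourth key segment, and the term is what steps S14 and S15 would place in the pop-up box at the segment's first frame.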

The method of the third embodiment can be combined with the first and second embodiments to determine the key segments in a video. The method is therefore not limited to key segments related to sensitive content: it can also determine the highlight segments that viewers discuss most intensely, and remind the viewer not to miss them.

Example four

The embodiment provides a system for identifying key segments in a video, which comprises a memory and a processor, wherein the memory stores program instructions, and the program instructions are executed by the processor to realize the method for identifying key segments in the video. The system may be a computer or a mobile terminal, and the program instructions may be a computer-readable program or a mobile terminal-readable program. The program instructions of the invention can also be modularized and directly embedded into software for playing videos.

The present embodiment also provides a storage medium storing program instructions which, when executed by a processor, implement the method for identifying key segments in a video according to the present invention. The storage medium can be a computer-readable medium such as a removable disk or an optical disk, or a terminal-readable medium such as a SIM card or an SD card.

The terminal mentioned in the invention can be a mobile terminal, including a smart phone, a tablet computer, a palm computer and various common portable mobile intelligent terminals.

The above embodiments are only specific embodiments of the present invention. It is obvious that the invention is not limited to the above embodiments, but that many variations are possible. All modifications attainable by one versed in the art from the present disclosure within the scope and spirit of the present invention are to be considered as within the scope and spirit of the present invention.

Claims

1. A method for identifying key segments in a video, the method comprising the steps of:

capturing a frame of a video to obtain a picture corresponding to the frame;

comparing the picture corresponding to the frame with the pictures in a preset picture library, wherein the pictures in the preset picture library are divided into categories, and each category has its own category label;

when the similarity between the picture corresponding to the frame and a picture in the preset picture library reaches a first preset threshold, determining the category label of that library picture, and assigning the category label to the picture corresponding to the frame;

when the pictures corresponding to multiple consecutive frames in the video have all been assigned category labels, determining that the video segment formed by those consecutive frames is a first key segment, and setting a viewing reminder for the first key segment to prompt the viewer to watch at their discretion.

2. The method for identifying key segments in a video according to claim 1, further comprising:

extracting audio in the video;

analyzing the audio, and determining audio segments with the volume larger than a second preset threshold;

and taking the video segment corresponding to the audio segment as a second key segment, and setting a viewing reminder for the second key segment.

3. The method for identifying key segments in a video according to claim 1, further comprising:

extracting audio in the video;

extracting background music according to the voiceprint characteristics of the audio;

comparing the background music with each music fragment in a preset music library;

and when the background music matches a music piece in the preset music library, determining that the video segment corresponding to the background music is a third key segment, and setting a viewing reminder for the third key segment.

4. The method for identifying key segments in a video according to claim 1, further comprising:

analyzing the bullet-screen information of the video, and acquiring the video segments in which the number of bullet-screen comments exceeds a third preset threshold;

and determining the video segment as a fourth key segment, and setting a viewing reminder for the fourth key segment.

5. The method for identifying key segments in video according to claim 4, wherein the setting of the viewing reminder for the fourth key segment includes:

extracting all bullet-screen comments contained in the fourth key segment, and acquiring the term that occurs most frequently in them;

placing the term at the first frame of the fourth key segment;

and pausing playback at the first frame of the fourth key segment, and displaying the term in a pop-up box.

6. The method for identifying key segments in a video according to claim 1, wherein setting a viewing reminder for a first key segment comprises:

analyzing the category labels of the pictures corresponding to the frames of the first key segment, and determining the category label used most often;

taking the most used category label as a keyword, and placing the keyword at the first frame of the first key segment;

and pausing playback at the first frame of the first key segment, and displaying the keyword in a pop-up box.

7. The method for identifying key segments in a video according to claim 6, wherein setting a viewing reminder for a first key segment further comprises:

providing, at the first frame of the first key segment, an option to skip the first key segment.

8. The method for identifying key segments in a video according to claim 2, wherein when the first key segment contains the second key segment, the viewing reminder set for the second key segment is cancelled.

9. A system for identifying key segments in a video, comprising a memory and a processor, the memory storing program instructions which, when executed by the processor, implement the method for identifying key segments in a video according to any one of claims 1 to 8.

10. A storage medium storing program instructions which, when executed by a processor, implement the method for identifying key segments in a video according to any one of claims 1 to 8.