CN102427507B - Football video highlight automatic synthesis method based on event model - Google Patents

Football video highlight automatic synthesis method based on event model

Info

Publication number
CN102427507B
CN102427507B (application CN201110294384.9A)
Authority
CN
China
Prior art keywords
football
video
highlight
action
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110294384.9A
Other languages
Chinese (zh)
Other versions
CN102427507A (en)
Inventor
赵沁平 (Zhao Qinping)
陈小武 (Chen Xiaowu)
蒋恺 (Jiang Kai)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN201110294384.9A
Publication of CN102427507A
Application granted
Publication of CN102427507B
Legal status: Active
Anticipated expiration


Abstract

The invention relates to a method for automatically synthesizing football video highlights based on an event model. The method comprises the following steps: defining a football video highlight clip as a football video event composed of a plurality of actions; constructing a core–surrounding event model to represent a football highlight clip; constructing a training set from football match videos and the corresponding text commentary, selecting goals and red/yellow cards as two types of football highlights, and training the event model; inputting a segment of football match video without commentary, locating the positions at which highlight clips appear in the input video, and assigning each a matching score; and, according to the user's requirements, automatically synthesizing the highest-scoring clips into a football video highlight reel. The method of generating football video highlights in the invention is not restricted by factors such as the shot distance of the input video, and can be widely applied and popularized in fields such as personal digital entertainment and sports film and television production.

Description

An automatic football video highlight synthesis method based on an event model
Technical field
The present invention relates to the fields of computer vision, video processing and augmented reality, and in particular to an automatic football video highlight synthesis method based on an event model.
Background technology
As a kind of sports television program, the sports highlight reel is deeply loved by audiences for its compactness, since it delivers ample information in a short period of time. In football in particular, watching a full 90-minute match video just to see a favourite player or an excellent goal is very time-consuming; match-related topics such as highlight replays, match summaries and players' personal stories are therefore often produced as football highlight compilations. Conventional highlight reels are produced by manually editing match video. Although manual editing is precise and expressive, it requires a great deal of manpower to inspect the video frame by frame to find the desired highlights, and demands considerable editorial experience with the sport. As research in video understanding and computer vision continues to advance, automatically generating highlight videos from sports event footage has gradually become a technical and research hotspot.
At present, depending on the video source, methods for automatically generating highlight videos from sports footage can be divided into two broad classes. The first class targets television broadcast video. Because broadcast video embeds the director's understanding of the match, the implicit conventions of broadcasting can serve as cues for video summarization. For example, in football broadcasts a close-up or slow-motion shot usually appears after a goal; the same event usually spans two shot switches; and a long shot usually signals kick-off or a long ball trajectory. Methods in this class detect such cues in the football video to locate highlight clips and generate the final highlight reel, or directly detect on-screen text (for example, the score bar) to determine when a highlight occurred. Although these methods can achieve good summarization results to a certain extent, they depend too heavily on television broadcast video and their applicability is therefore very limited.
The second class targets non-broadcast video. Among these, methods strongly tailored to a specific video theme usually exploit special prior knowledge of that theme (such as the netted goal, the large green lawn and the spectators' cheers in football video) to obtain cues for detecting highlights of that theme. This strong specificity means such models are fixed and poorly reusable. What is of real research value are highlight methods with general applicability within a certain scope. Current research in this direction concentrates on two topics: (1) video event analysis; (2) video content summarization.
In video event analysis, Li Fei-Fei's group at Stanford University proposed at ECCV 2010 a behavior model based on sequential relations between human actions. The model segments behaviors by the actions exhibited at different time points. Two models are trained, a discriminative model and an appearance model: the discriminative model encodes the video sequence based on temporal decomposition, and the appearance model segments each behavior. During recognition, the video is matched against the model through learned features and behavior segmentation. By introducing temporal structure, the method can recognize both simple and complex human actions fairly well, but because its temporal structure pattern is fixed, it cannot handle complex events composed of multiple actions. Larry S. Davis's group at the University of Maryland proposed at CVPR 2009 a method for learning a complete visual storyline model from video with weakly labeled data. The storyline model is expressed as an AND-OR graph, which simply encodes the plot variations in the video; the edges of the graph correspond to causal relations under spatio-temporal constraints. With this model and the learned training data, behavior recognition and storyline extraction can be performed. Considering the association between human pose and surrounding objects in video frames, Fowlkes's group at the University of California proposed in 2010 a method that recognizes actions by modeling the association between human pose and surrounding objects. The method mainly addresses action recognition in still images, which it casts as a latent structural labeling problem.
In video content summarization, the method proposed by Pritch et al. in PAMI 2008 can condense a long video into a short summary by analyzing the video and showing the motion information of multiple frames on each frame; its limitation is in handling situations where the entire scene is in motion, and video that has already been edited. Hwang's group at the University of Washington proposed a key-frame extraction method based on video shot segmentation and implemented a corresponding system that can process video online quickly and effectively. At CVPR 2005, Jojic's group at Microsoft Research proposed a new interactive model for indexing and analyzing surveillance video. In addition, Wu's group at the University of Vermont proposed a hierarchical video summarization strategy that provides users with multi-scale, multi-level video summaries by analyzing the video content structure.
In summary, current video summarization techniques mainly suffer from two problems. (1) They depend heavily on input video quality and have a narrow scope of application. Although cues rich in semantic hints, such as shot switches, whistles and transitions, allow football highlight clips to be detected quickly, these methods cannot understand how a football event unfolds, and it is therefore difficult to extract the time interval in which the event occurs. (2) Few methods summarize video with the event as the unit. Because video events are rich and varied, models based directly on feature statistics can hardly cover all event variations; how to make rational use of domain knowledge and model events in combination with their visual features is a difficulty and a research hotspot.
Summary of the invention
In view of the above practical demands and key problems, the object of the present invention is to propose an automatic football video highlight synthesis method based on an event model. The method is not restricted by factors such as the shot distance, length or sound of the input video; it is particularly applicable when the input is non-broadcast video from which key summarization cues such as close-up shots and cheering cannot be obtained.
The present invention regards a football video highlight reel as a synthetic video combining several highlight clips, each containing one important football event. Compared with videos of other sports, football match video has two characteristics: first, it is difficult to find cues in the video for the beginning and end of an event; second, football rules are complex, so each occurrence of the same type of important event (for example a goal or a red/yellow card) often differs in duration and course. Extensive observation shows that an important football event can usually be decomposed into a combination of several actions, among which one important action frequently occurs, called the core action; by comparison, the other actions are called surrounding actions. The present invention therefore holds that a football highlight clip can be represented by a core–surrounding event model.
To condense a football match video into a highlight video, highlight clips must be detected and extracted from the input video. The present invention therefore first builds a core–surrounding event model that captures the semantic relations, temporal relations and visual features among the actions composing an event.
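The three modeled components can be pictured as a plain data structure. The following is a minimal illustrative sketch only; the class and field names (`ActionModel`, `EventModel`, `prior`, `anchor`, etc.) are chosen here for exposition and do not appear in the patent:

```python
from dataclasses import dataclass, field

@dataclass
class ActionModel:
    name: str
    prior: float            # probability the action occurs in this event type
    co_occur: float         # probability it occurs together with the core action
    anchor: float           # best normalized occurrence time t_i in [0, 1]
    interval: float         # duration ratio r_i relative to the event
    histogram: list = field(default_factory=list)  # visual codeword statistics

@dataclass
class EventModel:
    event_type: str         # e.g. "goal" or "red/yellow card"
    core: ActionModel       # the most probable action
    surrounding: list       # supporting ActionModel instances
```

An instance for the "goal" event type would hold one core action plus a list of surrounding actions, each carrying its own timing and visual statistics.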
Training the core–surrounding event model comprises the following steps: (1) input a series of football match videos and their corresponding text commentary, extract keywords from the commentary, and, according to the event records in the commentary, count the occurrence probability of each keyword and the probability of several keywords occurring together; (2) select the keyword with the highest occurrence probability as the core keyword; (3) align the commentary with the match video, record the occurrence times of keywords, and measure the duration represented by each keyword and the duration of the event; (4) within each keyword's occurrence period, compute the gradient and optical-flow features of spatio-temporal interest points, and accumulate gradient histograms and optical-flow histograms as the local visual features of the action.
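Steps (1) and (2) amount to counting marginal and joint keyword frequencies over the commentary's event records. A minimal sketch of that counting, assuming each event record has already been reduced to its keyword list (the function name and record format are illustrative assumptions):

```python
from collections import Counter
from itertools import combinations

def keyword_statistics(event_records):
    """Count single and pairwise keyword occurrence probabilities across
    the commentary records of one event type (e.g. 'goal'), and pick the
    most frequent keyword as the core keyword."""
    single = Counter()
    pair = Counter()
    for keywords in event_records:
        ks = set(keywords)                     # ignore repeats within one record
        single.update(ks)
        pair.update(frozenset(p) for p in combinations(sorted(ks), 2))
    n = len(event_records)
    p_single = {k: c / n for k, c in single.items()}
    p_pair = {tuple(sorted(k)): c / n for k, c in pair.items()}
    core = max(p_single, key=p_single.get)     # highest-probability keyword
    return p_single, p_pair, core
```

For "goal" records, a keyword such as "shoot" that appears in nearly every record would emerge as the core keyword.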
In general, the core–surrounding event model captures: the visual statistical features of each action; the order of actions during the event; the ratio of each action's duration to the event duration; and the probability with which each action occurs.
After training, the model is used for event detection and extraction. Given a segment of football match video, synthesizing the highlight video divides into: (1) highlight clip extraction. For each class of highlight clip, first detect on the input video the core action and the surrounding actions composing the important event contained in that class, obtaining the occurrence period of each action; then, taking the core action as the reference and combining the temporal relations between actions, determine the event occurrence period, which becomes the candidate clip's time period; finally, match the candidate clip against the event model to obtain a model matching score. (2) Highlight video synthesis. Step (1) yields a candidate clip list for each type, sorted from high to low by matching score; then, according to the clip categories and highlight length the user requires, select a number of highlight clips and arrange them by time of occurrence; finally, apply a smooth transition between the last few frames of each clip and the first few frames of the next, so that the result better matches visual perception.
Compared with other video summarization methods, the advantages of the present invention are: (1) a wide range of applicable video sources. Whereas other summarization methods rely on cues such as shot characteristics and transition switches in television broadcast video, the present invention detects and recognizes events by analyzing their visual features, and can therefore be widely used for summarization in personal digital entertainment, sports research, television program production and similar fields. (2) Flexible clip combination. Because the present invention uses the video event as the summarization unit, the user can specify conditions such as the clip types needed and the highlight video length, so that a personalized highlight product meeting the user's requirements can be synthesized.
Brief description of the drawings:
Fig. 1 is a structural diagram of the core–surrounding event model of the present invention;
Fig. 2 is a schematic diagram of the model training process of the present invention;
Fig. 3 is a flow chart of semantic-layer event model construction of the present invention;
Fig. 4 is a flow chart of vision-layer event model training of the present invention;
Fig. 5 is a schematic diagram of the football highlight clip extraction process of the present invention;
Fig. 6 is a schematic diagram of football highlight clip synthesis of the present invention.
Detailed description:
The present invention is described in detail below with reference to the accompanying drawings.
The present invention defines the football video highlight reel as the set of important football events occurring in a football match, with video as the carrier. The highlight reel is composed of a series of football highlight clips, each containing one important football event. The core–surrounding event model built by the present invention is used to detect and recognize important football events in match video and thereby extract highlight clips. Highlight clips fall into different categories according to the type of important event they contain: for example, goals and red/yellow cards are different important football events, so a clip containing a goal and a clip containing a red/yellow card belong to different categories.
Referring to Fig. 1, the structural diagram of the core–surrounding event model of the present invention, the model simultaneously models, both semantically and visually, the important football event contained in a highlight clip. The model comprises three parts: (1) semantic relations, which mainly model the probability of the core action and each surrounding action occurring together, and the probability of each action occurring in this important event; (2) temporal order, which models the time position at which each action may occur during the event and its duration; (3) visual appearance, namely the statistics of visual features at spatio-temporal interest points within the time interval of each action in the video. For important events of the same kind, the action most likely to occur is taken as the core action, and the other actions are regarded as surrounding actions supporting the event. The temporal constraints between surrounding actions and the core action are thus built into the model implicitly, which greatly helps in locating events in video.
During training, the core–surrounding event model is divided into two layers: a semantic layer and a vision layer. For an event class E and the action set {a_i}, i = 1, …, n, describing it, the semantic layer models the occurrence probability of each a_i in event E and whether a_i is the core of E. The vision layer models the visual appearance of the event, with the semantic-layer model introduced as a prior probability. The vision-layer model has three parameters: the classifier A_i that best recognizes action a_i; the best occurrence-time anchor t_i of classifier A_i; and the time interval r_i of a_i during the event.
The event model training set consists of video segments {V_1, …, V_N} and the corresponding action class labels y_i (y_i ∈ {−1, 1}, i = 1, …, N). The model is learned with a latent support vector machine (LSVM). In the LSVM framework the energy function is maximized over hidden variables; here the hidden variable is the best occurrence position of each action classifier, which is not given exactly but is obtained implicitly through training on the samples.
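The latent-variable idea can be sketched as coordinate ascent: alternately fix the model and re-estimate each hidden anchor position, then refit the model at the chosen positions. This is only an illustration of the alternation, not the LSVM solver itself; `score_fn` and `fit_fn` are hypothetical placeholders standing in for the classifier's scoring and fitting routines:

```python
import numpy as np

def train_latent_anchor(videos, score_fn, fit_fn, n_iters=5):
    """Coordinate-ascent sketch of latent-position training: the best
    occurrence position of each action is hidden and re-estimated each
    round on a grid of normalized times in [0, 1]."""
    positions = [0.5] * len(videos)              # initialize hidden anchors
    model = fit_fn(videos, positions)
    for _ in range(n_iters):
        # choose the best-scoring anchor per video under the current model
        positions = [max(np.linspace(0, 1, 21),
                         key=lambda t: score_fn(model, v, t))
                     for v in videos]
        # refit the model at the chosen positions
        model = fit_fn(videos, positions)
    return model, positions
```

With a toy score function peaked at each video's true anchor, the loop recovers those anchors after one round.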
Referring to Fig. 2, the schematic diagram of the model training process of the present invention, training is divided into three steps. (1) Semantic relation modeling. As shown in Fig. 3, the commentary, with its time and event marks, is used as training text; through sentence-component analysis, verb and gerund keywords are extracted to build a keyword set representing events. Based on the WordNet lexicon, keywords are mapped to different categories, and the category label is used as the action class label. For each action, its occurrence count within this category of highlight clip and its total occurrence count are tallied, the degree to which each action characterizes this clip category is computed, and the action with the highest characterization degree is selected as the core action. Action frequencies are recorded and their occurrence probabilities computed as priors. (2) Action visual feature statistics. From the commentary's time marks and the action class labels, the video time interval in which each action occurs is obtained; the video within this interval is divided into several parts, and a gradient histogram and an optical-flow histogram are computed at each spatio-temporal interest point. (3) Temporal relation modeling. From the commentary's time marks, event identifiers and action class labels, the action occurrence-order graph of the event contained in this kind of highlight clip is derived, and according to the vision-layer event model, the best occurrence position of each action is trained with the LSVM.
Referring to Fig. 4, the flow chart of vision-layer event model training of the present invention, the training process on the vision layer is as follows. (1) Compute feature points: divide each training video V_p (p ∈ {1, …, N}) evenly into M segments V_p^m, and in each segment detect the spatio-temporal interest points {st_l}, l = 1, …, L_p^m, where L_p^m is the number of interest points in segment V_p^m. (2) For each st_l, accumulate a gradient histogram h_l^g and an optical-flow histogram h_l^f, where the abscissa of the gradient histogram is the gradient-vector bins, whose number is denoted ng, and the ordinate is the number of gradient vectors falling in each bin; the abscissa of the optical-flow histogram is the optical-flow-vector bins, whose number is denoted nf, and the ordinate is the number of flow vectors falling in each bin. (3) Normalize the gradient and optical-flow histograms of each segment's interest points into a vector of dimension nd = ng + nf, and cluster all such vectors into K classes with the k-means algorithm, constructing a codebook of segment visual statistics. (4) Initialize classifier A_i's best occurrence-time anchor t_i and its time interval r_i within the event, then train A_i through steps (5) and (6). (5) According to t_i and r_i, cut several segments from video V_p, count the interest-point vectors they contain, and map them onto the codebook to form a distribution histogram of length K; normalize this histogram into a K-dimensional vector and add it to the positive-example set Ψ. (6) With the window size determined by r_i, slide over video V_p and compute the distribution histogram H_t of the segment cut at time anchor t; compute the distance d(H_t, Ψ) between the K-dimensional vector formed by this histogram and the vectors in the positive set; if d(H_t, Ψ) < ε (ε being a small threshold), add H_t to the positive set and repeat this step; otherwise end this step. (7) Record the positions at which t appears in video V_p and fit them to a quadratic curve with parameters {α_i, β_i}; the abscissa of this curve is the normalized occurrence time of t and the ordinate is the occurrence count at that time, and the curve is kept as a time penalty function for use during recognition.
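Steps (2)–(3) above are a bag-of-visual-words construction: descriptors are clustered into K codewords, and each video segment is then summarized by a normalized K-bin codeword histogram. A minimal sketch under stated assumptions (plain k-means standing in for whatever clustering implementation is used; descriptors are the nd-dimensional concatenated gradient+flow vectors):

```python
import numpy as np

def build_codebook(descriptors, K=100, iters=20, seed=0):
    """Cluster nd-dimensional interest-point descriptors into K visual
    words with plain k-means (a stand-in for the codebook step)."""
    rng = np.random.default_rng(seed)
    X = np.array(descriptors, dtype=float)                 # copy, don't mutate input
    X /= np.linalg.norm(X, axis=1, keepdims=True) + 1e-9   # normalize each vector
    centers = X[rng.choice(len(X), K, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for k in range(K):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)   # update each centroid
    return centers

def bow_histogram(descriptors, centers):
    """Map a segment's descriptors to their nearest codewords and return
    the normalized K-bin distribution histogram."""
    X = np.array(descriptors, dtype=float)
    X /= np.linalg.norm(X, axis=1, keepdims=True) + 1e-9
    labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

The resulting K-dimensional histograms are exactly the vectors collected into the positive-example set Ψ in steps (5)–(6).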
Referring to Fig. 5, the schematic diagram of the football highlight clip extraction process of the present invention, extraction mainly comprises the following steps: (1) for the input football match video, detect all actions that may occur; (2) taking one class of highlight clip as an example, use the core action of the important event contained in that class to locate the rough time period of the clip as its candidate time period; (3) compute the degree of match between this candidate period and the corresponding event model, expressed as a score, called the matching score of this candidate period for this highlight clip. All candidate periods of the same clip class are ranked from high to low by matching score. Matching a candidate highlight clip against the event model proceeds as follows: (1) divide the candidate clip V_f into segments at the same scale used for the training videos; (2) take classifier A_i, set the sliding-window size according to its time interval r_i, slide over the q-th segment of V_f, compute the distribution histogram H_t of the segment cut at time anchor t, and compute the similarity s(H_t, Ψ) between the K-dimensional vector formed by this histogram and the vectors in the positive-example set; (3) compute the time penalty w(t) at anchor t from the fitted quadratic curve; (4) take the best penalty-weighted similarity over all anchors as classifier A_i's matching score on candidate clip V_f; (5) accumulate the model matching score and return to step (2) until all classifiers have been matched.
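The per-classifier scoring loop can be sketched as follows. This is a hedged illustration: the patent's exact score formula is not recoverable from the source, so penalty-weighted cosine similarity with a max over anchors is assumed here; `penalty(t)` stands for the fitted quadratic time-penalty curve:

```python
import numpy as np

def match_score(segment_hists, positive_set, penalty, eps=1e-9):
    """Score one action classifier against a candidate clip: slide over
    the anchor positions, weight the best positive-set similarity by the
    time penalty, and keep the overall best."""
    best = -np.inf
    n = len(segment_hists)
    for t, h in enumerate(segment_hists):        # one histogram per anchor t
        sims = [np.dot(h, p) / (np.linalg.norm(h) * np.linalg.norm(p) + eps)
                for p in positive_set]           # cosine similarity to Ψ
        score = max(sims) * penalty(t / max(n - 1, 1))
        best = max(best, score)
    return best
```

The model matching score of the candidate clip is then the sum of `match_score` over all action classifiers.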
Referring to Fig. 6, the schematic diagram of football highlight clip synthesis of the present invention, according to the clip types and highlight length the user requires, summarization is completed by editing the transition between every two highlight clips. The last N frames of highlight clip A and the first N frames of highlight clip B are chosen as the transition region, and the transparency of each frame is adjusted so that the transparency α_A(x) of the x-th frame of A and the transparency α_B(x) of the x-th frame of B satisfy α_A(x) + α_B(x) = 1.
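The transition constraint α_A(x) + α_B(x) = 1 describes a crossfade. A minimal sketch, assuming a linear ramp (the patent only states the sum constraint, not the ramp shape):

```python
def crossfade_alphas(N):
    """Per-frame transparencies for an N-frame transition region:
    frame x of clip A fades out while frame x of clip B fades in,
    with alpha_A(x) + alpha_B(x) = 1 at every frame."""
    alphas = []
    for x in range(N):
        a_b = (x + 1) / (N + 1)          # clip B's weight rises linearly
        alphas.append((1.0 - a_b, a_b))  # (alpha_A, alpha_B)
    return alphas

def blend_frame(frame_a, frame_b, a_a, a_b):
    """Blend two frames (flat lists of pixel values) with the given weights."""
    return [a_a * pa + a_b * pb for pa, pb in zip(frame_a, frame_b)]
```

Applying `blend_frame` over the N transition frames yields the smooth hand-off between adjacent highlight clips described above.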
The present invention can summarize football match video according to the user's requirements: (1) given a highlight length, generate a highlight video of one football match; (2) given a highlight clip type, generate a highlight video of the specified clip type; (3) given both a length and a clip type, generate a highlight video of the specified length for that clip type.
The foregoing is only a basic description of the present invention; any equivalent transformation made according to the technical solution of the present invention shall fall within the protection scope of the present invention.

Claims (3)

1. An automatic football video highlight synthesis method based on an event model, characterized by comprising the following steps:
(1) defining a football video highlight clip as an important football event, performed by one or more persons, that can be decomposed into a combination of multiple actions;
(2) building a core–surrounding event model, in which the action most likely to occur, according to the action occurrence probabilities, is designated the core action and the remaining actions are surrounding actions; the event model comprises three parts: action semantic relations, action temporal relations and local visual features;
(3) building a training set from football match videos and their corresponding text commentary, selecting goals and red/yellow cards as two classes of football highlights, and training the core–surrounding event model on the three aspects of action semantic relations, action temporal relations and local visual features respectively;
(4) inputting a segment of football match video without commentary, extracting football highlight clips from the input video with the trained event model, and giving the matching score between each candidate clip and the model;
(5) sorting each class of football highlight clips by matching score, and automatically synthesizing the higher-scoring clips into a football video highlight reel;
the core–surrounding event model of step (2) requires that an event can be decomposed into a plurality of actions, and models three parts:
(2.1) the action semantic relations comprise the probability of each action occurring, and the probability of each surrounding action occurring together with the core action;
(2.2) the action temporal relations comprise the order of actions during the event, and the ratio of each action's duration to the event duration;
(2.3) the local visual features comprise the gradient and optical-flow statistics of each action over its duration;
step (3) requires that the text commentary of the input football match video contains time records and event records that can be aligned with the video time; for a given type of football highlight, the core–surrounding model is trained as follows:
(3.1) input a series of football match videos and their corresponding text commentary, extract keywords from the commentary, and, according to the event records in the commentary, count the occurrence probability of each keyword and the probability of several keywords occurring together;
(3.2) select the keyword with the highest occurrence probability as the core keyword;
(3.3) align the commentary with the match video, record the occurrence times of keywords, and measure the duration represented by each keyword and the duration of the event;
(3.4) within each keyword's occurrence period, compute the gradient and optical-flow features of spatio-temporal interest points, and accumulate gradient histograms and optical-flow histograms as the local visual features of the action;
Step (4) takes one football match video as input; its highlight-clip extraction proceeds as follows:
(4.1) detect the core action and the surrounding actions in the input video, obtaining the time segment of every action;
(4.2) taking the core action as the reference and using the action sequence relations, determine candidate event time segments as candidate football highlight clips;
(4.3) match each candidate football highlight clip against the event model to obtain its model matching score.
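Steps (4.1) to (4.3) can be sketched as scoring a candidate time window, anchored on the core action, against the model's co-occurrence probabilities, action ordering, and duration ratios. The Python below is a hypothetical illustration under assumed data structures and an assumed multiplicative scoring rule; the patent does not specify this exact formula, and the `EventModel` fields, action names, and numbers are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class EventModel:
    """Core-surrounding event model: one core action plus surrounding actions."""
    core: str
    event_duration: float                          # typical event length (seconds)
    cooccur: dict = field(default_factory=dict)    # surrounding action -> P(co-occurs with core)
    order: dict = field(default_factory=dict)      # action -> expected start offset ratio in [0, 1]
    dur_ratio: dict = field(default_factory=dict)  # action -> expected duration / event duration

def score_candidate(model, detections, start, end):
    """Match detected action segments inside the window [start, end] against
    the model; the core action only anchors the window and is not scored."""
    length = end - start
    score = 0.0
    for action, (t0, t1) in detections.items():
        if action == model.core or not (start <= t0 and t1 <= end):
            continue
        p = model.cooccur.get(action, 0.0)                               # semantic relation (2.1)
        offset_err = abs((t0 - start) / length - model.order.get(action, 0.5))
        ratio_err = abs((t1 - t0) / length - model.dur_ratio.get(action, 0.0))
        score += p * (1.0 - offset_err) * (1.0 - ratio_err)              # sequence relation (2.2)
    return score

# Toy "goal" model: core action 'shot', surrounded by 'cheer' and 'replay'.
goal = EventModel(core="shot", event_duration=20.0,
                  cooccur={"cheer": 0.9, "replay": 0.7},
                  order={"cheer": 0.4, "replay": 0.7},
                  dur_ratio={"cheer": 0.2, "replay": 0.3})

# Detected action time segments (seconds), anchored on one core 'shot' at t=100.
detections = {"shot": (100.0, 102.0), "cheer": (106.0, 110.0), "replay": (113.0, 119.0)}
print(round(score_candidate(goal, detections, 98.0, 120.0), 3))
```

A window that truncates the surrounding actions scores lower, which is how candidate windows of step (4.2) can be compared against one another in step (4.3).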
2. The football video highlight automatic synthesis method based on an event model according to claim 1, characterized in that: in step (1), a video event is taken as the unit of a football highlight clip, and a football video highlight reel is synthesized separately for each type of football highlight clip.
3. The football video highlight automatic synthesis method based on an event model according to claim 1, characterized in that: when combining candidate football highlight clips into the football video highlight reel in step (5), the highlight type and video length are determined according to the user's needs, and transition processing is applied to the beginning and end of each football highlight clip.
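Step (5) and claim 3 amount to ranking candidate clips by matching score and packing the best ones into a reel under a user-chosen length budget, with transitions at each clip boundary. A minimal sketch follows; the greedy packing strategy and the fixed per-clip transition length are assumptions for illustration, not the patent's stated method.

```python
def synthesize_highlights(candidates, max_length, transition=1.0):
    """Rank candidate clips by matching score and greedily pack the best
    ones into a highlight reel under a length budget (seconds).

    candidates: list of (score, start, end) tuples from the event model.
    Returns the chosen (start, end) clips in chronological order.
    """
    reel, used = [], 0.0
    for score, start, end in sorted(candidates, reverse=True):  # highest score first
        clip_len = (end - start) + 2 * transition  # transition at head and tail
        if used + clip_len <= max_length:
            reel.append((start, end))
            used += clip_len
    return sorted(reel)  # play back in match order

candidates = [(0.92, 100.0, 118.0), (0.55, 300.0, 312.0), (0.80, 1500.0, 1520.0)]
print(synthesize_highlights(candidates, max_length=45.0))
# → [(100.0, 118.0), (1500.0, 1520.0)]
```

With a 45-second budget the two highest-scoring clips fit (20 s and 22 s including transitions) and the lowest-scoring clip is dropped; the final reel is re-sorted so the clips appear in match order.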
CN201110294384.9A 2011-09-30 2011-09-30 Football video highlight automatic synthesis method based on event model Active CN102427507B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110294384.9A CN102427507B (en) 2011-09-30 2011-09-30 Football video highlight automatic synthesis method based on event model


Publications (2)

Publication Number Publication Date
CN102427507A CN102427507A (en) 2012-04-25
CN102427507B true CN102427507B (en) 2014-03-05

Family

ID=45961446

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110294384.9A Active CN102427507B (en) 2011-09-30 2011-09-30 Football video highlight automatic synthesis method based on event model

Country Status (1)

Country Link
CN (1) CN102427507B (en)

Families Citing this family (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103440274B (en) * 2013-08-07 2016-09-28 北京航空航天大学 A kind of video event sketch construction described based on details and matching process
CN103886089B (en) * 2014-03-31 2017-12-15 吴怀正 Driving recording video concentration method based on study
CN104135667B (en) * 2014-06-10 2015-06-24 腾讯科技(深圳)有限公司 Video remote explanation synchronization method, terminal equipment and system
CN104038717B (en) 2014-06-26 2017-11-24 北京小鱼在家科技有限公司 A kind of intelligent recording system
CN104883478B (en) 2015-06-17 2018-11-16 北京金山安全软件有限公司 Video processing method and device
CN106993209A (en) * 2016-01-20 2017-07-28 上海慧体网络科技有限公司 A kind of method that short video clip is carried out based on mobile terminal technology
CN105959710B (en) * 2016-05-26 2018-10-26 简极科技有限公司 A kind of live streaming of sport video, shearing and storage system
CA3028328A1 (en) * 2016-06-20 2017-12-28 Gal Oz Method and system for automatically producing video highlights
CN107707931B (en) * 2016-08-08 2021-09-10 阿里巴巴集团控股有限公司 Method and device for generating interpretation data according to video data, method and device for synthesizing data and electronic equipment
US10335690B2 (en) 2016-09-16 2019-07-02 Microsoft Technology Licensing, Llc Automatic video game highlight reel
JP6767042B2 (en) * 2016-09-26 2020-10-14 国立研究開発法人情報通信研究機構 Scenario passage classifier, scenario classifier, and computer programs for it
CN106899809A (en) * 2017-02-28 2017-06-27 广州市诚毅科技软件开发有限公司 A kind of video clipping method and device based on deep learning
JP6472478B2 (en) * 2017-04-07 2019-02-20 キヤノン株式会社 Video distribution apparatus, video distribution method, and program
CN107071528A (en) * 2017-04-20 2017-08-18 暴风集团股份有限公司 A kind of display methods and display device of physical culture schedules
KR102262481B1 (en) * 2017-05-05 2021-06-08 구글 엘엘씨 Video content summary
CN108229285B (en) * 2017-05-27 2021-04-23 北京市商汤科技开发有限公司 Object classification method, object classifier training method and device and electronic equipment
CN107423274B (en) * 2017-06-07 2020-11-20 北京百度网讯科技有限公司 Artificial intelligence-based game comment content generation method and device and storage medium
CN107729821B (en) * 2017-09-27 2020-08-11 浙江大学 Video summarization method based on one-dimensional sequence learning
CN109977735A (en) * 2017-12-28 2019-07-05 优酷网络技术(北京)有限公司 Move the extracting method and device of wonderful
CN110121107A (en) * 2018-02-06 2019-08-13 上海全土豆文化传播有限公司 Video material collection method and device
CN108288475A (en) * 2018-02-12 2018-07-17 成都睿码科技有限责任公司 A kind of sports video collection of choice specimens clipping method based on deep learning
CN110366050A (en) * 2018-04-10 2019-10-22 北京搜狗科技发展有限公司 Processing method, device, electronic equipment and the storage medium of video data
CN110392281B (en) * 2018-04-20 2022-03-18 腾讯科技(深圳)有限公司 Video synthesis method and device, computer equipment and storage medium
US11594028B2 (en) * 2018-05-18 2023-02-28 Stats Llc Video processing for enabling sports highlights generation
CN108900896A (en) * 2018-05-29 2018-11-27 深圳天珑无线科技有限公司 Video clipping method and device
CN109214330A (en) * 2018-08-30 2019-01-15 北京影谱科技股份有限公司 Video Semantic Analysis method and apparatus based on video timing information
CN109407826B (en) * 2018-08-31 2020-04-07 百度在线网络技术(北京)有限公司 Ball game simulation method and device, storage medium and electronic equipment
CN109391856A (en) * 2018-10-22 2019-02-26 百度在线网络技术(北京)有限公司 Video broadcasting method, device, computer equipment and storage medium
CN109710806A (en) * 2018-12-06 2019-05-03 苏宁体育文化传媒(北京)有限公司 The method for visualizing and system of football match data
CN109919078A (en) * 2019-03-05 2019-06-21 腾讯科技(深圳)有限公司 A kind of method, the method and device of model training of video sequence selection
CN111950332B (en) * 2019-05-17 2023-09-05 杭州海康威视数字技术股份有限公司 Video time sequence positioning method, device, computing equipment and storage medium
CN112235631B (en) 2019-07-15 2022-05-03 北京字节跳动网络技术有限公司 Video processing method and device, electronic equipment and storage medium
CN110851621B (en) * 2019-10-31 2023-10-13 中国科学院自动化研究所 Method, device and storage medium for predicting video highlight level based on knowledge graph
CN110933459B (en) * 2019-11-18 2022-04-26 咪咕视讯科技有限公司 Event video clipping method, device, server and readable storage medium
CN110769178B (en) * 2019-12-25 2020-05-19 北京影谱科技股份有限公司 Method, device and equipment for automatically generating goal shooting highlights of football match and computer readable storage medium
CN111757147B (en) * 2020-06-03 2022-06-24 苏宁云计算有限公司 Method, device and system for event video structuring
WO2022007545A1 (en) * 2020-07-06 2022-01-13 聚好看科技股份有限公司 Video collection generation method and display device
CN111935155B (en) * 2020-08-12 2021-07-30 北京字节跳动网络技术有限公司 Method, apparatus, server and medium for generating target video
CN112182297A (en) * 2020-09-30 2021-01-05 北京百度网讯科技有限公司 Training information fusion model, and method and device for generating collection video
CN113537052B (en) * 2021-07-14 2023-07-28 北京百度网讯科技有限公司 Video clip extraction method, device, equipment and storage medium
CN113792654A (en) * 2021-09-14 2021-12-14 湖南快乐阳光互动娱乐传媒有限公司 Video clip integration method and device, electronic equipment and storage medium
CN115119050B (en) * 2022-06-30 2023-12-15 北京奇艺世纪科技有限公司 Video editing method and device, electronic equipment and storage medium
CN115412765B (en) * 2022-08-31 2024-03-26 北京奇艺世纪科技有限公司 Video highlight determination method and device, electronic equipment and storage medium
CN117478824B (en) * 2023-12-27 2024-03-22 苏州元脑智能科技有限公司 Conference video generation method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040167767A1 (en) * 2003-02-25 2004-08-26 Ziyou Xiong Method and system for extracting sports highlights from audio signals
JP4683031B2 (en) * 2007-10-17 2011-05-11 ソニー株式会社 Electronic device, content classification method and program thereof
US8437620B2 (en) * 2010-03-05 2013-05-07 Intel Corporation System, method, and computer program product for custom stream generation
CN102073864B (en) * 2010-12-01 2015-04-22 北京邮电大学 Football item detecting system with four-layer structure in sports video and realization method thereof

Also Published As

Publication number Publication date
CN102427507A (en) 2012-04-25

Similar Documents

Publication Publication Date Title
CN102427507B (en) Football video highlight automatic synthesis method based on event model
Duarte et al. How2sign: a large-scale multimodal dataset for continuous american sign language
CN110245259B (en) Video labeling method and device based on knowledge graph and computer readable medium
US10277946B2 (en) Methods and systems for aggregation and organization of multimedia data acquired from a plurality of sources
Ramanishka et al. Multimodal video description
Tapaswi et al. Book2movie: Aligning video scenes with book chapters
JP5691289B2 (en) Information processing apparatus, information processing method, and program
CN102110399B (en) A kind of assist the method for explanation, device and system thereof
WO2012020667A1 (en) Information processing device, information processing method, and program
Ma et al. Learning to generate grounded visual captions without localization supervision
Oncescu et al. Queryd: A video dataset with high-quality text and audio narrations
Stappen et al. Muse 2020 challenge and workshop: Multimodal sentiment analysis, emotion-target engagement and trustworthiness detection in real-life media: Emotional car reviews in-the-wild
Zhu et al. Languagebind: Extending video-language pretraining to n-modality by language-based semantic alignment
CN112182297A (en) Training information fusion model, and method and device for generating collection video
Chen et al. Sporthesia: Augmenting sports videos using natural language
Narwal et al. A comprehensive survey and mathematical insights towards video summarization
Sah et al. Understanding temporal structure for video captioning
CN113407778A (en) Label identification method and device
Saleem et al. Stateful human-centered visual captioning system to aid video surveillance
Jitaru et al. Lrro: a lip reading data set for the under-resourced romanian language
Stappen et al. MuSe 2020--The First International Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop
Jiao et al. Video highlight detection via region-based deep ranking model
CN114691923A (en) System and method for computer learning
Snoek The authoring metaphor to machine understanding of multimedia
Tian et al. Script-to-Storyboard: A New Contextual Retrieval Dataset and Benchmark

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant