CN100485721C - Method and system for generating video summary description data, and device for browsing the data - Google Patents

Method and system for generating video summary description data, and device for browsing the data

Info

Publication number
CN100485721C
Authority
CN
China
Prior art keywords
video
representative frame
describing
original
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB008147469A
Other languages
Chinese (zh)
Other versions
CN1382288A (en
Inventor
金在坤
张现盛
金纹哲
金镇雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electronics and Telecommunications Research Institute ETRI
Original Assignee
Electronics and Telecommunications Research Institute ETRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electronics and Telecommunications Research Institute ETRI filed Critical Electronics and Telecommunications Research Institute ETRI
Publication of CN1382288A publication Critical patent/CN1382288A/en
Application granted granted Critical
Publication of CN100485721C publication Critical patent/CN100485721C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/738 Presentation of query results
    • G06F16/739 Presentation of query results in form of a video summary, e.g. the video summary being a video sequence, a composite still image or having synthesized frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74 Browsing; Visualisation therefor
    • G06F16/745 Browsing; Visualisation therefor the internal structure of a single video sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present invention relates to a video summary description scheme for describing a video summary by metadata. The video summary provides an overview functionality, which makes it feasible to understand the overall contents of the original video within a short time, and navigation and browsing functionalities, which make it feasible to search for the desired video contents efficiently. According to the present invention, the HierarchicalSummary Description Scheme (DS) comprises at least one HighlightLevel DS and optionally comprises the SummaryThemeList DS. The HighlightLevel DS describes a highlight level and may have zero or more lower-level HighlightLevel DSs. The HighlightLevel DS comprises one or more HighlightSegment DSs, each of which describes the information of a highlight segment constituting the video summary of that highlight level. The HighlightSegment DS comprises the VideoSegmentLocator DS for describing the time information of the corresponding segment interval. The HighlightSegment DS may also comprise the ImageLocator DS for describing the representative image information of the corresponding segment, the SoundLocator DS for describing the representative sound information, and the AudioSegmentLocator DS for describing the audio segment information constituting the audio summary.

Description

Method and system for generating video summary description data, and device for browsing the data
Technical field
The present invention relates to a video summary description scheme for efficient overview and browsing of video, and to a method and system for generating video summary description data that describe a video summary according to the video summary description scheme.
The technical field of the invention is content-based video indexing and browsing/searching, in which a video is summarized on the basis of its content and the summary is then described.
Background art
Video summaries are mainly divided into dynamic summaries and static summaries. The video description scheme of the present invention is intended to describe dynamic summaries and static summaries efficiently within a single, unified description scheme.
In general, because existing video summaries and description schemes simply provide information on the video intervals included in the video summary, they are limited to conveying the overall video content by playing back the summary video.
In many cases, however, it is necessary not only to overview the whole content through the summary video, but also to browse: to identify relevant parts through the overview and then revisit them.
In addition, existing video summaries provide only the video intervals considered important according to criteria determined by the video summary provider. Therefore, if the criteria of the user and the provider differ, or if the user has particular criteria, users cannot obtain the video summary they need.
That is, although existing summary videos let the user choose a summary video of a desired level by providing summary videos at several levels, the user's range of choice is limited because the user cannot select by the content of the summary video.
U.S. Patent 5,821,945, entitled "Method and apparatus for video browsing based on content and structure", represents video in a compact form and, through this representation, provides a browsing function for accessing the video having the desired content.
However, that patent uses a static summary based on representative frames. Although the summary is built from the representative frames of video shots, each representative frame provides only the visual information representing its shot, so the patent is limited in the information its summary can convey.
In contrast to that patent, the present video description scheme and browsing method use a dynamic summary based on video segments.
The MPEG-7 Description Schemes (V0.5), published in July 1999 as ISO/IEC JTC1/SC29/WG11 MPEG-7 output document N2844, propose a video summary description scheme. Because that scheme describes the interval information of each video segment of a dynamic summary video, it provides the basic function of describing a dynamic summary, but it has the following problems.
First, it provides no access to the original video from the video segments that make up the summary video. That is, after overviewing the summary video, users want to access the original video, on the basis of the summary content, to obtain more detailed information; the existing scheme cannot satisfy this need.
Second, the existing scheme does not provide sufficient functions for describing audio summaries.
Finally, when an event-based summary is expressed, repeated descriptions and complicated searches are unavoidable.
Summary of the invention
An object of the present invention is to provide a scalable video summary description scheme that includes representative frame information and representative sound information for each video interval included in the summary video, that gives the user a choice over the summary video content through user-customizable event-based summaries, and that makes efficient browsing feasible, and to provide a method and system for generating video summary description data using the description scheme.
To achieve this object, the HierarchicalSummary DS according to the present invention, which can be instantiated, comprises at least one HighlightLevel DS describing a highlight level, and the HighlightLevel DS comprises at least one HighlightSegment DS describing the information of a highlight segment forming the summary video of that highlight level.
Preferably, the HighlightLevel DS comprises at least one lower-level HighlightLevel DS.
Preferably, the HighlightSegment DS comprises a VideoSegmentLocator DS describing the time information of the corresponding highlight segment or the video itself.
Preferably, the HighlightSegment DS further comprises an ImageLocator DS describing the representative frame of the corresponding highlight segment.
Preferably, the HighlightSegment DS further comprises a SoundLocator DS describing the representative sound information of the corresponding highlight segment.
Preferably, the HighlightSegment DS further comprises both an ImageLocator DS describing the representative frame of the corresponding highlight segment and a SoundLocator DS describing the representative sound information of the corresponding highlight segment.
Preferably, the ImageLocator DS describes the time information or the image data of the representative frame of the video interval corresponding to the highlight segment.
Preferably, the HighlightSegment DS further comprises an AudioSegmentLocator DS describing the audio segment information forming the audio summary of the corresponding highlight segment.
Preferably, the AudioSegmentLocator DS describes the time information or the audio data of the audio interval of the corresponding highlight segment.
Preferably, the HierarchicalSummary DS comprises a SummaryComponentList that describes and enumerates all the SummaryComponentTypes included in the HierarchicalSummary DS.
In addition, the HierarchicalSummary DS preferably comprises a SummaryThemeList DS that enumerates the events or themes included in the summary and describes their IDs, so that an event-based summary can be described and the user can browse the summary video by the events or themes described in the SummaryThemeList.
Preferably, the SummaryThemeList DS comprises an arbitrary number of SummaryTheme elements; each SummaryTheme has an id attribute identifying the corresponding event or theme, and may further have a parentID attribute describing the id of the upper-level event or theme.
Preferably, if all the highlight segments and highlight levels forming a highlight level share common events or themes, the HighlightLevel DS has a themeIds attribute describing the id attributes of the common events or themes.
Preferably, the HighlightSegment DS has a themeIds attribute that describes those id attributes and thereby the events or themes of the corresponding highlight segment.
According to the present invention, there is provided a method for generating video summary description data from an input original video according to a video summary description scheme, comprising the steps of: (a) analyzing the input original video and producing a video analysis result; (b) defining a summarization rule for selecting summary video intervals; (c) selecting, from the original video, the video intervals that can summarize the video content, according to the video analysis result and the summarization rule, and forming summary video interval information; (d) extracting representative frames according to the summary video interval information; and (e) generating video summary description data according to a HierarchicalSummary DS that enables hierarchical browsing based on the summary video interval information and the representative frames.
In addition, according to the present invention, there is provided a system for generating video summary description data from an input original video according to a video summary description scheme, comprising: video analysis means for analyzing the input original video and producing a video analysis result; summarization rule definition means for defining a summarization rule for selecting summary video intervals; summary video interval selection means for selecting, according to the video analysis result from the video analysis means and the summarization rule from the summarization rule definition means, the video intervals that can summarize the video content of the original video, and for forming summary video interval information; representative frame extraction means for outputting the representative frames representing the summary video intervals according to the summary video interval information from the summary video interval selection means; and video summary description means for generating video summary description data with the HierarchicalSummary DS from the summary video interval information supplied by the summary video interval selection means and the representative frame information supplied by the representative frame extraction means.
In addition, according to the present invention, there is provided a device for browsing video summary description data, wherein the video summary description data have a HierarchicalSummary DS for describing a video summary, and the HierarchicalSummary DS comprises: a HighlightLevel DS comprising at least one HighlightSegment DS that describes information about the highlight segment corresponding to a summary video interval; and a SummaryThemeList DS for enumerating the events or themes included in the summary and enabling the user to overview and browse by event. The browsing device comprises: a video playback part for playing the original video or the video summary; an original video representative frame part for playing the representative frames of the original video; a first video summary representative frame part for playing the video intervals of a first summary level; a second video summary representative frame part for playing the video intervals of a second summary level, wherein the second summary level summarizes in finer detail than the first summary level; a level selection part for selecting the first summary level or the second summary level so that the video playback part can play the selected summary level; and an event selection part that enumerates the events or themes, presenting the SummaryThemeList DS to the user for browsing the desired event.
Description of drawings
Embodiments of the present invention are described with reference to the accompanying drawings, in which:
Fig. 1 is a block diagram of a system for generating video summary description data according to the description scheme of the present invention;
Fig. 2 is a UML (Unified Modeling Language) diagram showing the data structure of the HierarchicalSummary DS, which describes the video summary description scheme of the present invention;
Fig. 3 is a diagram of the user interface of a tool that takes as input video summary description data described with the description scheme of Fig. 2 and plays and browses the summary video;
Fig. 4 is a diagram showing the data and control flow of hierarchical browsing using the summary video of the present invention.
Embodiment
The present invention is described below through preferred embodiments with reference to the accompanying drawings, in which the same reference numerals identify the same or similar parts.
Fig. 1 is a block diagram of a system for generating video summary description data according to the description scheme of the present invention.
As shown in Fig. 1, the apparatus for generating video description data of the present invention comprises a feature extraction part 101, an event detection part 102, an episode detection part 103, a summary video interval selection part 104, a summarization rule definition part 105, a representative frame extraction part 106, a representative sound extraction part 107 and a video summary description part 108.
The feature extraction part 101 takes the original video as input and extracts the features needed to generate the summary video. Typical features include shot boundaries, camera motion, caption regions, frontal (face) regions and so on.
In the feature extraction step, the features are extracted, and the feature types and the video time intervals in which they are detected are output to the event detection step in the form (feature type, feature number, time interval).
For example, in the case of camera motion, (camera motion, 1, 100~150) indicates that the first camera motion was detected in frames 100~150.
The event detection part 102 detects the key events contained in the original video. Because these events must represent the original video content well and serve as the basis for generating the summary video, they are generally defined differently according to the genre of the original video.
The events may represent a higher semantic level, or they may be visual features from which a higher-level meaning can be directly inferred. For example, in the case of a soccer video, goals, shots on goal, captions, replays and so on can be defined as events.
The event detection part 102 outputs the type and time interval of each detected event in the form (event type, event number, time interval). For example, the event information indicating that the first shot on goal occurs between frames 200 and 300 is output as (shot on goal, 1, 200~300).
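The (type, number, time interval) tuples output by the feature extraction and event detection parts can be modeled as a small record. The following is a minimal sketch under stated assumptions, not the patent's implementation: the names `Detection` and `parse_interval` are hypothetical, and the `~` separator simply mirrors the notation of the examples in the text.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    """One detected feature or event: (type, number, time interval)."""
    kind: str         # e.g. "camera motion" or "shot on goal"
    number: int       # sequence number among detections of the same kind
    start_frame: int  # first frame of the interval
    end_frame: int    # last frame of the interval


def parse_interval(text: str) -> tuple[int, int]:
    """Parse an interval written as 'start~end' into two frame numbers."""
    start, end = text.split("~")
    return int(start), int(end)


# The two examples given in the text, rebuilt as records.
camera_motion = Detection("camera motion", 1, *parse_interval("100~150"))
shot_on_goal = Detection("shot on goal", 1, *parse_interval("200~300"))
```

Using one record type for both features and events is a design convenience of this sketch; the text only states that both outputs share the same three-field form.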
The episode detection part 103 divides the video, on the basis of the detected events, into episodes, which are units larger than events and follow the flow of the story. After a key event is detected, an episode is detected that also includes the accompanying events that follow the key event. For example, in the case of a soccer video, goals and shots on goal can be key events, while bench scenes, crowd scenes, goal celebration scenes, goal replay scenes and the like are accompanying events of those key events.
That is, episodes are detected on the basis of goals and shots on goal.
The episode detection information is output in the form (episode number, time interval, priority, feature shot, related event information). Here, the episode number is the sequence number of the episode, and the time interval is the time interval of the episode expressed in units of shots. The priority indicates the importance of the episode. The feature shot is the number of the shot that contains the most important information among the shots forming the episode, and the related event information gives the event numbers of the events related to the episode. For example, if the episode detection information is (episode 1, 4~6, 1, 5, goal 1, caption 3), the first episode comprises the 4th to 6th shots, its priority is the highest (1), its feature shot is the 5th shot, and its related events are the first goal and the third caption.
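The episode record described above can be sketched as a small data class. This is only an illustration: the class name `Episode` and its field names are hypothetical, and the values reproduce the single example given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    number: int        # sequence number of the episode
    first_shot: int    # time interval, expressed in shot units
    last_shot: int
    priority: int      # importance of the episode; 1 is the highest
    feature_shot: int  # shot carrying the most important information
    related_events: tuple[tuple[str, int], ...]  # (event type, event number)


# The example from the text: first episode, shots 4 to 6, priority 1,
# feature shot 5, related to the first goal and the third caption.
episode = Episode(1, 4, 6, 1, 5, (("goal", 1), ("caption", 3)))

# Sanity check assumed by this sketch: the feature shot lies inside the episode.
feature_shot_inside = episode.first_shot <= episode.feature_shot <= episode.last_shot
```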
The summary video interval selection part 104 selects, on the basis of the detected episodes, the video intervals that summarize the original video content well. The criteria for selecting the intervals are given by the summarization rule predefined by the summarization rule definition part 105.
The summarization rule definition part 105 defines the rule for selecting summary intervals and outputs a control signal for selecting them. The summarization rule definition part 105 also outputs to the video summary description part 108 the summary event types that serve as the basis for selecting the summary video intervals.
The summary video interval selection part 104 outputs the time information of the selected summary video intervals in units of frames, together with the event type corresponding to each video interval. That is, output of the form (100~200, goal), (500~700, shot on goal) and so on indicates that the video segments selected as summary video intervals are frames 100~200, frames 500~700 and so on, and that the events of the two segments are a goal and a shot on goal, respectively. In addition, information such as a file name can be output to facilitate access to a separate video composed only of the summary video intervals.
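As a rough illustration of how a summarization rule could drive interval selection, the sketch below treats the rule as a set of allowed event types. That rule form is an assumption made here for brevity; the text does not specify how rules are expressed. `SummaryInterval` and `select_intervals` are hypothetical names, and the caption candidate is made-up data added so that the filter has something to reject.

```python
from typing import NamedTuple


class SummaryInterval(NamedTuple):
    start_frame: int
    end_frame: int
    event_type: str


def select_intervals(candidates, allowed_events):
    """Keep the candidate intervals whose event type the rule allows."""
    return [c for c in candidates if c.event_type in allowed_events]


candidates = [
    SummaryInterval(100, 200, "goal"),
    SummaryInterval(300, 400, "caption"),       # invented candidate
    SummaryInterval(500, 700, "shot on goal"),
]

# Assumed rule: summarize by goals and shots on goal only.
selected = select_intervals(candidates, allowed_events={"goal", "shot on goal"})
```

The result has the same (frame interval, event type) shape as the output described in the text.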
Once the summary video intervals have been selected, the representative frame extraction part 106 and the representative sound extraction part 107 extract the representative frames and representative sounds, respectively, using the summary video interval information.
The representative frame extraction part 106 outputs the frame number of the image representing each summary video interval, or outputs the image data.
The representative sound extraction part 107 outputs the sound data representing each summary video interval, or outputs the time interval of the sound.
The video summary description part 108 describes the relevant information according to the hierarchical summary description scheme of the present invention shown in Fig. 2, so as to make efficient overview and browsing functions feasible.
The main information of the hierarchical summary description scheme comprises the summary event types of the summary video, the time information describing each summary video interval, the representative frames, the representative sounds and the event type of each interval.
The video summary description part 108 outputs video summary description data according to the description scheme shown in Fig. 2.
Fig. 2 is a UML (Unified Modeling Language) diagram showing the data structure of the HierarchicalSummary DS, which describes the video summary description scheme of the present invention.
The HierarchicalSummary DS 201 describes a video summary and is composed of one or more HighlightLevel DSs 202 and zero or one SummaryThemeList DS 203.
The SummaryThemeList DS provides event-based overview and browsing functions by enumerating the themes or events that form the summary. The HighlightLevel DS 202 is composed of a number of HighlightSegment DSs 204 and zero or more HighlightLevel DSs, where the number of HighlightSegment DSs 204 equals the number of video intervals forming the summary video of that level.
The HighlightSegment DS describes the information corresponding to each summary video interval. It is composed of one VideoSegmentLocator DS 205, zero or more ImageLocator DSs 206, zero or more SoundLocator DSs 207 and an AudioSegmentLocator 208.
The HierarchicalSummary DS is described in more detail below.
The HierarchicalSummary DS has a SummaryComponentList attribute, which explicitly indicates the summary types contained in the HierarchicalSummary DS.
The SummaryComponentList is derived from the SummaryComponentType and is described by enumerating all the included summary component types.
The SummaryComponentList has five types: key frames, key video clips, key audio clips, key events and unconstrained.
Key frames denotes a key frame summary composed of representative frames. Key video clips denotes a key video clip summary composed of a set of key video intervals. Key events denotes a summary composed of the video intervals corresponding to events or themes. Key audio clips denotes a key audio clip summary composed of a set of representative audio intervals. Unconstrained denotes a user-defined summary type other than those above.
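The five summary component types lend themselves to an enumeration. The sketch below is illustrative only: the member names and string values are descriptive labels chosen here, not identifiers confirmed by the text, and the sample list is hypothetical.

```python
from enum import Enum


class SummaryComponentType(Enum):
    KEY_FRAMES = "key frames"            # summary made of representative frames
    KEY_VIDEO_CLIPS = "key video clips"  # summary made of key video intervals
    KEY_AUDIO_CLIPS = "key audio clips"  # summary made of representative audio intervals
    KEY_EVENTS = "key events"            # intervals corresponding to events or themes
    UNCONSTRAINED = "unconstrained"      # user-defined type other than the above


# A SummaryComponentList enumerates every component type the summary contains.
summary_component_list = [
    SummaryComponentType.KEY_VIDEO_CLIPS,
    SummaryComponentType.KEY_FRAMES,
    SummaryComponentType.KEY_EVENTS,
]
```

A browsing tool could inspect such a list to decide, for example, whether an event selection control should be shown at all.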
In addition, to describe event-based summaries, the HierarchicalSummary DS may comprise a SummaryThemeList DS that enumerates the events (or themes) included in the summary and describes their IDs.
The SummaryThemeList comprises an arbitrary number of SummaryTheme elements. Each SummaryTheme has an id attribute of type ID and optionally a parentID attribute.
The SummaryThemeList DS allows the user to browse the summary video by each event or theme described in the SummaryThemeList. That is, an application tool that takes the description data as input analyzes the SummaryThemeList DS and presents this information to the user, so that the user can select the desired theme.
If the themes are enumerated as a simple flat list, however, it may not be easy to find the desired theme when the number of themes is large.
Therefore, by representing the themes in a tree structure similar to a ToC (table of contents), the user can find the desired theme and then browse by theme efficiently.
To this end, the present invention allows the parentID attribute to be used optionally in the SummaryTheme. The parentID indicates the upper-level element (upper-level theme) in the tree structure.
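To illustrate how an application might turn a flat list of themes with optional parent ids into a table-of-contents tree, here is a minimal sketch. The theme ids (`E0` to `E3`) and the function names are hypothetical; only the id/parentID relationship comes from the text.

```python
from collections import defaultdict

# (id, parentID) pairs; parentID is None for top-level themes.
summary_themes = [
    ("E0", None),
    ("E1", "E0"),
    ("E2", "E0"),
    ("E3", None),
]


def build_theme_tree(themes):
    """Group theme ids under their parent id (None collects the roots)."""
    children = defaultdict(list)
    for theme_id, parent_id in themes:
        children[parent_id].append(theme_id)
    return dict(children)


def render(children, parent=None, depth=0):
    """Return the tree as indented lines, like a table of contents."""
    lines = []
    for theme_id in children.get(parent, []):
        lines.append("  " * depth + theme_id)
        lines.extend(render(children, theme_id, depth + 1))
    return lines


tree = build_theme_tree(summary_themes)
toc = render(tree)
```

Here `toc` lists `E1` and `E2` indented beneath `E0`, followed by the top-level `E3`.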
The HierarchicalSummary DS of the present invention comprises a plurality of HighlightLevel DSs, and each HighlightLevel DS comprises one or more HighlightSegment DSs corresponding to the video segments (or intervals) forming the summary video.
The HighlightLevel DS has a themeIds attribute of type IDREFS.
The themeIds attribute describes the ids of the themes and events that are common to the child HighlightLevel DSs of the HighlightLevel DS and to all the HighlightSegment DSs contained in that level; these ids are those described in the SummaryThemeList DS.
The themeIds attribute can denote several events. In an event-based summary, letting themeIds express the theme types common to the highlight segments of a level solves the problem of the same id being repeated unnecessarily in every segment forming that level.
The HighlightSegment DS comprises one VideoSegmentLocator DS, one or more ImageLocator DSs, zero or one SoundLocator DS and zero or one AudioSegmentLocator DS.
Here, the VideoSegmentLocator DS describes the time information of a video segment forming the summary video, or the video itself. The ImageLocator DS describes the image data information of a representative frame of the video segment. The SoundLocator DS describes the sound information representing the corresponding video segment interval. The AudioSegmentLocator DS describes the time information of an interval forming the audio summary, or the audio information itself.
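The containment structure described above can be sketched with plain data classes. This is an illustrative model under simplifying assumptions, not the normative schema: locators are reduced to frame numbers or (start, end) frame pairs, although the text also allows a locator to carry the media itself, and the Python field names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

FrameInterval = tuple[int, int]  # (start frame, end frame); assumed representation


@dataclass
class HighlightSegment:
    video_segment: FrameInterval                          # VideoSegmentLocator
    key_frames: list[int] = field(default_factory=list)   # ImageLocator (frame numbers)
    sound: Optional[FrameInterval] = None                 # SoundLocator
    audio_segment: Optional[FrameInterval] = None         # AudioSegmentLocator
    theme_ids: list[str] = field(default_factory=list)    # themeIds


@dataclass
class HighlightLevel:
    segments: list[HighlightSegment]
    children: list["HighlightLevel"] = field(default_factory=list)  # lower levels
    theme_ids: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.segments:
            raise ValueError("a HighlightLevel needs at least one HighlightSegment")


@dataclass
class HierarchicalSummary:
    levels: list[HighlightLevel]
    theme_list: Optional[list[str]] = None  # SummaryThemeList (zero or one)

    def __post_init__(self) -> None:
        if not self.levels:
            raise ValueError("a HierarchicalSummary needs at least one HighlightLevel")


summary = HierarchicalSummary(
    levels=[HighlightLevel(segments=[
        HighlightSegment((100, 200), key_frames=[150], theme_ids=["goal"]),
        HighlightSegment((500, 700), key_frames=[600], theme_ids=["shot on goal"]),
    ])],
    theme_list=["goal", "shot on goal"],
)
```

The two `__post_init__` checks encode the "at least one" constraints stated for highlight levels and highlight segments; the remaining cardinalities are left unchecked in this sketch.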
The HighlightSegment DS has a themeIds attribute. Using the ids defined in the SummaryThemeList, themeIds describes which of the themes or events described in the SummaryThemeList DS relate to the highlight segment.
The themeIds attribute can denote a plurality of events. Allowing one highlight segment to have several themes is an effective technique of the present invention: it solves the problem that arises when existing methods are used for event-based summaries, namely the unavoidable repetition of descriptions caused by describing the video segments separately for each event (or theme).
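The benefit of multi-valued theme ids can be seen in a small lookup sketch: each segment is described once, yet it is found under every theme it carries, and themes declared on the level apply to all of its segments. The dictionary layout, theme names and function name below are hypothetical illustrations, not the description format itself.

```python
# Hypothetical description data: one highlight level with three segments.
level = {
    "themeIds": ["soccer"],  # themes common to every segment in the level
    "segments": [
        {"frames": (100, 200), "themeIds": ["goal", "teamA"]},
        {"frames": (500, 700), "themeIds": ["shot"]},
        {"frames": (900, 950), "themeIds": ["goal", "teamB"]},
    ],
}


def segments_for_theme(level, theme_id):
    """Return frame intervals tagged with theme_id, directly or via the level."""
    if theme_id in level["themeIds"]:
        # A level-wide theme covers every segment without being repeated on each.
        return [seg["frames"] for seg in level["segments"]]
    return [seg["frames"] for seg in level["segments"] if theme_id in seg["themeIds"]]


goal_intervals = segments_for_theme(level, "goal")
team_a_intervals = segments_for_theme(level, "teamA")
```

The first segment appears in both results although it is described only once, which is the repetition the text says a per-event description would incur.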
In describing the highlight segments forming the summary video, existing hierarchical summary description schemes describe only the time information of the highlight video intervals. In contrast, the present invention introduces the HighlightSegment DS, which uses the VideoSegmentLocator DS, the ImageLocator DS and the SoundLocator DS to describe the video interval information, representative frame information and representative sound information of each highlight segment. This makes possible both an overview through the highlight segment video and effective navigation and browsing using the representative frames and representative sounds of the segments.
By using the SoundLocator DS, which can describe the representative sound corresponding to a video interval, the characteristic sound of the interval can be presented: for example, a gunshot, a shout, a commentator's remark in soccer (for example, "goal" or "shot"), an actor's name or a particular word in a drama. The user can thus roughly judge in a short time whether the interval is an important one containing the desired content, or what content it contains, so that effective browsing is possible without playing the video interval.
Fig. 3 is a diagram of the user interface of a tool that takes as input video summary description data described with the description scheme of Fig. 2 and plays and browses the summary video.
The video playback part 301 plays the original video or the summary video under the user's control. The original video representative frame part 305 displays the representative frames of the original video shots; that is, it consists of a series of reduced-size images.
The representative frames of the original video shots are described not by the HierarchicalSummary DS of the present invention but by an additional description scheme, and can be used when that description data is provided together with the summary description data described by the HierarchicalSummary DS.
By clicking a representative frame, the user accesses the original video shot corresponding to it.
The summary video level 0 representative frame and representative sound part 307 and the summary video level 1 representative frame and representative sound part 306 display the frame and sound information representing each video interval of summary video level 0 and summary video level 1, respectively. That is, each consists of a series of reduced-size images and icons representing sounds.
If the user clicks a representative frame in a summary video representative frame and representative sound part, the user accesses the original video interval corresponding to that representative frame. If the user clicks the representative sound icon associated with a representative frame of the summary video, the representative sound of that video interval is played.
Summary video control section 302 input users select control to play the summary video.Under the situation that multistage summary video is provided, the user selects a part 303 to select the summary of required level by level, carries out general view and browses.Incident selects part 304 that incident and the theme that is provided by summary topic list is provided, and the required incident of user by selecting, carries out general view and browses.Generally speaking, this has realized the summary of customization type.
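The effect of the level selection and event selection controls can be sketched as a simple filter over highlight segments. This is only an illustration under stated assumptions: the `Segment` record, its fields, and the `select_segments` function are hypothetical names, and each segment is assumed to carry its summary level and the themes it belongs to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start_sec: float
    end_sec: float
    level: int          # 0 is the shortest (coarsest) summary
    themes: frozenset   # events or themes taken from the summary theme list


def select_segments(segments, level, theme=None):
    """Return the segments of one summary level, optionally limited to one theme, in play order."""
    chosen = [
        s for s in segments
        if s.level == level and (theme is None or theme in s.themes)
    ]
    return sorted(chosen, key=lambda s: s.start_sec)


segments = [
    Segment(600.0, 620.0, 0, frozenset({"goal"})),
    Segment(120.0, 130.0, 1, frozenset({"shot"})),
    Segment(600.0, 625.0, 1, frozenset({"goal"})),
    Segment(900.0, 910.0, 1, frozenset({"shot"})),
]

# A user who picks level 1 and the theme "shot" gets a customized summary.
playlist = select_segments(segments, level=1, theme="shot")
```

Passing `theme=None` corresponds to using only the level selection, which plays the whole summary of that level.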
Fig. 4 is a diagram illustrating the data and control flow for hierarchical browsing using the summary video of the present invention.
Browsing is performed by accessing the browsing data in the manner of Fig. 4 through the user interface of Fig. 3. The browsing data are the summary video, the representative frames of the summary video, the original video 406, and the original video representative frames 405.
Assume that the summary video has two levels; needless to say, it may have more than two. Summary video level 0 401 summarizes the video in a shorter time than summary video level 1 403; that is, summary video level 1 contains more content than summary video level 0. The summary video level 0 representative frames 402 are the representative frames of summary video level 0, and the summary video level 1 representative frames 404 are the representative frames of summary video level 1.
The summary video and the original video are played by the video playback part 301 of Fig. 3. The summary video level 0 representative frames are shown in the summary video level 0 representative frame and representative sound part 306. The summary video level 1 representative frames are shown in the summary video level 1 representative frame and representative sound part 307, and the original video representative frames are shown in the original video representative frame part 305.
The hierarchical browsing method shown in Fig. 4 can follow various hierarchical paths, as in the following examples:
Case 1: (1)-(2)
Case 2: (1)-(3)-(5)
Case 3: (1)-(3)-(4)-(6)
Case 4: (7)-(5)
Case 5: (7)-(4)-(6)
The overall navigation scheme is as follows.
First, the user grasps the overall content of the original video by watching its summary video. Here, either summary video level 0 or summary video level 1 may be played. If the user wants to browse in more detail after watching the summary video, the video interval of interest is identified through the summary video representative frames. If the scene being sought is identified among the summary video representative frames, the user plays it by directly accessing the interval of the original video linked to that representative frame. If more detailed information is needed, the user can reach the desired original video by examining the representative frames of the next level, or by hierarchically examining the content of the original video representative frames.
Browsing for and accessing the desired content while playing the original video may take a long time, whereas this hierarchical browsing technique accesses the content of the original video directly through the hierarchical representative frames and can therefore reduce browsing time significantly.
Existing general video indexing and browsing techniques divide the original video into shots, form a representative frame for each shot, and let the user access a shot after finding the desired one among the representative frames.
In that case, because the number of shots in the original video is very large, finding the desired content among the numerous representative frames requires a great deal of time and effort.
In the present invention, hierarchical representative frames are built using the summary video representative frames, which makes fast access to the desired video feasible.
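The step of moving from a chosen representative frame to the representative frames of the next level can be sketched as a search for frames close in time to the chosen one. The text does not state how neighboring frames are determined, so the fixed time window used here is purely an assumption, as are the function name and the sample times.

```python
from bisect import bisect_left, bisect_right


def frames_near(frame_times, center_sec, window_sec):
    """Return the frame times within +/- window_sec of center_sec.

    frame_times must be sorted in ascending order.
    """
    lo = bisect_left(frame_times, center_sec - window_sec)
    hi = bisect_right(frame_times, center_sec + window_sec)
    return frame_times[lo:hi]


# Hypothetical times (in seconds) of the level 1 representative frames.
level1_frames = [95.0, 130.0, 410.0, 598.0, 612.0, 640.0, 905.0]

# The user picked the level 0 representative frame at 610 s and wants more detail.
candidates = frames_near(level1_frames, center_sec=610.0, window_sec=30.0)
```

The same function could be applied once more to the original video representative frames when the next-level frames are still not detailed enough.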
Case 1: Play summary video level 0, and access the original video directly from a summary video level 0 representative frame.
Case 2: Play summary video level 0, select the most interesting representative frame from the summary video level 0 representative frames, identify the desired scene among the summary video level 1 representative frames neighboring that representative frame in order to obtain more detailed information before accessing the original video, and then access the original video.
Case 3: When, in Case 2, it is difficult to access the original video from the summary video level 1 representative frames, select the most interesting representative frame to obtain more detail, identify the desired scene among the original video representative frames neighboring that representative frame, and then access the original video through the original video representative frame.
Cases 4 and 5 start from playback of summary video level 1; their paths are similar to those of the cases above.
When applied to a server/client environment, the present invention can provide a system in which multiple clients access one server and perform video overview and browsing. The server is equipped with a summary video description data generation system that receives the original video, generates video summary description data according to the hierarchical summary description scheme, and links the original video with the video summary description data. A client accesses the server over a communication network, overviews the video using the video summary description data, and browses and navigates the video by accessing the original video.
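The server-side generation step can be sketched as a short pipeline from analysis results to serialized description data. Everything specific here is an assumption made for illustration: the per-shot importance score, the threshold used as the summary rule, the choice of the temporal midpoint as the representative frame, and the JSON layout are not taken from the invention.

```python
import json


def select_intervals(shots, min_score):
    """Summary rule (assumed): keep shots whose importance score reaches min_score."""
    return [s for s in shots if s["score"] >= min_score]


def representative_frame(interval):
    """Assumed choice: the temporal midpoint of the interval."""
    return (interval["start"] + interval["end"]) / 2


def build_description(shots, level_thresholds):
    """Build one list of highlight segments per summary level."""
    levels = []
    for level, min_score in enumerate(level_thresholds):
        segments = [
            {
                "start": s["start"],
                "end": s["end"],
                "key_frame": representative_frame(s),
            }
            for s in select_intervals(shots, min_score)
        ]
        levels.append({"level": level, "segments": segments})
    return {"summary_levels": levels}


# Hypothetical analysis result: shot boundaries in seconds with a score in [0, 1].
shots = [
    {"start": 0.0, "end": 10.0, "score": 0.2},
    {"start": 10.0, "end": 30.0, "score": 0.9},
    {"start": 30.0, "end": 40.0, "score": 0.6},
]

# A stricter threshold yields the shorter level 0 summary.
description = build_description(shots, level_thresholds=[0.8, 0.5])
payload = json.dumps(description)  # what a server might send to a client
```

A client would parse the payload, play the segments of the chosen level, and use the `start` values to seek into the original video held on the server.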
Although the present invention has been described with reference to preferred embodiments, these embodiments do not limit the invention and serve only as examples. In addition, those skilled in the art will appreciate that the embodiments herein may be modified and changed without departing from the spirit and scope of the invention as defined by the following claims.

Claims (3)

1. A method for generating video summary description data according to a video summary description scheme from an input original video, comprising the steps of:
(a) analyzing the input original video and producing a video analysis result;
(b) defining a summary rule for selecting summary video intervals;
(c) selecting, from the original video, summary video intervals capable of summarizing the video content according to the video analysis result and the summary rule, and forming summary video interval information;
(d) extracting representative frames according to the summary video interval information; and
(e) generating video summary description data according to a hierarchical summary description scheme, the hierarchical summary description scheme enabling hierarchical browsing according to the summary video interval information and the representative frames.
2. A system for generating video summary description data according to a video summary description scheme from an input original video, comprising:
video analysis means for analyzing the input original video and producing a video analysis result;
summary rule definition means for defining a summary rule for selecting summary video intervals;
summary video interval selection means for selecting video intervals capable of summarizing the video content of the original video according to the video analysis result from the video analysis means and the summary rule from the summary rule definition means, and for outputting summary video interval information;
representative frame extraction means for outputting representative frames that represent the summary video intervals according to the summary video interval information from the summary video interval selection means; and
video summary description means for generating video summary description data using a hierarchical summary description scheme from the summary video interval information input from the summary video interval selection means and the representative frame information input from the representative frame extraction means.
3. A device for browsing video summary description data, wherein the video summary description data have a hierarchical summary description scheme for describing a video summary,
wherein the hierarchical summary description scheme comprises:
a highlight level description scheme comprising at least one highlight segment description scheme that describes information on a highlight segment corresponding to one summary video interval; and
a summary theme list description scheme for enumerating the events or themes included in the summary and enabling the user to overview and browse according to the events,
wherein the browsing device comprises:
a video playback component for playing the original video or the video summary;
an original video representative frame component for playing the representative frames of the original video;
a first video summary representative frame component for displaying a first summary level of the video intervals;
a second video summary representative frame component for displaying a second summary level of the video intervals, wherein the second summary level summarizes in finer detail than the first summary level;
a level selection component for selecting the first summary level or the second summary level so that the video playback component can play the selected summary level; and
an event selection component for enumerating events or themes and providing the user with the summary theme list description scheme for browsing a desired event.
CNB008147469A 1999-10-11 2000-09-29 Method and system for generating video summary description data, and device for browsing the data Expired - Fee Related CN100485721C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1999/43712 1999-10-11
KR19990043712 1999-10-11

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN2008101619850A Division CN101398843B (en) 1999-10-11 2000-09-29 Device and method for browsing video summary description data

Publications (2)

Publication Number Publication Date
CN1382288A CN1382288A (en) 2002-11-27
CN100485721C true CN100485721C (en) 2009-05-06

Family

ID=19614707

Family Applications (2)

Application Number Title Priority Date Filing Date
CN2008101619850A Expired - Fee Related CN101398843B (en) 1999-10-11 2000-09-29 Device and method for browsing video summary description data
CNB008147469A Expired - Fee Related CN100485721C (en) 1999-10-11 2000-09-29 Method and system for generating video summary description data, and device for browsing the data

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN2008101619850A Expired - Fee Related CN101398843B (en) 1999-10-11 2000-09-29 Device and method for browsing video summary description data

Country Status (7)

Country Link
EP (1) EP1222634A4 (en)
JP (1) JP4733328B2 (en)
KR (1) KR100371813B1 (en)
CN (2) CN101398843B (en)
AU (1) AU7689200A (en)
CA (1) CA2387404A1 (en)
WO (1) WO2001027876A1 (en)

Also Published As

Publication number Publication date
AU7689200A (en) 2001-04-23
JP2003511801A (en) 2003-03-25
CN1382288A (en) 2002-11-27
CA2387404A1 (en) 2001-04-19
KR20010050596A (en) 2001-06-15
JP4733328B2 (en) 2011-07-27
CN101398843B (en) 2011-11-30
CN101398843A (en) 2009-04-01
EP1222634A4 (en) 2006-07-05
KR100371813B1 (en) 2003-02-11
EP1222634A1 (en) 2002-07-17
WO2001027876A1 (en) 2001-04-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090506

Termination date: 20140929

EXPY Termination of patent right or utility model